midscene-runner
Execute, validate, and debug Midscene YAML automation files. Handles dry-run, execution, report analysis, and iterative debugging.
NPX Install
npx skill4agent add lee-117/midi-stagehand-skill midscene-runner
Trigger Conditions
Used when users need to execute, debug, or validate Midscene YAML files.
Common trigger phrases (translated from the Chinese original):
- "Run this YAML"
- "Execute XXX.yaml"
- "Test this automation script"
- "Run this test case"
- "Validate if this YAML is correct"
- "Debug this automation flow"
- "Batch execute these tests"
English trigger phrases:
- "Run this YAML"
- "Execute XXX.yaml"
- "Test this automation script"
- "Validate this YAML file"
- "Debug this automation flow"
- "Run the test cases"
Workflow
Step 0: Environment Check
Before the first execution, use the one-click health check to confirm the runtime environment is ready:
```bash
node scripts/health-check.js
```
This script checks: Node.js version, dependency installation, CLI scripts, `@midscene/web`, the `tsx` runtime, AI model configuration, and the Chrome browser.
Model not configured? Midscene requires a vision-language model to perform AI operations. Create a `.env` file in the project root directory:
```env
MIDSCENE_MODEL_BASE_URL=https://dashscope.aliyuncs.com/compatible-mode/v1
MIDSCENE_MODEL_API_KEY=sk-your-key
MIDSCENE_MODEL_NAME=qwen-vl-max-latest
```
For detailed configuration instructions, see the Midscene Model Configuration Documentation.
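As a rough illustration of the model-configuration check, the sketch below reports which of the three variables from the `.env` example are unset. The helper name `missingModelVars` is hypothetical and is not part of the project. It inspects an environment object only, so it assumes the `.env` file has already been loaded into `process.env` by whatever loader the project uses.

```javascript
// Variable names taken from the .env example above.
const REQUIRED_MODEL_VARS = [
  "MIDSCENE_MODEL_BASE_URL",
  "MIDSCENE_MODEL_API_KEY",
  "MIDSCENE_MODEL_NAME",
];

// Hypothetical helper (not part of the project): list the required variables
// that are unset or blank in the given environment object.
function missingModelVars(env = process.env) {
  return REQUIRED_MODEL_VARS.filter((name) => !(env[name] || "").trim());
}

const missing = missingModelVars();
if (missing.length > 0) {
  console.warn(`Model not configured, missing: ${missing.join(", ")}`);
}
```

Passing a plain object instead of `process.env` makes the check easy to exercise without touching the real environment.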
First-time use? Run one-click environment initialization:
```bash
npm run setup
```
The `setup` script will:
- Intelligently detect the network environment and automatically select the fastest npm mirror (the Taobao mirror is used by default in China for acceleration)
- Install all project dependencies
- Preheat `@midscene/web` and `tsx` into the npx cache (avoids waiting for downloads during the first execution)
- Detect system Chrome; if not found, automatically download Chromium
- Output environment readiness report
Chrome Browser Detection (Web Platform):
The framework will automatically search for system Chrome in the following order:
- Windows: `Program Files\Google\Chrome\Application\chrome.exe`
- macOS: `/Applications/Google Chrome.app/Contents/MacOS/Google Chrome`
- Linux: `/usr/bin/google-chrome`, `/usr/bin/chromium`
If system Chrome is not found, there are two solutions:
- Install Chrome browser (recommended)
- Run `npx puppeteer browsers install chrome` to install Chromium
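The lookup order above could be sketched roughly as follows. This is not the framework's actual detection code. The helper `findSystemChrome` is hypothetical, and the `C:\` prefix on the Windows path is an assumption, since the list gives that path without a drive.

```javascript
const fs = require("node:fs");

// Candidate paths per platform, in the order listed above.
// Assumption: the Windows entry is rooted at C:\ here; the source lists it
// without a drive prefix.
const CHROME_CANDIDATES = {
  win32: ["C:\\Program Files\\Google\\Chrome\\Application\\chrome.exe"],
  darwin: ["/Applications/Google Chrome.app/Contents/MacOS/Google Chrome"],
  linux: ["/usr/bin/google-chrome", "/usr/bin/chromium"],
};

// Hypothetical helper: return the first candidate that exists, or null.
// `exists` is injectable so the order can be checked without a real browser.
function findSystemChrome(platform = process.platform, exists = fs.existsSync) {
  const candidates = CHROME_CANDIDATES[platform] || [];
  return candidates.find((candidate) => exists(candidate)) || null;
}

console.log(
  findSystemChrome() ||
    "Chrome not found; install Chrome or run: npx puppeteer browsers install chrome"
);
```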
If Chrome is in a non-standard path, set the environment variable:
```bash
# Linux/macOS
export PUPPETEER_EXECUTABLE_PATH="/path/to/chrome"
# Windows PowerShell
$env:PUPPETEER_EXECUTABLE_PATH="C:\path\to\chrome.exe"
```
Platform-Specific Prerequisites:
| Platform | Dependencies |
|---|---|
| Web | Chrome/Chromium browser (auto-detected, see above instructions) |
| Android | ADB-connected device (verify with `adb devices`) |
| iOS | WebDriverAgent configured |
| Extended mode | `tsx` runtime |
If dependencies are missing, prompt the user to install:
```bash
npm install && npm run setup
```
Step 1: Locate YAML File
Determine the path of the YAML file to execute. If the user does not specify the full path:
- Check the most recently generated files in the `./midscene-output/` directory
- Check template files in the `./templates/` directory
- Check `.yaml`/`.yml` files in the current directory
- Prompt the user to provide the file path
Multi-file scenario: If the user requests to execute multiple files, execute them one by one and summarize the results.
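The lookup order for an unspecified path might be implemented along these lines. This is a minimal sketch with hypothetical helper names (`yamlFilesNewestFirst`, `locateYaml`). Sorting every directory by modification time is an assumption; the steps above only require "most recent" for `./midscene-output/`.

```javascript
const fs = require("node:fs");
const path = require("node:path");

// List .yaml/.yml files in a directory, newest first; a missing directory yields [].
function yamlFilesNewestFirst(dir) {
  if (!fs.existsSync(dir)) return [];
  return fs
    .readdirSync(dir)
    .filter((name) => /\.ya?ml$/i.test(name))
    .map((name) => path.join(dir, name))
    .filter((file) => fs.statSync(file).isFile())
    .sort((a, b) => fs.statSync(b).mtimeMs - fs.statSync(a).mtimeMs);
}

// Walk the directories in the documented order and return the first hit.
// Returns null when nothing is found, meaning the user should be asked for a path.
function locateYaml(root = process.cwd()) {
  const searchDirs = ["midscene-output", "templates", "."];
  for (const dir of searchDirs) {
    const [first] = yamlFilesNewestFirst(path.join(root, dir));
    if (first) return first;
  }
  return null;
}

console.log(locateYaml() || "No YAML file found; please provide a path.");
```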
Step 2: Pre-Validation
Before execution, call the validator to check the YAML file:
```bash
node scripts/midscene-run.js <yaml-file> --dry-run
```
If validation fails, analyze the error cause and suggest fixes to the user:
| Common Error | Cause | Fix Suggestion |
|---|---|---|
| YAML syntax error | Incorrect indentation or formatting issues | Check indentation, use 2 spaces consistently |
| Missing platform configuration | No web/android/ios/computer configuration | Add a platform configuration block (`web`, `android`, `ios`, or `computer`) |
| Missing tasks | No tasks defined | Add tasks array and flow |
| Undeclared engine | Used superset keywords but not marked | Add the engine declaration |
| Undefined variable | Referenced an undeclared variable | Declare variables in the variables section |
| Undeclared features | Extended mode did not list used features | Add the features declaration listing the features in use |
| Missing required fields in loop | repeat lacks count, while lacks condition | Add the corresponding required fields |
| Imported file does not exist | Incorrect path referenced in import | Check if the file path is correct |
Auto-Fix Flow: If the error can be fixed automatically (e.g., missing engine declaration), fix it directly and re-validate to avoid round-trip confirmation.
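To make two rows of the table concrete, here is a small structural check on an already-parsed YAML document. It is a hypothetical illustration (`preCheck` is not the project's validator) and covers only the "Missing platform configuration" and "Missing tasks" cases. Parsing the YAML text into an object is assumed to happen elsewhere.

```javascript
// Platform keys as named in the "Missing platform configuration" row.
const PLATFORM_KEYS = ["web", "android", "ios", "computer"];

// Hypothetical check: return the table's error names that apply to `doc`.
function preCheck(doc) {
  const errors = [];
  const root = doc && typeof doc === "object" ? doc : {};
  if (!PLATFORM_KEYS.some((key) => key in root)) {
    errors.push("Missing platform configuration");
  }
  if (!Array.isArray(root.tasks) || root.tasks.length === 0) {
    errors.push("Missing tasks");
  }
  return errors;
}

// A document with a platform block but an empty tasks array.
console.log(preCheck({ web: { url: "https://example.com" }, tasks: [] }));
// prints: [ 'Missing tasks' ]
```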
Step 3: Execution
Choose the execution method based on the project environment:
Method 1 (Recommended): Use Project CLI
If the project has `scripts/midscene-run.js` (complete midi-stagehand-skill project):
```bash
# Single file execution
node scripts/midscene-run.js <yaml-file> [options]
# Batch execution (glob pattern)
node scripts/midscene-run.js "tests/**/*.yaml"
```
Method 2: Use Midscene CLI Directly
If in an external project (without `scripts/midscene-run.js`), use `@midscene/web` directly:
```bash
# Installation (first time only)
npm install @midscene/web dotenv
# Execute with UI (recommended for debugging)
npx @midscene/web <yaml-file> --headed
# Batch execution (official CLI option)
npx @midscene/web "tests/**/*.yaml" --concurrent --continue-on-error
```
Note: The package name is `@midscene/web` (not `@midscene/cli`). The official CLI syntax is `npx @midscene/web <yaml-file>`, which also supports the `run` subcommand (`npx @midscene/web run <yaml-file>`); both forms are acceptable.
Available Options (Method 1):
- `--dry-run`: Only validate and transform, do not execute (Note: does not detect model configuration; AI operations require `MIDSCENE_MODEL_API_KEY` to be configured)
- `--timeout <ms>`: Execution timeout (default 300000 = 5 minutes). Increase this value for long-running automation scenarios
- `--output-ts <path>`: Save the transformed TypeScript file (Extended mode only). When troubleshooting transpilation errors, use it together with `--dry-run`
- `--report-dir <path>`: Report output directory (default `./midscene-report`)
- `--template puppeteer|playwright`: Select TS template (default puppeteer; playwright is suitable for scenarios requiring multi-browser compatibility)
- `--verbose` / `-v`: Show detailed output (validation details, detection information, environment information)
- `--help` / `-h`: Show help information
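When the runner is driven from another script, the options above can be assembled programmatically. The sketch below uses a hypothetical helper, `buildRunArgs`, with hypothetical option names such as `dryRun` and `timeoutMs`. It emits only flags that appear in the list above.

```javascript
// Hypothetical helper: turn an options object into an argument list for
// scripts/midscene-run.js. Only flags documented in the list above are used.
function buildRunArgs(yamlFile, opts = {}) {
  const args = ["scripts/midscene-run.js", yamlFile];
  if (opts.dryRun) args.push("--dry-run");
  if (opts.timeoutMs !== undefined) args.push("--timeout", String(opts.timeoutMs));
  if (opts.outputTs) args.push("--output-ts", opts.outputTs);
  if (opts.reportDir) args.push("--report-dir", opts.reportDir);
  if (opts.template) args.push("--template", opts.template);
  if (opts.verbose) args.push("--verbose");
  return args;
}

// Example: validate only and keep the generated TypeScript for inspection.
const exampleArgs = buildRunArgs("test.yaml", { dryRun: true, outputTs: "./debug.ts" });
console.log(["node", ...exampleArgs].join(" "));
// prints: node scripts/midscene-run.js test.yaml --dry-run --output-ts ./debug.ts
```

Returning an array rather than a single string means the result can be handed to `child_process.spawnSync("node", args)` without shell quoting concerns.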
Extended Mode Execution Flow:
- YAML → Transpiler → TypeScript
- TypeScript → tsx runtime → Puppeteer or Playwright (depending on `--template`) + Midscene SDK
- Generate execution report
You can use `--output-ts` to save the intermediate TypeScript file for debugging:
```bash
node scripts/midscene-run.js test.yaml --output-ts ./debug-output.ts
```
Step 4: Analyze Results
After execution completes:
Success
- Report execution summary (number of passed/failed tasks)
- Inform the user of the report file location
- If there are `aiQuery` results, display the extracted data
- If `output` is exported, confirm the file generation location
Failure
Analyze errors and fix them according to the following decision tree:
What does the error message contain?
├─ "API key" / "401" / "Unauthorized"
│ → Model not configured. Set the MIDSCENE_MODEL_API_KEY environment variable or .env file
│
├─ "Timeout" / "exceeded"
│ ├─ Can the page open normally in the browser?
│ │ ├─ Yes → Page loads slowly, increase the timeout value (e.g., timeout: 30000)
│ │ └─ No → Check if the URL is correct and network is reachable
│ └─ Occurs in aiWaitFor? → Condition description may be inaccurate, check assertion text
│
├─ "Element not found" / "not found"
│ ├─ Failed on first attempt? → AI description is not precise enough, use more specific text descriptions
│ ├─ Succeeded before? → Page structure may have changed, compare with report screenshots
│ └─ Still failing? → Try deepThink: true or use xpath positioning instead
│
├─ "Assertion failed"
│ → View report screenshots, compare actual page state vs expected description, adjust aiAssert text
│
├─ "Navigation failed" / "net::ERR_"
│ → Check URL protocol (https://) and accessibility
│
├─ "Transpiler error"
│ → Use --dry-run --output-ts ./debug.ts to view generated code and troubleshoot syntax issues
│
├─ "Permission denied"
│ → Page requires login or special permissions, add login steps or cookie configuration
│
└─ "javascript" step error
    → Check JS code syntax, note API differences between browser environment and Node environment
Iterative Fix Flow:
- Analyze error cause
- Modify the YAML file
- Re-run `--dry-run` for validation
- Re-execute after validation passes
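The first level of the decision tree can be approximated with ordered pattern matching, as sketched below. The category names are hypothetical labels chosen for this example. The rule order is an assumption: broad patterns such as "not found" are placed after more specific ones. The `javascript` step branch is not covered because the tree gives no message text for it.

```javascript
// Ordered rules mirroring the top-level branches of the decision tree above.
// The first matching pattern wins. Category names are hypothetical.
const ERROR_RULES = [
  { category: "model-not-configured", pattern: /api key|401|unauthorized/i },
  { category: "timeout", pattern: /timeout|exceeded/i },
  { category: "navigation", pattern: /navigation failed|net::ERR_/i },
  { category: "element-not-found", pattern: /not found/i },
  { category: "assertion", pattern: /assertion failed/i },
  { category: "transpiler", pattern: /transpiler error/i },
  { category: "permission", pattern: /permission denied/i },
];

// Return the category of the first rule that matches, or "unknown".
function classifyError(message) {
  const rule = ERROR_RULES.find((r) => r.pattern.test(message));
  return rule ? rule.category : "unknown";
}

console.log(classifyError("Element not found: login button"));
// prints: element-not-found
```

The follow-up questions inside each branch (for example, whether the page opens normally in a browser) still need human or agent judgment; the classifier only selects the branch to start from.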
Step 5: Report Interpretation
Interpret the report generated by Midscene:
- Reports are saved in the `./midscene-report/` directory by default
- HTML Report: Open in a browser; each step shows execution status and screenshots (green ✓ = passed, red ✗ = failed), click to expand details
- JSON Report: Structured data containing the status, duration, and screenshot path of each step, suitable for automatic parsing in CI/CD
- Screenshot paths are relative to the report directory
- Custom screenshots generated by the `recordToReport` step are also included in the report
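For CI/CD parsing, the per-step statuses in the JSON report could be reduced to totals along the following lines. The report schema is not specified here, so the shape of `steps` (an array of objects with a `status` of `"passed"` or `"failed"`) is an assumption, and `summarize` is a hypothetical helper.

```javascript
// Assumption: `steps` is an array of objects with a `status` field of
// "passed" or "failed"; the real JSON report schema may differ.
function summarize(steps) {
  const passed = steps.filter((step) => step.status === "passed").length;
  const failed = steps.filter((step) => step.status === "failed").length;
  return {
    total: steps.length,
    passed,
    failed,
    status: failed === 0 ? "passed" : "failed",
  };
}

console.log(summarize([{ status: "passed" }, { status: "failed" }, { status: "passed" }]));
// prints: { total: 3, passed: 2, failed: 1, status: 'failed' }
```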
Report summary format:
Total : N
Passed: N
Failed: N
Status: passed|failed
Quick Execution Command Reference
Using Project CLI (Complete Project)
```bash
# Basic execution
node scripts/midscene-run.js test.yaml
# Validate only, no execution
node scripts/midscene-run.js test.yaml --dry-run
# Save generated TS (Extended mode only)
node scripts/midscene-run.js test.yaml --output-ts ./output.ts
# Use Playwright template
node scripts/midscene-run.js test.yaml --template playwright
# Specify report directory
node scripts/midscene-run.js test.yaml --report-dir ./reports
# Set timeout to 10 minutes
node scripts/midscene-run.js test.yaml --timeout 600000
# Validate + save TS (troubleshoot transpilation issues)
node scripts/midscene-run.js test.yaml --dry-run --output-ts ./debug.ts
# View help
node scripts/midscene-run.js --help
```
Using Midscene CLI Directly (External Project)
```bash
# Execute with UI (recommended for debugging)
npx @midscene/web test.yaml --headed
# Headless execution (recommended for CI/CD)
npx @midscene/web test.yaml
```
YAML Configuration Quick Reference
Agent Configuration (Optional)
```yaml
agent:
  testId: "test-001"
  groupName: "Regression Test"
  groupDescription: "Daily regression test suite"
  cache: true
```
continueOnError (Optional)
Continue executing subsequent tasks after a task fails:
```yaml
tasks:
  - name: Task A
    continueOnError: true
    flow: [...]
  - name: Task B (executes even if A fails)
    flow: [...]
```
Platform Configuration Options
```yaml
# Full Web platform configuration
web:
  url: "https://example.com"
  headless: false # true = headless mode (suitable for CI/CD); false = UI mode (suitable for debugging)
  viewportWidth: 1920 # Default 1280; use 375 for mobile simulation
  viewportHeight: 1080 # Default 720; use 667 for mobile simulation
  userAgent: "Custom User Agent"
  waitForNetworkIdle:
    timeout: 2000
    continueOnNetworkIdleError: true
# Android platform (ensure device is connected via adb first)
android:
  deviceId: "emulator-5554" # Device ID from adb devices output
  # Use launch: "com.example.app" in flow to start the app
# iOS platform (WebDriverAgent must be configured first)
ios:
  wdaPort: 8100 # WebDriverAgent port
  wdaHost: "localhost" # WebDriverAgent host
  # Use launch: "com.example.app" in flow to start the app
```
Debugging Tips
- View Report Screenshots: Check the HTML report after execution, each step has a screenshot
- Execute in Segments: Verify the first few steps first, then add more steps gradually
- Add Waits: Add `aiWaitFor` after key steps to ensure the page is ready
- Insert Assertions: Insert `aiAssert` in intermediate steps to verify current state
- View TS Code: Use `--output-ts` in Extended mode to view generated code for troubleshooting
- Use deepThink: Enable `deepThink: true` when element positioning is inaccurate
- Downgrade to xpath: Use xpath for precise selection when natural language positioning fails
- Use javascript: Execute JS code directly via a `javascript` step to debug page state
- Use recordToReport: Insert `recordToReport` at key nodes to capture screenshots for records
Notes
- Chrome/Chromium browser is required to execute Web platform tests
- Dependencies may need to be installed on first run: `npm install`
- Android tests require an ADB-connected device, iOS tests require WebDriverAgent
- YAML in Extended mode is first converted to TypeScript before execution, requiring the tsx runtime
- Screenshot paths in reports are relative, look within the report directory
- If you need to generate a new YAML file, you can use the Midscene YAML Generator skill
- Environment variables are passed via the system environment or a `.env` file, referenced in YAML using `${ENV:NAME}` or `${ENV.NAME}` (both syntaxes are equivalent)
- `parallel` branches run in independent browser contexts and do not affect each other during execution; `aiQuery` results from each branch can be accessed together after all branches complete (via `merge_results: true`)
- `--dry-run` only checks YAML syntax and structure; it does not detect model configuration or network reachability
- If `npx skills check` does not detect existing updates, the lock file format may be outdated (v1); reinstall to upgrade to v3 format: `npx skills add https://github.com/lee-117/midi-stagehand-skill -a claude-code`
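The equivalence of the two environment-variable reference syntaxes mentioned in the notes can be illustrated with a small substitution function. This is a sketch, not the project's resolver. `resolveEnvRefs` is a hypothetical name, and leaving unset variables untouched is an assumption about the desired behaviour.

```javascript
// Replace ${ENV:NAME} and ${ENV.NAME} references with values from `env`.
// Assumption: references to unset variables are left as-is.
function resolveEnvRefs(text, env = process.env) {
  return text.replace(/\$\{ENV[:.]([A-Za-z_][A-Za-z0-9_]*)\}/g, (match, name) =>
    env[name] !== undefined ? env[name] : match
  );
}

const sampleEnv = { BASE_URL: "https://example.com", USER_NAME: "alice" };
console.log(resolveEnvRefs("url: ${ENV:BASE_URL}/login?u=${ENV.USER_NAME}", sampleEnv));
// prints: url: https://example.com/login?u=alice
```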
Collaboration Agreement
When collaborating with Generator Skill:
- Priority Check: Check the most recently generated files in the `./midscene-output/` directory first
- On Execution Failure: Provide structured error information:
  - Error type, error location (step/line number), suggested fix
  - If the error can be fixed by modifying the YAML file, modify it directly and re-execute (no need to call back to the Generator)