midscene-runner

Execute, validate, and debug Midscene YAML automation files. Handles dry-run, execution, report analysis, and iterative debugging.

NPX Install

npx skill4agent add lee-117/midi-stagehand-skill midscene-runner

Trigger Conditions

Used when users need to execute, debug, or validate Midscene YAML files.
Common trigger phrases:
  • "Run this YAML"
  • "Execute XXX.yaml"
  • "Test this automation script"
  • "Run this test case"
  • "Validate this YAML file"
  • "Debug this automation flow"
  • "Batch execute these tests"

Workflow

Step 0: Environment Check

Before the first execution, use the one-click health check to confirm the runtime environment is ready:
bash
node scripts/health-check.js
This script checks the Node.js version, dependency installation, CLI scripts, @midscene/web, the tsx runtime, AI model configuration, and the Chrome browser.
Model not configured? Midscene requires a vision-language model to perform AI operations. Create a .env file in the project root directory:
env
MIDSCENE_MODEL_BASE_URL=https://dashscope.aliyuncs.com/compatible-mode/v1
MIDSCENE_MODEL_API_KEY=sk-your-key
MIDSCENE_MODEL_NAME=qwen-vl-max-latest
For detailed configuration instructions, see Midscene Model Configuration Documentation.
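To confirm that the three variables from the .env example are visible to the current shell, a small check like the following can help. It is only a sketch: check_model_env is a hypothetical helper name, not part of the project, and it tests only that each variable is non-empty. It does not read the .env file or contact the model endpoint.

```shell
# Print the name of every model variable that is unset or empty.
# Returns non-zero if at least one is missing.
# check_model_env is a hypothetical helper, not a project script.
check_model_env() (
  missing=0
  for name in MIDSCENE_MODEL_BASE_URL MIDSCENE_MODEL_API_KEY MIDSCENE_MODEL_NAME; do
    # Indirect lookup: read the variable whose name is stored in $name.
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "$name"
      missing=1
    fi
  done
  return "$missing"
)
```

Calling check_model_env before a real run prints only the names that still need a value, so an empty output means all three are set.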
First-time use? Run one-click environment initialization:
bash
npm run setup
The setup script automatically completes the following tasks:
  • Detect the network environment and select the fastest npm mirror (in China, the Taobao mirror is used by default for acceleration)
  • Install all project dependencies
  • Pre-warm @midscene/web and tsx in the npx cache (avoids waiting for downloads during the first execution)
  • Detect system Chrome; if not found, automatically download Chromium
  • Output environment readiness report
Chrome Browser Detection (Web Platform):
The framework will automatically search for system Chrome in the following order:
  • Windows: Program Files\Google\Chrome\Application\chrome.exe
  • macOS: /Applications/Google Chrome.app/Contents/MacOS/Google Chrome
  • Linux: /usr/bin/google-chrome, /usr/bin/chromium
If system Chrome is not found, there are two solutions:
  1. Install Chrome browser (recommended)
  2. Run npx puppeteer browsers install chrome to download a Chrome for Testing build managed by Puppeteer
If Chrome is in a non-standard path, set the environment variable:
bash
# Linux/macOS
export PUPPETEER_EXECUTABLE_PATH="/path/to/chrome"
# Windows PowerShell
$env:PUPPETEER_EXECUTABLE_PATH="C:\path\to\chrome.exe"
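The lookup described above can be sketched for macOS and Linux as follows. This is an illustration rather than the framework's own detection code: find_chrome is a hypothetical name, Windows is not covered, and checking PUPPETEER_EXECUTABLE_PATH before the standard locations is an assumption about precedence.

```shell
# Print the first executable Chrome path found.
# Order: PUPPETEER_EXECUTABLE_PATH (assumed to take precedence), then the
# macOS and Linux locations listed above. Hypothetical helper, not project code.
find_chrome() (
  for candidate in \
    "${PUPPETEER_EXECUTABLE_PATH:-}" \
    "/Applications/Google Chrome.app/Contents/MacOS/Google Chrome" \
    /usr/bin/google-chrome \
    /usr/bin/chromium
  do
    if [ -n "$candidate" ] && [ -x "$candidate" ]; then
      echo "$candidate"
      return 0
    fi
  done
  return 1   # nothing usable found: install Chrome or set the variable
)
```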
Platform-Specific Prerequisites:
| Platform | Dependencies |
| --- | --- |
| Web | Chrome/Chromium browser (auto-detected, see the instructions above) |
| Android | ADB-connected device (verify with adb devices) |
| iOS | WebDriverAgent configured |
| Extended mode | tsx runtime (check with npx tsx --version) |
If dependencies are missing, prompt the user to install:
bash
npm install && npm run setup

Step 1: Locate YAML File

Determine the path of the YAML file to execute. If the user does not specify the full path:
  • Check the most recently generated files in the ./midscene-output/ directory
  • Check template files in the ./templates/ directory
  • Check .yaml / .yml files in the current directory
  • Prompt the user to provide the file path
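The search order above can be sketched as a small shell function. The name find_yaml_candidate is hypothetical, and picking the most recently modified file in every directory (not only in ./midscene-output/) is a simplifying assumption.

```shell
# Print the first YAML candidate, searching the directories in the order
# listed above. Within a directory, the most recently modified file wins
# (an assumption for ./templates and the current directory).
find_yaml_candidate() (
  for dir in ./midscene-output ./templates .; do
    # Unmatched globs make ls fail; errors are silenced and ignored.
    candidate=$(ls -t "$dir"/*.yaml "$dir"/*.yml 2>/dev/null | head -n 1) || true
    if [ -n "$candidate" ]; then
      echo "$candidate"
      return 0
    fi
  done
  return 1   # nothing found: prompt the user for a path
)
```

A non-zero return corresponds to the last bullet: no candidate exists, so the user has to supply the path.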
Multi-file scenario: If the user requests to execute multiple files, execute them one by one and summarize the results.
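Running files one by one and summarizing could look roughly like this. It is a sketch under two assumptions: the CLI exits non-zero when a run fails, and MIDSCENE_RUN is a hypothetical variable introduced here so the command can be swapped out.

```shell
# Command used to invoke the project CLI (hypothetical override variable).
MIDSCENE_RUN=${MIDSCENE_RUN:-"node scripts/midscene-run.js"}

# Run each file in turn, continue after failures, then print a summary.
# Assumes the CLI signals failure through a non-zero exit status.
run_all() (
  total=0
  failed=0
  for yaml_file in "$@"; do
    total=$((total + 1))
    if $MIDSCENE_RUN "$yaml_file"; then
      echo "PASS $yaml_file"
    else
      echo "FAIL $yaml_file"
      failed=$((failed + 1))
    fi
  done
  echo "Total : $total"
  echo "Passed: $((total - failed))"
  echo "Failed: $failed"
  [ "$failed" -eq 0 ]   # overall status: success only if nothing failed
)
```

For example, run_all login.yaml checkout.yaml runs both files even if the first one fails, and the function's exit status reflects whether any run failed.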

Step 2: Pre-Validation

Before execution, call the validator to check the YAML file:
bash
node scripts/midscene-run.js <yaml-file> --dry-run
If validation fails, analyze the error cause and suggest fixes to the user:
| Common Error | Cause | Fix Suggestion |
| --- | --- | --- |
| YAML syntax error | Incorrect indentation or formatting issues | Check indentation, use 2 spaces consistently |
| Missing platform configuration | No web/android/ios/computer configuration | Add web: { url: "..." } |
| Missing tasks | No tasks defined | Add tasks array and flow |
| Undeclared engine | Used superset keywords but not marked | Add engine: extended |
| Undefined variable | Referenced an undeclared ${var} | Declare variables in the variables section |
| Undeclared features | Extended mode did not list used features | Add features: [...] |
| Missing required fields in loop | repeat lacks count, while lacks condition | Add the corresponding required fields |
| Imported file does not exist | Incorrect path referenced in import | Check if the file path is correct |
Auto-Fix Flow: If the error can be fixed automatically (e.g., missing engine declaration), fix it directly and re-validate instead of asking the user to confirm each change.

Step 3: Execution

Choose the execution method based on the project environment:
Method 1 (Recommended): Use Project CLI
If the project has scripts/midscene-run.js (complete midi-stagehand-skill project):
bash
# Single file execution
node scripts/midscene-run.js <yaml-file> [options]

# Batch execution (glob pattern)
node scripts/midscene-run.js "tests/**/*.yaml"
Method 2: Use Midscene CLI Directly
If in an external project (without scripts/midscene-run.js), use @midscene/web directly:
bash
# Installation (first time only)
npm install @midscene/web dotenv

# Execute with UI (recommended for debugging)
npx @midscene/web <yaml-file> --headed

# Batch execution (official CLI option)
npx @midscene/web "tests/**/*.yaml" --concurrent --continue-on-error
Note: The package name is @midscene/web (not @midscene/cli). The official CLI syntax is npx @midscene/web <yaml-file>; the run subcommand (npx @midscene/web run <yaml-file>) is also supported, and both forms are acceptable.
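The choice between the two methods comes down to whether scripts/midscene-run.js exists in the working directory. A minimal sketch of that check, with pick_runner as a hypothetical helper name:

```shell
# Print the command prefix to use for execution.
# Method 1 when the project CLI script exists, otherwise Method 2.
# pick_runner is a hypothetical helper, not part of the project.
pick_runner() {
  if [ -f scripts/midscene-run.js ]; then
    echo "node scripts/midscene-run.js"
  else
    echo "npx @midscene/web"
  fi
}
```

Note that the options listed for Method 1 belong to the project CLI, so they should not be assumed to work when the function falls back to Method 2.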
Available Options (Method 1):
  • --dry-run: Validate and transform only, without executing. It does not check the model configuration; AI operations still require MIDSCENE_MODEL_API_KEY to be configured.
  • --timeout <ms>: Execution timeout (default 300000 = 5 minutes). Increase this value for long-running automation scenarios.
  • --output-ts <path>: Save the transformed TypeScript file (Extended mode only). When troubleshooting transpilation errors, combine it with --dry-run.
  • --report-dir <path>: Report output directory (default ./midscene-report).
  • --template puppeteer|playwright: Select the TS template (default puppeteer; playwright suits scenarios requiring multi-browser compatibility).
  • --verbose / -v: Show detailed output (validation details, detection information, environment information).
  • --help / -h: Show help information.
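Because --timeout takes milliseconds, a one-line conversion helper avoids arithmetic slips when raising the limit. The helper name minutes_to_ms is hypothetical and it accepts whole minutes only.

```shell
# Convert whole minutes to milliseconds for the --timeout option.
# minutes_to_ms is a hypothetical helper, not part of the project.
minutes_to_ms() {
  echo $(( $1 * 60 * 1000 ))
}
```

With this in place, the 5-minute default corresponds to minutes_to_ms 5 (300000), and a 10-minute limit can be passed as --timeout "$(minutes_to_ms 10)".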
Extended Mode Execution Flow:
  1. YAML → Transpiler → TypeScript
  2. TypeScript → tsx runtime → Puppeteer or Playwright (depending on --template) + Midscene SDK
  3. Generate execution report
You can use --output-ts to save the intermediate TypeScript file for debugging:
bash
node scripts/midscene-run.js test.yaml --output-ts ./debug-output.ts

Step 4: Analyze Results

After execution completes:

Success

  • Report execution summary (number of passed/failed tasks)
  • Inform the user of the report file location
  • If there are aiQuery results, display the extracted data
  • If output is exported, confirm the file generation location

Failure

Analyze errors and fix them according to the following decision tree:
What does the error message contain?
├─ "API key" / "401" / "Unauthorized"
│   → Model not configured. Set the MIDSCENE_MODEL_API_KEY environment variable or .env file
├─ "Timeout" / "exceeded"
│   ├─ Can the page open normally in the browser?
│   │   ├─ Yes → Page loads slowly, increase the timeout value (e.g., timeout: 30000)
│   │   └─ No → Check if the URL is correct and network is reachable
│   └─ Occurs in aiWaitFor? → Condition description may be inaccurate, check assertion text
├─ "Element not found" / "not found"
│   ├─ Failed on first attempt? → AI description is not precise enough, use more specific text descriptions
│   ├─ Succeeded before? → Page structure may have changed, compare with report screenshots
│   └─ Still failing? → Try deepThink: true or use xpath positioning instead
├─ "Assertion failed"
│   → View report screenshots, compare actual page state vs expected description, adjust aiAssert text
├─ "Navigation failed" / "net::ERR_"
│   → Check URL protocol (https://) and accessibility
├─ "Transpiler error"
│   → Use --dry-run --output-ts ./debug.ts to view generated code and troubleshoot syntax issues
├─ "Permission denied"
│   → Page requires login or special permissions, add login steps or cookie configuration
└─ "javascript" step error
    → Check JS code syntax, note API differences between browser environment and Node environment
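The top level of the decision tree above can be approximated with a case statement that matches the same substrings in the same order. This is a sketch: classify_error and the category labels it prints are illustrative names, not Midscene output, and the matching is case-sensitive.

```shell
# Map an error message to the first matching branch of the decision tree.
# Category labels are illustrative, not produced by Midscene itself.
classify_error() {
  case $1 in
    *"API key"*|*401*|*Unauthorized*)        echo "model-config" ;;
    *Timeout*|*exceeded*)                    echo "timeout" ;;
    *"Element not found"*|*"not found"*)     echo "element-not-found" ;;
    *"Assertion failed"*)                    echo "assertion" ;;
    *"Navigation failed"*|*net::ERR_*)       echo "navigation" ;;
    *"Transpiler error"*)                    echo "transpiler" ;;
    *"Permission denied"*)                   echo "permission" ;;
    *javascript*)                            echo "javascript-step" ;;
    *)                                       echo "unknown" ;;
  esac
}
```

The nested questions in the tree (for example whether the page opens in a browser) still need manual inspection; the function only selects the starting branch.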
Iterative Fix Flow:
  1. Analyze error cause
  2. Modify the YAML file
  3. Re-run --dry-run for validation
  4. Re-execute after validation passes
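Steps 3 and 4 of this flow can be combined into one guarded call, so a file is never executed unless validation passes. The sketch below assumes the CLI exits non-zero when --dry-run finds a problem; validate_then_run and the MIDSCENE_RUN override variable are hypothetical names.

```shell
# Command used to invoke the project CLI (hypothetical override variable).
MIDSCENE_RUN=${MIDSCENE_RUN:-"node scripts/midscene-run.js"}

# Validate first; execute only when validation passes.
# Assumes --dry-run failures are reported through a non-zero exit status.
validate_then_run() (
  yaml_file=$1
  if ! $MIDSCENE_RUN "$yaml_file" --dry-run; then
    echo "validation failed: $yaml_file" >&2
    return 2   # distinct status so callers can tell validation from run failures
  fi
  $MIDSCENE_RUN "$yaml_file"
)
```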

Step 5: Report Interpretation

Interpret the report generated by Midscene:
  • Reports are saved in the ./midscene-report/ directory by default
  • HTML Report: Open it in a browser. Each step shows its execution status and a screenshot (green ✓ = passed, red ✗ = failed); click a step to expand its details
  • JSON Report: Structured data containing the status, duration, and screenshot path of each step, suitable for automatic parsing in CI/CD
  • Screenshot paths are relative to the report directory
  • Custom screenshots generated by the recordToReport step are also included in the report
Report summary format:
Total : N
Passed: N
Failed: N
Status: passed|failed
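In CI, the summary can be turned into a pass/fail gate by reading the Failed line. The sketch below assumes the summary text in the layout above is available on standard input (for example, captured from the CLI's output, which is an assumption); summary_ok is a hypothetical helper name.

```shell
# Read a summary in the Total/Passed/Failed/Status layout from stdin.
# Succeeds only when a "Failed: N" line is present and N is 0.
# summary_ok is a hypothetical helper, not part of the project.
summary_ok() (
  failed=$(sed -n 's/^Failed:[[:space:]]*\([0-9][0-9]*\)[[:space:]]*$/\1/p' | head -n 1)
  [ -n "$failed" ] && [ "$failed" -eq 0 ]
)
```

Treating a missing Failed line as a failure is a deliberate choice here, so that truncated or unexpected output does not pass silently.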

Quick Execution Command Reference

Using Project CLI (Complete Project)

bash
# Basic execution
node scripts/midscene-run.js test.yaml

# Validate only, no execution
node scripts/midscene-run.js test.yaml --dry-run

# Save generated TS (Extended mode only)
node scripts/midscene-run.js test.yaml --output-ts ./output.ts

# Use Playwright template
node scripts/midscene-run.js test.yaml --template playwright

# Specify report directory
node scripts/midscene-run.js test.yaml --report-dir ./reports

# Set timeout to 10 minutes
node scripts/midscene-run.js test.yaml --timeout 600000

# Validate + save TS (troubleshoot transpilation issues)
node scripts/midscene-run.js test.yaml --dry-run --output-ts ./debug.ts

# View help
node scripts/midscene-run.js --help

Using Midscene CLI Directly (External Project)

bash
# Execute with UI (recommended for debugging)
npx @midscene/web test.yaml --headed

# Headless execution (recommended for CI/CD)
npx @midscene/web test.yaml

YAML Configuration Quick Reference

Agent Configuration (Optional)

yaml
agent:
  testId: "test-001"
  groupName: "Regression Test"
  groupDescription: "Daily regression test suite"
  cache: true

continueOnError (Optional)

Continue executing subsequent tasks after a task fails:
yaml
tasks:
  - name: Task A
    continueOnError: true
    flow: [...]
  - name: Task B (executes even if A fails)
    flow: [...]

Platform Configuration Options

yaml
# Full Web platform configuration
web:
  url: "https://example.com"
  headless: false       # true = headless mode (suitable for CI/CD); false = UI mode (suitable for debugging)
  viewportWidth: 1920   # Default 1280; use 375 for mobile simulation
  viewportHeight: 1080  # Default 720; use 667 for mobile simulation
  userAgent: "Custom User Agent"
  waitForNetworkIdle:
    timeout: 2000
    continueOnNetworkIdleError: true

# Android platform (ensure device is connected via adb first)
android:
  deviceId: "emulator-5554"   # Device ID from adb devices output
# Use launch: "com.example.app" in flow to start the app

# iOS platform (WebDriverAgent must be configured first)
ios:
  wdaPort: 8100              # WebDriverAgent port
  wdaHost: "localhost"       # WebDriverAgent host
# Use launch: "com.example.app" in flow to start the app

Debugging Tips

  1. View Report Screenshots: Check the HTML report after execution; each step has a screenshot
  2. Execute in Segments: Verify the first few steps first, then add more steps gradually
  3. Add Waits: Add aiWaitFor after key steps to ensure the page is ready
  4. Insert Assertions: Insert aiAssert in intermediate steps to verify the current state
  5. View TS Code: Use --output-ts in Extended mode to view the generated code for troubleshooting
  6. Use deepThink: Enable deepThink: true when element positioning is inaccurate
  7. Fall Back to xpath: Use xpath for precise selection when natural language positioning fails
  8. Use javascript: Execute JS code directly via a javascript step to debug page state
  9. Use recordToReport: Insert recordToReport at key points to capture screenshots for the record

Notes

  • Chrome/Chromium browser is required to execute Web platform tests
  • Dependencies may need to be installed on first run: npm install
  • Android tests require an ADB-connected device; iOS tests require WebDriverAgent
  • YAML in Extended mode is converted to TypeScript before execution, which requires the tsx runtime
  • Screenshot paths in reports are relative; resolve them against the report directory
  • If you need to generate a new YAML file, you can use the Midscene YAML Generator skill
  • Environment variables are passed via the system environment or a .env file and referenced in YAML using ${ENV:NAME} or ${ENV.NAME} (both syntaxes are equivalent)
  • parallel branches run in independent browser contexts and do not affect each other during execution; aiQuery results from each branch can be accessed together after all branches complete (via merge_results: true)
  • --dry-run only checks YAML syntax and structure; it does not check model configuration or network reachability
  • If npx skills check does not detect available updates, the lock file format may be outdated (v1); reinstall to upgrade to the v3 format: npx skills add https://github.com/lee-117/midi-stagehand-skill -a claude-code

Collaboration Agreement

When collaborating with Generator Skill:
  1. Priority Check: Check the most recently generated files in the ./midscene-output/ directory first
  2. On Execution Failure: Provide structured error information:
    • Error type, error location (step/line number), suggested fix
  3. If the error can be fixed by modifying the YAML file, modify it directly and re-execute (no need to call back to the Generator)