midscene-runner

Trigger Conditions

Used when users need to execute, debug, or validate Midscene YAML files.

Common trigger phrases:

"Run this YAML"
"Execute XXX.yaml"
"Test this automation script"
"Run this test case"
"Validate if this YAML is correct"
"Debug this automation flow"
"Batch execute these tests"

Workflow

Step 0: Environment Check

Before the first execution, use the one-click health check to confirm the runtime environment is ready:

```bash
node scripts/health-check.js
```

This script checks the Node.js version, dependency installation, CLI scripts, `@midscene/web`, the `tsx` runtime, AI model configuration, and the Chrome browser.

Model not configured? Midscene requires a vision-language model to perform AI operations. Create a `.env` file in the project root directory:

```env
MIDSCENE_MODEL_BASE_URL=https://dashscope.aliyuncs.com/compatible-mode/v1
MIDSCENE_MODEL_API_KEY=sk-your-key
MIDSCENE_MODEL_NAME=qwen-vl-max-latest
```

For detailed configuration instructions, see Midscene Model Configuration Documentation.
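
For a quick manual check that the model settings are visible to the current shell, something like the following can help. This is a minimal sketch, not part of Midscene: `check_model_env` is a hypothetical helper name, it assumes all three variables from the `.env` example are needed, and it only sees values that are already set in the shell (a `.env` file is not loaded by the shell on its own).

```bash
# Hypothetical helper (not part of Midscene): report which of the three
# model variables from the .env example are unset or empty in this shell.
check_model_env() {
  missing=""
  for name in MIDSCENE_MODEL_BASE_URL MIDSCENE_MODEL_API_KEY MIDSCENE_MODEL_NAME; do
    # Indirect lookup: copy the value of the variable named by $name.
    eval "value=\${$name:-}"
    [ -n "$value" ] || missing="$missing $name"
  done
  if [ -n "$missing" ]; then
    echo "Missing model settings:$missing" >&2
    return 1
  fi
  echo "Model settings present"
}
```

Calling `check_model_env` returns a non-zero status and names the missing variables on stderr, so it can gate a run in a script.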

First-time use? Run one-click environment initialization:

```bash
npm run setup
```

`setup` automatically completes the following tasks:

Detect the network environment and select the fastest npm mirror (the Taobao mirror is used by default in China for acceleration)
Install all project dependencies
Pre-warm the npx cache with `@midscene/web` and `tsx` (avoids waiting for downloads during the first execution)
Detect system Chrome; if not found, automatically download Chromium
Output an environment readiness report

Chrome Browser Detection (Web Platform):

The framework will automatically search for system Chrome in the following order:

Windows: `Program Files\Google\Chrome\Application\chrome.exe`

macOS: `/Applications/Google Chrome.app/Contents/MacOS/Google Chrome`

Linux: `/usr/bin/google-chrome`, `/usr/bin/chromium`

If system Chrome is not found, there are two solutions:

Install Chrome browser (recommended)
Run `npx puppeteer browsers install chrome` to install Chromium

If Chrome is in a non-standard path, set the environment variable:

```bash
# Linux/macOS
export PUPPETEER_EXECUTABLE_PATH="/path/to/chrome"
# Windows PowerShell
$env:PUPPETEER_EXECUTABLE_PATH="C:\path\to\chrome.exe"
```
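
The lookup described above can be approximated by hand on macOS and Linux. The sketch below is an illustration only: `find_chrome` is a hypothetical helper, and checking `PUPPETEER_EXECUTABLE_PATH` before the standard paths is an assumption about precedence rather than documented framework behavior. The Windows path is omitted because it is not a POSIX path.

```bash
# Hypothetical helper: print the first usable Chrome/Chromium binary.
# Assumption: an explicit PUPPETEER_EXECUTABLE_PATH wins over the
# standard macOS/Linux locations listed in this document.
find_chrome() {
  for candidate in \
    "${PUPPETEER_EXECUTABLE_PATH:-}" \
    "/Applications/Google Chrome.app/Contents/MacOS/Google Chrome" \
    /usr/bin/google-chrome \
    /usr/bin/chromium; do
    # Skip the empty override and anything that is not executable.
    if [ -n "$candidate" ] && [ -x "$candidate" ]; then
      printf '%s\n' "$candidate"
      return 0
    fi
  done
  return 1
}
```

If `find_chrome` returns a non-zero status, fall back to one of the two solutions listed above.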

Platform-Specific Prerequisites:

| Platform | Dependencies |
| --- | --- |
| Web | Chrome/Chromium browser (auto-detected, see the instructions above) |
| Android | ADB-connected device (verify with `adb devices`) |
| iOS | WebDriverAgent configured |
| Extended mode | `tsx` runtime (check with `npx tsx --version`) |

If dependencies are missing, prompt the user to install:

```bash
npm install && npm run setup
```
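
For the Android prerequisite, the output of `adb devices` can be checked in a script rather than by eye. The following is a small sketch: `count_ready_devices` is a hypothetical helper that reads the command's output on stdin and counts only entries in the `device` state (entries such as `offline` or `unauthorized` are ignored).

```bash
# Hypothetical helper: count devices that `adb devices` reports as ready.
# Each device line has two fields: <serial> <state>.
count_ready_devices() {
  awk 'NF == 2 && $2 == "device" { n++ } END { print n + 0 }'
}
```

Typical usage would be `adb devices | count_ready_devices`; a result of `0` means no device is ready for Android tests.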

Step 1: Locate YAML File

Determine the path of the YAML file to execute. If the user does not specify the full path:

Check the most recently generated files in the `./midscene-output/` directory
Check template files in the `./templates/` directory
Check `.yaml` / `.yml` files in the current directory
Prompt the user to provide the file path
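
The lookup steps above can be sketched as a shell function. This is an illustration under stated assumptions: `locate_yaml` is a hypothetical name, the three locations are tried in the order listed, and picking the newest file within each location is an assumption (the document only says "most recently generated" for `./midscene-output/`).

```bash
# Hypothetical helper: print the first candidate YAML file, trying the
# locations in the order listed above. Returns 1 when nothing is found,
# which is the cue to ask the user for a path.
locate_yaml() {
  for dir in ./midscene-output ./templates .; do
    [ -d "$dir" ] || continue
    # ls -t sorts newest first; unmatched globs only produce stderr noise.
    candidate=$(ls -t "$dir"/*.yaml "$dir"/*.yml 2>/dev/null | head -n 1)
    if [ -n "$candidate" ]; then
      printf '%s\n' "$candidate"
      return 0
    fi
  done
  return 1
}
```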

Multi-file scenario: If the user requests to execute multiple files, execute them one by one and summarize the results.
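
Running files one by one and summarizing the results might look like the sketch below. It assumes the runner exits with a non-zero status when a file fails. `run_each` and the `MIDSCENE_RUNNER` override are hypothetical names introduced here for illustration; by default the function calls the project CLI used elsewhere in this document.

```bash
# Hypothetical helper: run each YAML file in turn and print a summary.
# MIDSCENE_RUNNER is an illustrative override, not a Midscene setting.
run_each() {
  runner=${MIDSCENE_RUNNER:-node scripts/midscene-run.js}
  total=0
  passed=0
  for file in "$@"; do
    total=$((total + 1))
    # $runner is intentionally unquoted so "node script.js" splits into words.
    if $runner "$file"; then
      passed=$((passed + 1))
      echo "PASS $file"
    else
      echo "FAIL $file"
    fi
  done
  echo "Total: $total, Passed: $passed, Failed: $((total - passed))"
  [ "$passed" -eq "$total" ]
}
```

For example, `run_each tests/login.yaml tests/search.yaml` keeps going after a failure and returns a non-zero status if any file failed.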

Step 2: Pre-Validation

Before execution, call the validator to check the YAML file:

```bash
node scripts/midscene-run.js <yaml-file> --dry-run
```

If validation fails, analyze the error cause and suggest fixes to the user:

| Common Error | Cause | Fix Suggestion |
| --- | --- | --- |
| YAML syntax error | Incorrect indentation or formatting issues | Check indentation, use 2 spaces consistently |
| Missing platform configuration | No web/android/ios/computer configuration | Add `web: { url: "..." }` |
| Missing tasks | No tasks defined | Add tasks array and flow |
| Undeclared engine | Used superset keywords but not marked | Add `engine: extended` |
| Undefined variable | Referenced an undeclared `${var}` | Declare variables in the variables section |
| Undeclared features | Extended mode did not list used features | Add `features: [...]` |
| Missing required fields in loop | repeat lacks count, while lacks condition | Add the corresponding required fields |
| Imported file does not exist | Incorrect path referenced in import | Check if the file path is correct |

Auto-Fix Flow: If the error can be fixed automatically (e.g., missing engine declaration), fix it directly and re-validate to avoid round-trip confirmation.

Step 3: Execution

Choose the execution method based on the project environment:

Method 1 (Recommended): Use Project CLI

If the project has `scripts/midscene-run.js` (complete midi-stagehand-skill project):

```bash
# Single file execution
node scripts/midscene-run.js <yaml-file> [options]

# Batch execution (glob pattern)
node scripts/midscene-run.js "tests/**/*.yaml"
```

Method 2: Use Midscene CLI Directly

If in an external project (without `scripts/midscene-run.js`), use `@midscene/web` directly:

```bash
# Installation (first time only)
npm install @midscene/web dotenv

# Execute with UI (recommended for debugging)
npx @midscene/web <yaml-file> --headed

# Batch execution (official CLI option)
npx @midscene/web "tests/**/*.yaml" --concurrent --continue-on-error
```

Note: The package name is `@midscene/web` (not `@midscene/cli`). The official CLI syntax is `npx @midscene/web <yaml-file>`; the CLI also supports the `run` subcommand (`npx @midscene/web run <yaml-file>`), and both forms are acceptable.
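
Choosing between the two methods can be reduced to a file check. The sketch below uses a hypothetical `pick_runner` helper and assumes it is called from the project root.

```bash
# Hypothetical helper: print the command prefix for the current project.
# Method 1 when the project CLI exists, otherwise Method 2.
pick_runner() {
  if [ -f scripts/midscene-run.js ]; then
    echo "node scripts/midscene-run.js"
  else
    echo "npx @midscene/web"
  fi
}
```

A call such as `$(pick_runner) test.yaml` then runs the file with whichever CLI is available. Note that the options listed for Method 1 apply only to the project CLI.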

Available Options (Method 1):

`--dry-run`: Only validate and transform, do not execute. It does not check model configuration; AI operations still require `MIDSCENE_MODEL_API_KEY` to be configured.
`--timeout <ms>`: Execution timeout (default 300000 = 5 minutes). Increase this value for long-running automation scenarios.
`--output-ts <path>`: Save the transformed TypeScript file (Extended mode only). Pair it with `--dry-run` when troubleshooting transpilation errors.
`--report-dir <path>`: Report output directory (default `./midscene-report`).
`--template puppeteer|playwright`: Select the TS template (default puppeteer; playwright suits scenarios requiring multi-browser compatibility).
`--verbose` / `-v`: Show detailed output (validation details, detection information, environment information).
`--help` / `-h`: Show help information.

Extended Mode Execution Flow:

YAML → Transpiler → TypeScript
TypeScript → tsx runtime → Puppeteer or Playwright (depending on `--template`) + Midscene SDK
Generate execution report

You can use `--output-ts` to save the intermediate TypeScript file for debugging:

```bash
node scripts/midscene-run.js test.yaml --output-ts ./debug-output.ts
```

Step 4: Analyze Results

After execution completes:

Success

Report execution summary (number of passed/failed tasks)
Inform the user of the report file location
If there are `aiQuery` results, display the extracted data
If `output` is exported, confirm the file generation location

Failure

Analyze errors and fix them according to the following decision tree:

What does the error message contain?
├─ "API key" / "401" / "Unauthorized"
│   → Model not configured. Set the MIDSCENE_MODEL_API_KEY environment variable or .env file
│
├─ "Timeout" / "exceeded"
│   ├─ Can the page open normally in the browser?
│   │   ├─ Yes → Page loads slowly, increase the timeout value (e.g., timeout: 30000)
│   │   └─ No → Check if the URL is correct and network is reachable
│   └─ Occurs in aiWaitFor? → Condition description may be inaccurate, check assertion text
│
├─ "Element not found" / "not found"
│   ├─ Failed on first attempt? → AI description is not precise enough, use more specific text descriptions
│   ├─ Succeeded before? → Page structure may have changed, compare with report screenshots
│   └─ Still failing? → Try deepThink: true or use xpath positioning instead
│
├─ "Assertion failed"
│   → View report screenshots, compare actual page state vs expected description, adjust aiAssert text
│
├─ "Navigation failed" / "net::ERR_"
│   → Check URL protocol (https://) and accessibility
│
├─ "Transpiler error"
│   → Use --dry-run --output-ts ./debug.ts to view generated code and troubleshoot syntax issues
│
├─ "Permission denied"
│   → Page requires login or special permissions, add login steps or cookie configuration
│
└─ "javascript" step error
    → Check JS code syntax, note API differences between browser environment and Node environment
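
The top level of the decision tree above can be sketched as a pattern match on the error text. This is an illustration, not the runner's own logic: `classify_error` and the category labels it prints are hypothetical, matching is case-insensitive, and the first matching branch wins in the same order as the tree.

```bash
# Hypothetical helper: map an error message to a branch of the tree above.
classify_error() {
  # Lowercase the message so matching is case-insensitive.
  msg=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
  case "$msg" in
    *"api key"*|*401*|*unauthorized*)   echo "model-config" ;;
    *timeout*|*exceeded*)               echo "timeout" ;;
    *"not found"*)                      echo "element-not-found" ;;
    *"assertion failed"*)               echo "assertion" ;;
    *"navigation failed"*|*net::err_*)  echo "navigation" ;;
    *"transpiler error"*)               echo "transpiler" ;;
    *"permission denied"*)              echo "permission" ;;
    *javascript*)                       echo "javascript-step" ;;
    *)                                  echo "unknown" ;;
  esac
}
```

The sub-questions in each branch (for example, whether the page opens normally in a browser) still need a human or agent judgment; the function only selects the branch.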

Iterative Fix Flow:

Analyze error cause
Modify the YAML file
Re-run `--dry-run` for validation
Re-execute after validation passes
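
The last two steps of this flow (validate, then execute only if validation passes) can be sketched as a small wrapper. It assumes the project CLI exits with a non-zero status when `--dry-run` validation fails. `validate_then_run` and the `MIDSCENE_RUNNER` override are hypothetical names used for illustration.

```bash
# Hypothetical helper: execute a YAML file only after --dry-run passes.
# MIDSCENE_RUNNER is an illustrative override, not a Midscene setting.
validate_then_run() {
  runner=${MIDSCENE_RUNNER:-node scripts/midscene-run.js}
  # $runner is intentionally unquoted so "node script.js" splits into words.
  if ! $runner "$1" --dry-run; then
    echo "Validation failed; fix $1 and retry" >&2
    return 1
  fi
  $runner "$1"
}
```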

Step 5: Report Interpretation

Interpret the report generated by Midscene:

Reports are saved in the `./midscene-report/` directory by default
HTML Report: Open in a browser; each step shows its execution status and screenshots (green ✓ = passed, red ✗ = failed), and can be clicked to expand details
JSON Report: Structured data containing the status, duration, and screenshot path of each step, suitable for automatic parsing in CI/CD
Screenshot paths are relative to the report directory
Custom screenshots generated by the `recordToReport` step are also included in the report

Report summary format:

Total : N
Passed: N
Failed: N
Status: passed|failed
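
In CI it can be convenient to turn this summary into an exit status. The sketch below assumes the summary text in the format above is available on stdin (for example, piped from the runner's output, which is an assumption about where the summary is printed); `failed_count` is a hypothetical helper.

```bash
# Hypothetical helper: print the number after "Failed:" from a summary
# read on stdin. Exits 1 when no "Failed" line is present.
failed_count() {
  # The separator tolerates spaces around the colon, as in "Total : N".
  awk -F' *: *' '$1 == "Failed" { print $2 + 0; found = 1 }
                 END { if (!found) exit 1 }'
}
```

A pipeline step could then compare the printed value against `0` to decide whether to fail the build.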

Quick Execution Command Reference

Using Project CLI (Complete Project)

```bash
# Basic execution
node scripts/midscene-run.js test.yaml

# Validate only, no execution
node scripts/midscene-run.js test.yaml --dry-run

# Save generated TS (Extended mode only)
node scripts/midscene-run.js test.yaml --output-ts ./output.ts

# Use Playwright template
node scripts/midscene-run.js test.yaml --template playwright

# Specify report directory
node scripts/midscene-run.js test.yaml --report-dir ./reports

# Set timeout to 10 minutes
node scripts/midscene-run.js test.yaml --timeout 600000

# Validate + save TS (troubleshoot transpilation issues)
node scripts/midscene-run.js test.yaml --dry-run --output-ts ./debug.ts

# View help
node scripts/midscene-run.js --help
```

Using Midscene CLI Directly (External Project)

```bash
# Execute with UI (recommended for debugging)
npx @midscene/web test.yaml --headed

# Headless execution (recommended for CI/CD)
npx @midscene/web test.yaml
```

YAML Configuration Quick Reference

Agent Configuration (Optional)

```yaml
agent:
  testId: "test-001"
  groupName: "Regression Test"
  groupDescription: "Daily regression test suite"
  cache: true
```

continueOnError (Optional)

Continue executing subsequent tasks after a task fails:

```yaml
tasks:
  - name: Task A
    continueOnError: true
    flow: [...]
  - name: Task B (executes even if A fails)
    flow: [...]
```

Platform Configuration Options

```yaml
# Full Web platform configuration
web:
  url: "https://example.com"
  headless: false       # true = headless mode (suitable for CI/CD); false = UI mode (suitable for debugging)
  viewportWidth: 1920   # Default 1280; use 375 for mobile simulation
  viewportHeight: 1080  # Default 720; use 667 for mobile simulation
  userAgent: "Custom User Agent"
  waitForNetworkIdle:
    timeout: 2000
    continueOnNetworkIdleError: true

# Android platform (ensure device is connected via adb first)
android:
  deviceId: "emulator-5554"   # Device ID from adb devices output
# Use launch: "com.example.app" in flow to start the app

# iOS platform (WebDriverAgent must be configured first)
ios:
  wdaPort: 8100              # WebDriverAgent port
  wdaHost: "localhost"       # WebDriverAgent host
# Use launch: "com.example.app" in flow to start the app
```

Debugging Tips

View Report Screenshots: Check the HTML report after execution; each step has a screenshot
Execute in Segments: Verify the first few steps first, then add more steps gradually
Add Waits: Add `aiWaitFor` after key steps to ensure the page is ready
Insert Assertions: Insert `aiAssert` in intermediate steps to verify the current state
View TS Code: Use `--output-ts` in Extended mode to view the generated code for troubleshooting
Use deepThink: Enable `deepThink: true` when element positioning is inaccurate
Fall Back to xpath: Use xpath for precise selection when natural language positioning fails
Use javascript: Execute JS code directly via a `javascript` step to debug page state
Use recordToReport: Insert `recordToReport` at key points to capture screenshots for the record

Notes

Chrome/Chromium browser is required to execute Web platform tests
Dependencies may need to be installed on first run: `npm install`
Android tests require an ADB-connected device; iOS tests require WebDriverAgent
YAML in Extended mode is converted to TypeScript before execution, which requires the tsx runtime
Screenshot paths in reports are relative to the report directory
If you need to generate a new YAML file, you can use the Midscene YAML Generator skill
Environment variables are passed via the system environment or a `.env` file and referenced in YAML using `${ENV:NAME}` or `${ENV.NAME}` (both syntaxes are equivalent)
`parallel` branches run in independent browser contexts and do not affect each other during execution; `aiQuery` results from each branch can be accessed together after all branches complete (via `merge_results: true`)
`--dry-run` only checks YAML syntax and structure; it does not detect model configuration or network reachability
If `npx skills check` does not detect existing updates, the lock file format may be outdated (v1); reinstall to upgrade to the v3 format: `npx skills add https://github.com/lee-117/midi-stagehand-skill -a claude-code`

Collaboration Agreement

When collaborating with Generator Skill:

Priority Check: Check the most recently generated files in the `./midscene-output/` directory first
On Execution Failure: Provide structured error information: error type, error location (step/line number), suggested fix
If the error can be fixed by modifying the YAML file, modify it directly and re-execute (no need to call back to the Generator)
