create_e2e_tests


Create Local YAML Tests


A spec-driven workflow that front-loads testing expertise through structured planning before any tests are written. Tests run with `npx shiplight test --headed` — no cloud infrastructure required.

When to use


Use `/create_e2e_tests` when the user wants to:
  • Create a new local test project from scratch
  • Add YAML tests for a web application
  • Set up authentication for a test project
  • Plan what to test before writing tests

Principles


  1. Always produce artifacts. Every phase writes a markdown file. Artifacts clarify your own thinking, give the user something to review, and guide later phases. When the user provides detailed requirements, use them as source material — skip questions already answered, but still produce the artifact.
  2. Confirm before implementing. Present the spec (Phase 2 checkpoint) for user confirmation before spending time on browser-walking and test writing. Echo back your understanding as structured scenarios to catch mismatches early.
  3. Each phase reads the previous phase's artifact. Discover feeds Specify, Specify feeds Plan, Plan feeds Implement, Implement feeds Verify. If an artifact exists from a prior run, offer to reuse it.
  4. Escalate, don't loop. When something fails or is ambiguous, report it and ask the user rather than retrying silently.

Phase Overview


Phase 1: Discover  → test-strategy.md    (understand the app & user goals)
Phase 2: Specify   → test-spec.md        (define what to test in Given/When/Then)
Phase 3: Plan      → test-plan.md        (prioritize, structure, per-test guidance)
Phase 4: Implement → *.test.yaml files   (setup project, write tests, run them)
Phase 5: Verify    → updated spec files  (coverage check, reconcile spec ↔ tests)

Fast-Track


Check for existing artifacts before starting. The only way to skip artifact generation is if the user explicitly says so.

| Situation | Behavior |
| --- | --- |
| User explicitly says "skip to implement" or "just write the tests" | Phase 4 only |
| Existing `test-specs/test-strategy.md` | Offer to reuse, skip Phase 1 |
| Existing `test-specs/test-spec.md` | Offer to reuse, skip Phases 1-2 |
| Existing `test-specs/test-plan.md` | Offer to reuse, skip to Phase 4 |


Phase 1: Discover


Goal: Understand the application, the user's role, and what matters most to test.
Output: `<project>/test-specs/test-strategy.md`

Steps


  1. Get project path — ask where to create the test project (e.g., `./my-tests`). All artifacts and tests will live here. Create the `test-specs/` directory.
     If cloud MCP tools are available (`SHIPLIGHT_API_TOKEN` is set), use the `/cloud` skill to fetch environments and test accounts — this can pre-fill the target URL and credentials.
  2. Silent scan — before asking questions, gather context from what's available:
     • Codebase: routes, components, `package.json`, framework
     • Git branch diff (what changed recently)
     • Existing tests (what's already covered)
     • PRDs, docs, README files
     • Cloud environments (if cloud MCP tools available)
  3. Understand what to test — ask the user what they'd like to test, then ask targeted follow-up questions (one at a time, with recommendations based on your scan) to fill gaps: risk areas, user roles, authentication, data strategy, critical journeys. Skip questions the user has already answered.
  4. Write `test-strategy.md` containing:
     • App profile: name, URL, framework, key pages/features
     • Risk profile: what matters most, what's fragile
     • Testing scope: what's in/out, user roles to cover
     • Data strategy: how test data will be created and cleaned up
     • Environment: target URL, auth method, any special setup


Phase 2: Specify


Goal: Define concrete test scenarios in structured Given/When/Then format, prioritized by risk. Surface ambiguities that would cause flaky or incomplete tests.
Input: reads `test-specs/test-strategy.md`
Output: `<project>/test-specs/test-spec.md`

Steps


  1. Read `test-strategy.md` to understand scope and priorities.
  2. Generate user journey specs — for each critical journey, write:
     • Title: descriptive name (e.g., "New user signup with email verification")
     • Priority: P0 (must-have), P1 (should-have), P2 (nice-to-have)
     • Preconditions: what must be true before the test starts (Given)
     • Happy path: step-by-step actions and expected outcomes (When/Then)
     • Edge cases: at least 2 per journey (e.g., invalid input, timeout, empty state)
     • Data requirements: what test data is needed
  3. Review for testing risks — scan each journey for issues that would cause flaky or incomplete tests: data dependencies, timing/async behavior, dynamic content, auth boundaries, third-party services, state isolation, environment differences. Add a Testing Notes section to each journey with identified risks and mitigations. If anything is ambiguous, ask the user (one at a time, with a recommended answer and impact statement).
  4. Write `test-spec.md` with all journey specs.
  5. Checkpoint — present a summary table for user review:

     | # | Journey | Priority | Steps | Edge Cases | Risks |
     | --- | --- | --- | --- | --- | --- |
     | 1 | User signup | P0 | 5 | 3 | Timing |
     | 2 | ... | ... | ... | ... | ... |

     Ask: "Does this look right? Any journeys to add, remove, or reprioritize?"
     Wait for user confirmation before proceeding.


Phase 3: Plan

阶段3:规划

Goal: Create an actionable implementation plan with per-test guidance.
Input: reads `test-specs/test-spec.md`
Output: `<project>/test-specs/test-plan.md`

Steps


  1. Read `test-spec.md`.
  2. Define test file structure — map journeys to test files:

     tests/
     ├── auth.setup.ts           (if auth needed)
     ├── signup.test.yaml        (Journey 1)
     ├── checkout.test.yaml      (Journey 2)
     └── ...

  3. Set implementation order — ordered by:
     • Dependencies first (auth setup before authenticated tests)
     • Then by priority (P0 before P1)
     • Then by risk (highest risk first)
  4. Per-test guidance — for each test file, specify:
     • Data strategy: what data to create/use, cleanup approach
     • Wait strategy: where to use WAIT_UNTIL vs WAIT, expected loading points
     • Flakiness risks: specific things to watch for in this test
  5. Write `test-plan.md`.
  6. Checkpoint — present summary: "Ready to implement N test files. Shall I proceed?"


Phase 4: Implement


Goal: Set up the project and write all YAML tests guided by the plan.
Input: reads `test-specs/test-plan.md`

Setup


Skip any steps already done (project exists, deps installed, auth configured).
  1. Configure AI provider — check if the test project already has a `.env` with an AI API key. If not, ask the user to choose a provider:

     To run YAML tests, I need an AI provider for resolving test steps. Which provider would you like to use?
     A) Google AI — `GOOGLE_API_KEY` (Get key) — default model: `gemini-3.1-flash-lite-preview`
     B) Anthropic — `ANTHROPIC_API_KEY` (Get key) — default model: `claude-haiku-4-5`
     C) OpenAI — `OPENAI_API_KEY` (Get key) — default model: `gpt-5.4-mini`
     D) Azure OpenAI — requires `AZURE_OPENAI_API_KEY` + `AZURE_OPENAI_ENDPOINT` — set `WEB_AGENT_MODEL=azure:<deployment>`
     E) AWS Bedrock — uses AWS credential chain — set `WEB_AGENT_MODEL=bedrock:<model_id>`
     F) Google Vertex AI — uses GCP Application Default Credentials — set `WEB_AGENT_MODEL=vertex:<model>`
     G) I already have it configured

     After the user chooses, ask for their API key and save it to the test project's `.env` file. For A/B/C, the model is auto-detected from the key. For D/E/F, also save `WEB_AGENT_MODEL` with the appropriate `provider:model` prefix. Optionally, the user can set `WEB_AGENT_MODEL` to override the default model (e.g., `WEB_AGENT_MODEL=claude-sonnet-4-6`).
  2. Scaffold the project — call `scaffold_project` with the absolute project path. This creates `package.json`, `playwright.config.ts`, `.env.example`, `.gitignore`, and `tests/`. Save the API key to `.env`.
  3. Install dependencies:

     ```bash
     npm install
     npx playwright install chromium
     ```

  4. Set up authentication (if needed) — follow the standard Playwright authentication pattern.
     Add credentials as variables in `playwright.config.ts`:

     ```ts
     {
       name: 'my-app',
       testDir: './tests/my-app',
       dependencies: ['my-app-setup'],
       use: {
         baseURL: 'https://app.example.com',
         storageState: 'tests/my-app/.auth/storage-state.json',
         variables: {
           username: process.env.MY_APP_EMAIL,
           password: { value: process.env.MY_APP_PASSWORD, sensitive: true },
           // otp_secret_key: { value: process.env.MY_APP_TOTP_SECRET, sensitive: true },
         },
       },
     },
     ```

     Standard variable names: `username`, `password`, `otp_secret_key`. Use `{ value, sensitive: true }` for secrets. Add values to `.env`.
     Write `auth.setup.ts` with standard Playwright login code. For TOTP, implement RFC 6238 using `node:crypto` (HMAC-SHA1 + base32 decode) — no third-party dependency needed.
     Verify auth before proceeding. Run `npx shiplight test --headed` to execute the auth setup and confirm it saves `storage-state.json`. If it fails, escalate to the user — auth is a prerequisite for everything else.
     If the test plan involves special auth requirements (e.g., one account per test, multiple roles), confirm the auth strategy with the user before proceeding.
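The RFC 6238 step can be done with nothing but `node:crypto`. The sketch below is a minimal illustration for `auth.setup.ts`, not shiplight's own helper; the function names and the commented usage line are ours:

```typescript
import { createHmac } from "node:crypto";

// Decode an RFC 4648 base32 string (padding optional) into raw key bytes.
function base32Decode(input: string): Buffer {
  const alphabet = "ABCDEFGHIJKLMNOPQRSTUVWXYZ234567";
  let bits = 0;
  let value = 0;
  const bytes: number[] = [];
  for (const ch of input.replace(/=+$/, "").toUpperCase()) {
    const idx = alphabet.indexOf(ch);
    if (idx === -1) throw new Error(`invalid base32 character: ${ch}`);
    value = (value << 5) | idx; // accumulate 5 bits per character
    bits += 5;
    if (bits >= 8) {            // emit a byte once 8 bits are buffered
      bits -= 8;
      bytes.push((value >>> bits) & 0xff);
      value &= (1 << bits) - 1; // drop the consumed high bits
    }
  }
  return Buffer.from(bytes);
}

// RFC 6238 TOTP: HMAC-SHA1 over the 30-second time counter, then dynamic truncation.
function totp(secretBase32: string, nowMs: number = Date.now()): string {
  const counter = Buffer.alloc(8);
  counter.writeBigUInt64BE(BigInt(Math.floor(nowMs / 1000 / 30)));
  const digest = createHmac("sha1", base32Decode(secretBase32)).update(counter).digest();
  const offset = digest[digest.length - 1] & 0x0f;
  const code = (digest.readUInt32BE(offset) & 0x7fffffff) % 1_000_000;
  return code.toString().padStart(6, "0");
}

// Usage inside auth.setup.ts (label is illustrative):
// await page.getByLabel('One-time code').fill(totp(process.env.MY_APP_TOTP_SECRET!));
```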

Write tests


For each test in the plan (or each test the user wants):
  1. Open a browser session — call `new_session` with the app's `starting_url`.
  2. Walk through the flow — use `inspect_page` to see the page, then `act` to perform each action. This captures locators from the response.
  3. Capture locators — use `get_locators` for additional element info when needed.
  4. Build the YAML — construct the `.test.yaml` content following the best practices below.
  5. Save and validate — write the `.test.yaml` file, then call `validate_yaml_test` with the file path to check locator coverage (minimum 50% required).
  6. Close the session — call `close_session` when done.
Important: Do NOT write YAML tests from imagination. Always walk through the app in a browser session first to capture real locators. Tests without locators are rejected by `validate_yaml_test`.
When guided by `test-plan.md`:
  • Apply the specified wait strategy at loading points
  • Cover the edge cases and assertions defined in the spec

Run tests


After writing all tests, run them:

```bash
npx shiplight test --headed
```

When a test fails:
  1. Report — tell the user which test failed and why (one sentence).
  2. Classify the failure:
     • Implementation fix (wrong locator, missing wait, timing) → fix and retry.
     • Spec mismatch (app behavior differs from spec) → ask the user whether to update the spec or skip the scenario.
  3. Escalate if a fix doesn't work — don't keep retrying the same approach.


Phase 5: Verify


Goal: Validate test coverage against the spec and reconcile any drift.
Input: reads `test-specs/test-spec.md`, `test-specs/test-plan.md`, and all `.test.yaml` files
This phase only runs when spec artifacts exist.

Coverage check

覆盖率检查

For each spec journey, confirm the test covers the happy path and all listed edge cases.
Present a coverage summary:

| Spec Journey | Priority | Scenarios Specified | Tests Written | Coverage |
| --- | --- | --- | --- | --- |
| User signup | P0 | 4 | 4 | ✓ |
| Checkout | P0 | 3 | 2 | ✗ — edge case "empty cart" not covered |

Flag gaps and extras (test steps not in the spec).

Reconcile


Update spec artifacts to match what was actually implemented:
  1. Update `test-spec.md` — mark skipped scenarios with reason, add scenarios that emerged during implementation, update edge cases to reflect what was tested
  2. Update `test-plan.md` — correct file structure, note deviations from the original plan
  3. Show diff summary — tell the user what changed and why
This keeps artifacts accurate for future test maintenance and expansion.


YAML Format Reference


Read the MCP resource `shiplight://yaml-test-spec-v1.3.0` for the full language spec (statement types, templates, variables, suites, hooks, parameterized tests).
Read the MCP resource `shiplight://schemas/action-entity` for the full list of available actions and their parameters.

YAML Authoring Best Practices


These best practices bridge the YAML language spec and the action catalog to help you write fast, reliable tests.

Statement type selection


  • ACTION is the default. Capture locators via MCP tools (`act`, `get_locators`) during browser sessions, then write ACTION statements. ACTIONs replay deterministically (~1s).
  • DRAFT is a last resort. Only use DRAFT when the locator is genuinely unknowable at authoring time. DRAFTs are slow (~5-10s each, AI resolution at runtime). Tests with too many DRAFTs are rejected by `validate_yaml_test`.
  • VERIFY for assertions. Use `VERIFY:` for all assertions. Do not write assertion DRAFTs like "Check that the button is visible".
  • URL for navigation. Use `URL: /path` for navigation instead of `action: go_to_url`.
  • CODE for scripting. Use `CODE:` for network mocking, localStorage manipulation, page-level scripting. Not for clicks, assertions, or navigation.
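Put together, a short flow using the preferred statement types might look like the following sketch; the route, labels, and locators are illustrative, not from a real app:

```yaml
test: Log in with valid credentials
steps:
  # URL for navigation (not action: go_to_url)
  - URL: /login
  # ACTION with a captured locator replays deterministically
  - intent: Fill the email field with the test username
    action: fill
    locator: "getByLabel('Email')"
    value: "{{username}}"
  - intent: Click the Log in button to submit the form
    action: click
    locator: "getByRole('button', { name: 'Log in' })"
  # VERIFY for the assertion, not a DRAFT
  - VERIFY: The dashboard welcome message is visible
```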

The `intent` field

`intent` states what the step should accomplish — the goal, not the mechanics. The `action`/`locator` or `js` fields are caches of how to do it. When a cache fails (stale locator, changed DOM), the AI agent uses `intent` to re-inspect the page and regenerate the action from scratch.
Because `intent` drives self-healing, it must be specific enough for an agent to act on without any other context. Describe the user goal, not the DOM element — avoid element indices, CSS selectors, or positional references that break when the UI changes:
```yaml
# BAD: vague, agent can't re-derive the action
- intent: Click button

# BAD: tied to DOM structure that can change
- intent: Click the 3rd button in the form
- intent: Click element at index 42

# GOOD: describes the user goal, stable across UI changes
- intent: Click the Submit button to save the new project
  action: click
  locator: "getByRole('button', { name: 'Submit' })"
```

ACTION: structured format vs `js:` shorthand

Use structured format by default for all supported actions. Read the MCP resource `shiplight://schemas/action-entity` for the full list of available actions and their parameters.
Use `js:` only when the action doesn't map to a supported action — e.g., complex multi-step interactions, custom Playwright API calls, or chained operations:

```yaml
- intent: Drag slider to 50% position
  js: "await page.getByRole('slider').first().fill('50')"

- intent: Wait for network idle after form submit
  js: "await page.waitForLoadState('networkidle')"
```

`js:` coding rules

  • Always resolve locators to a single element (e.g., `.first()`, `.nth(1)`) to avoid Playwright strict-mode errors
  • Always include `{ timeout: 5000 }` on actions for predictable timing
  • The `intent` is critical — it's the input for self-healing when `js` fails
  • `page`, `agent`, and `expect` are available in scope
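For example, a `js:` step that follows these rules (the banner and button name are illustrative):

```yaml
- intent: Accept the cookie-consent banner so it stops covering the page
  js: "await page.getByRole('button', { name: 'Accept' }).first().click({ timeout: 5000 })"
```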

VERIFY best practices


  • Always set a short timeout (e.g., `{ timeout: 2000 }`) on `js:` assertions that have an AI fallback, so stale locators fall back to AI quickly instead of waiting the default 5s
  • Always use the `VERIFY:` shorthand — do not use `action: verify` directly
  • Be aware of false negatives with `js:` assertions. The AI fallback only triggers when `js` throws (element not found, timeout). If `js` passes against the wrong element (stale selector matching a different element), the assertion silently succeeds — no fallback occurs. Keep `js:` assertions simple and specific to minimize this risk.
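As a sketch of a `js:` assertion that keeps the fallback fast (the button name is illustrative):

```yaml
- VERIFY: The Submit button is disabled until the form is valid
  js: |
    await expect(page.getByRole('button', { name: 'Submit' }))
      .toBeDisabled({ timeout: 2000 });
```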

IF/WHILE `js:` condition best practices

  • Use natural language (AI) conditions for DOM-based checks (element visible, text present, page state). AI conditions self-heal against DOM changes; `js:` conditions are brittle and cannot auto-heal.
  • Use `js:` conditions only for counter/state logic — e.g., `js: counter++ < 10`, `js: retryCount < 3`. Never use `js:` for DOM inspection like `js: document.querySelector('.modal') !== null`.
  • If you need a JavaScript-based DOM check, use `CODE:` to evaluate it and store the result, or use `VERIFY:` with `js:` (which at least has an AI fallback on failure).

Waiting syntax


  • `WAIT_UNTIL:` — AI checks the condition repeatedly until met or timeout. Default timeout is 60 seconds. Each AI check takes 5–10s, so set `timeout_seconds` to at least 15.
  • `WAIT:` — fixed-duration pause. Use `seconds:` to set duration.
See Smart waiting in E2E Test Design for when to use each.
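As an illustrative sketch (the condition text is made up, and the placement of `timeout_seconds` and `seconds:` as sibling keys is an assumption based on the fields named above):

```yaml
# AI-checked condition with an explicit timeout
- WAIT_UNTIL: The search results list has finished loading
  timeout_seconds: 20

# fixed pause
- WAIT:
  seconds: 2
```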

General conventions


  • Put `intent` first in ACTION statements for readability
  • `xpath` is only needed when an ACTION has neither `locator` nor `js`.
  • Single-test vs Suite vs Parameters:
    • Single-test file — one isolated test, no shared state
    • Suite — tests that have sequential dependencies (e.g., test A creates a file, test B consumes it). Each test in a suite still covers one journey — the suite just guarantees execution order and shares browser state. Do NOT use suites to bundle unrelated tests.
    • Parameters — same test structure, different data inputs

E2E Test Design Best Practices


These principles govern what to test and how to structure tests — independent of the YAML format. Apply them during Phase 2 (Specify) and Phase 4 (Implement).

Test isolation


Each test must run independently — never depend on another test's side effects, execution order, or leftover state. If a test needs data, it creates that data itself.
```yaml
# BAD: depends on a previous test having created "My Project"
test: Delete a project
steps:
  - URL: /projects
  - intent: Click on "My Project"
    action: click
    locator: "getByText('My Project')"
  - intent: Click the Delete button
    action: click
    locator: "getByRole('button', { name: 'Delete' })"

# GOOD: creates its own data, then tests the behavior
test: Delete a project
steps:
  - CODE:
    js: |
      const res = await page.request.post('/api/projects', {
        data: { name: 'Delete-Test-' + Date.now() },
      });
      const project = await res.json();
      save_variable('projectName', project.name);
  - URL: /projects
  - WAIT_UNTIL: The project list has loaded
  - intent: Click on the project we just created
    action: click
    js: "await page.getByText('{{projectName}}').click()"
  - intent: Click the Delete button
    action: click
    locator: "getByRole('button', { name: 'Delete' })"
  - VERIFY: The project is no longer visible in the list
```

One journey per test


Each test should verify one logical user journey. If step 3 of 8 fails, steps 4-8 give you zero information. Split long flows into focused tests.
Exception: Suites allow sequential dependencies between tests (e.g., test A uploads a file, test B downloads it). Each test in a suite still covers one journey — the suite just guarantees order and shares browser state.
```yaml
# BAD: tests login, settings change, AND deletion in one test
test: Full user lifecycle
steps:
  - intent: Log in
  - intent: Navigate to settings
  - intent: Change display name
  - VERIFY: Name updated
  - intent: Navigate to account
  - intent: Delete account
  - VERIFY: Account deleted

# GOOD: separate tests, each verifiable in isolation

# File: update-display-name.test.yaml
test: Update display name from settings
steps:
  - URL: /settings
  - intent: Clear the display name field and type "New Name"
    action: fill
    locator: "getByLabel('Display name')"
    value: "New Name"
  - intent: Click Save
    action: click
    locator: "getByRole('button', { name: 'Save' })"
  - VERIFY: Success message "Settings saved" is visible

# File: delete-account.test.yaml (separate test)
test: Delete account from account page
steps:
  - URL: /account
  # ... focused on deletion only
```

Assert what users see, not implementation details


Test visible outcomes — text, navigation, enabled/disabled states. Never assert CSS classes, data attributes, internal state, or DOM structure.
```yaml
# BAD: asserts implementation details
- VERIFY:
  js: |
    const el = await page.locator('.btn-primary');
    await expect(el).toHaveClass(/disabled/);
    await expect(el).toHaveAttribute('data-state', 'submitted');

# GOOD: asserts what a user would observe
- VERIFY: The Submit button is disabled
  js: |
    await expect(page.getByRole('button', { name: 'Submit' }))
      .toBeDisabled({ timeout: 2000 });
```

Focused assertions


Verify the one thing that proves the feature works. Over-asserting makes tests brittle — they break on cosmetic changes unrelated to the behavior under test.
```yaml
# BAD: asserts every field on the page (breaks when any label changes)
- VERIFY: Page title is "Dashboard"
- VERIFY: Welcome message shows username
- VERIFY: Sidebar has 5 menu items
- VERIFY: Footer shows current year
- VERIFY: Avatar image is loaded
- VERIFY: Notification bell is visible

# GOOD: asserts the one thing that proves the user landed on the dashboard
- VERIFY: Dashboard page shows the welcome message with the user's name
```

Never test third-party services


Don't assert that Stripe's checkout, Google OAuth's consent screen, or Twilio's SMS delivery works. Mock external services at the network boundary. Test your integration, not their UI.
```yaml
# BAD: tests Stripe's UI (will break when Stripe updates their page)
- intent: Enter card number in Stripe iframe
- intent: Click Stripe's pay button
- VERIFY: Stripe shows success checkmark

# GOOD: mock the payment API, test your success handling
- CODE:
    js: |
      await page.route('**/api/payments', route =>
        route.fulfill({
          status: 200,
          json: { status: 'succeeded', id: 'pi_mock' }
        })
      );
- intent: Click the Pay button
  action: click
  locator: "getByRole('button', { name: 'Pay' })"
- VERIFY: Order confirmation page shows "Payment successful"
```

Deterministic test data


Use unique identifiers per test run to avoid collisions. Never rely on hardcoded data that other tests or users might modify.
```yaml
# BAD: hardcoded name (collides if tests run in parallel or data persists)
- intent: Type "Test User" into the name field
  action: fill
  locator: "getByLabel('Name')"
  value: "Test User"

# GOOD: unique per run, no collisions
- CODE:
    js: "save_variable('testName', 'Test-User-' + Date.now());"
- intent: Type the generated name into the name field
  action: fill
  locator: "getByLabel('Name')"
  text: "{{testName}}"
```

Prefer API seeding over UI setup


When a test needs preconditions (a user exists, a project is created), set them up via API calls — not by clicking through the UI. UI setup is slow, flaky, and not what you're testing.
```yaml
# BAD: 10 UI steps just to set up data before the real test
- URL: /projects/new
- intent: Type project name
- intent: Select team
- intent: Click Create
- WAIT_UNTIL: Project page loads
# ... now the actual test starts

# GOOD: API seed in one step, then test the real behavior
- CODE:
    js: |
      const res = await page.request.post('/api/projects', {
        data: { name: 'Seed-' + Date.now(), team: 'engineering' }
      });
      const { slug } = await res.json();
      save_variable('projectSlug', slug);
- URL: /projects/{{projectSlug}}/settings
- WAIT_UNTIL: Settings page has loaded
# ... test starts immediately at the point that matters
```

Smart waiting


Use the right wait for the situation. `WAIT_UNTIL:` costs 5-10s per check (AI resolution), so it's overkill for short, predictable delays. `WAIT:` is fine when the delay is short and known. The anti-pattern is using `WAIT:` as a substitute for condition-based waiting when the delay is unpredictable.

```yaml
# BAD: guessing how long a data fetch takes (too short in CI, too long locally)
- WAIT: Wait for data to load
  seconds: 5
- VERIFY: The table shows results

# GOOD: condition-based wait for unpredictable operations
- WAIT_UNTIL: The data table has at least one row visible
  timeout_seconds: 30

# ALSO GOOD: short WAIT for known, fast delays (animations, transitions, debounce)
- intent: Type search query
  action: fill
  locator: "getByRole('searchbox')"
  value: "test"
- WAIT: Wait for debounce to fire
  seconds: 1
- VERIFY: Search suggestions are visible
```

Rule of thumb: if the delay is **predictable and under 5s** (animation, debounce, transition), use `WAIT:`. If the delay is **unpredictable** (API call, data loading, file processing), use `WAIT_UNTIL:`.

Test error states, not just happy paths


Real users hit errors. A test suite that only covers happy paths gives false confidence. For every critical journey, include at least one error/edge case test.
```yaml
# Covers: empty state, invalid input, network failure
test: Search handles no results gracefully
steps:
  - URL: /search
  - intent: Type a query that returns no results
    action: fill
    locator: "getByRole('searchbox')"
    value: "zzz_no_match_zzz"
  - intent: Submit the search
    action: click
    locator: "getByRole('button', { name: 'Search' })"
  - VERIFY: Empty state message "No results found" is displayed
  - VERIFY: The search box still contains the query (user can refine)
```

Design for parallel execution


Tests that modify shared global state (e.g., site-wide settings, the only admin account) can't safely run in parallel. Design around this:
  • Use unique, per-test data instead of shared fixtures
  • Avoid tests that change global configuration
  • If a test must modify shared state, document it and mark it for serial execution
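
As a sketch of the first point, a test can mint its own resource up front instead of touching a shared one. The `/api/projects` endpoint, field names, and badge text below are illustrative assumptions, not part of any real app; the `CODE:` and `save_variable` conventions match the examples above:

```yaml
# Safe to run in parallel: this test owns the project it modifies
test: Archive a project
steps:
  - CODE:
      js: |
        const res = await page.request.post('/api/projects', {
          // unique name per run and per test, so parallel runs never collide
          data: { name: 'Isolated-' + Date.now() }
        });
        const { slug } = await res.json();
        save_variable('slug', slug);
  - URL: /projects/{{slug}}
  - intent: Archive the project
  - VERIFY: The project shows an "Archived" badge
```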

Flaky test policy


A test that passes on retry is still broken. Never add retries to mask flakiness; find and fix the root cause:
  • Timing flake? → Add a proper `WAIT_UNTIL:` for the right condition
  • Data flake? → Use unique test data, add proper cleanup
  • Order flake? → The test has a hidden dependency on another test; make it self-contained
  • Environment flake? → Mock the unstable external service
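
For the timing case, the fix is mechanical: replace the guessed duration with the condition you were actually waiting for. A minimal before/after sketch (the export scenario is hypothetical):

```yaml
# Flaky: passes or fails depending on how fast the export finishes
- WAIT: Wait for export
  seconds: 3
- VERIFY: Download link is visible

# Fixed: wait on the condition itself, with a generous ceiling
- WAIT_UNTIL: The download link is visible
  timeout_seconds: 30
```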

Project Structure


```
my-tests/
├── test-specs/                   # Spec artifacts (version-controlled)
│   ├── test-strategy.md          # Phase 1: app & risk profile
│   ├── test-spec.md              # Phase 2: Given/When/Then scenarios
│   └── test-plan.md              # Phase 3: implementation plan
├── playwright.config.ts
├── package.json
├── .env                          # API keys + credentials (gitignored)
├── .gitignore
├── tests/
│   ├── public-app/               # No login needed
│   │   ├── search.test.yaml
│   │   └── filter.test.yaml
│   │
│   └── my-saas-app/              # Requires login
│       ├── auth.setup.ts         # Playwright login setup — you write this
│       ├── dashboard.test.yaml
│       └── settings.test.yaml
```
The `test-specs/` directory contains human-readable markdown artifacts that should be version-controlled. Do NOT add `test-specs/` to `.gitignore`.
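
The `auth.setup.ts` file referenced above is standard Playwright: log in once and persist the browser session so every test in that project starts authenticated. A minimal sketch; the selectors, env var names, post-login URL, and storage path are assumptions for your app:

```typescript
// auth.setup.ts: log in once, save the session for all tests in this project
import { test as setup } from '@playwright/test';

setup('authenticate', async ({ page }) => {
  await page.goto('/login');
  await page.getByLabel('Email').fill(process.env.TEST_USER_EMAIL!);
  await page.getByLabel('Password').fill(process.env.TEST_USER_PASSWORD!);
  await page.getByRole('button', { name: 'Log in' }).click();
  // Wait for the assumed post-login redirect before capturing state
  await page.waitForURL('**/dashboard');
  // Persist cookies + localStorage so other tests can reuse the session
  await page.context().storageState({ path: '.auth/user.json' });
});
```

In `playwright.config.ts`, this is typically wired up as a setup project that the `my-saas-app` project depends on; the `TEST_USER_*` variables come from the auto-discovered `.env`.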

Tips


  • ACTION statements with locators replay ~10x faster than DRAFTs. Always prefer ACTIONs.
  • Use `inspect_page` to understand page state. Always read the DOM file first: it provides the element indices needed for `act` and consumes far fewer tokens. Only view the screenshot when you specifically need visual information (layout, colors, images), as screenshots consume significantly more tokens than DOM.
  • Run a specific project's tests with: `npx shiplight test --headed my-saas-app/`
  • The `.env` file is auto-discovered by `shiplightConfig()`: no manual dotenv setup needed.