ui-verification


Overview


UI Verification covers two parallel modes against a live web app:
  1. Visual verification: checks whether the page matches its design specification. Translates design claims into CSS rule checks and runs them against the live DOM via getComputedStyle(). Deterministic; the browser's computed styles are the source of truth.
  2. Flow verification: checks whether user journeys complete correctly. Executes Gherkin scenarios from .feature files via Nova Act's act() (actions) and act_get() (assertions). Non-deterministic; results vary run-to-run with network timing and live UI shifts.
Both modes share the skill, the MCP server, and the browser session. A single run can produce both kinds of output, combined into one report. Either mode can also run alone, in which case the other mode's section is omitted from the report.
Each run produces:
  1. Structured artifacts — per-category JSON for visual; per-flow execution data for flows
  2. Annotated screenshots — red bounding boxes highlighting visual failures on the page
  3. Verification report — markdown combining a visual summary, a flow summary (per-flow status table), and links to per-flow detail reports
The verify_* tools are deterministic — no vision model, browser's computed styles are the source of truth. The compile and audit passes for visual are LLM-driven (best-effort), reconciling design intent against the app's actual structure. Flow steps are interpreted by Nova Act each run; flow runs are inherently non-deterministic.

Reconciliation inputs


Visual verification. The 5 compiled category files (.ui-verification/specs/*.md) are reconciled from up to three inputs:

| Input | What it provides | Required? |
| --- | --- | --- |
| design.md | Free-form design intent: tokens, prose, design language, component definitions | Required |
| The running app | Live DOM observed via start_browse + verify_*: real selectors, real computed values | Required for verification (informs selectors and validates rules) |
| App source code | Components, theme/tokens, CSS files (implementation truth) | Optional. When accessible, makes selectors deterministic and divergence classification more precise |

Flow verification. Flows are authored or generated as .feature files at .ui-verification/flows/:

| Input | What it provides | Required? |
| --- | --- | --- |
| .feature files | Gherkin scenarios with metadata header (flow ID, type, app URL, optional auth and cleanup) | Required for flow verification |
| The running app | Target of the scenarios; Nova Act executes against the live URL | Required |
| Auth credentials | Provided when a flow's # user: metadata declares a required login | Conditional |
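
As a rough illustration of the flow inputs above, the sketch below reads the metadata header of a .feature file and decides whether credentials are needed. It assumes the header is a run of leading comment lines shaped like "# key: value" (inferred from the # user: and # app: mentions in this document); the function names are hypothetical and the real header grammar is defined in the flow authoring reference.

```python
from typing import Dict


def parse_flow_metadata(feature_text: str) -> Dict[str, str]:
    """Collect leading '# key: value' comment lines from a .feature file.

    ASSUMPTION: the metadata header is a block of comment lines at the top
    of the file. Parsing stops at the first non-comment, non-blank line.
    """
    meta: Dict[str, str] = {}
    for line in feature_text.splitlines():
        stripped = line.strip()
        if not stripped:
            continue
        if not stripped.startswith("#"):
            break  # header ends where the Gherkin body begins
        # Split on the first colon only, so URLs in the value stay intact.
        key, sep, value = stripped[1:].partition(":")
        if sep and key.strip():
            meta[key.strip()] = value.strip()
    return meta


def needs_credentials(meta: Dict[str, str]) -> bool:
    """A flow needs auth credentials when its header declares '# user:'."""
    return "user" in meta
```

A flow whose header has no user entry would run without credentials under this reading.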
When source code is accessible (the user is a developer working on their own app), the agent should detect it and use it during compile, audit, and spec generation. Source can live at the project root (package.json + src/ at output_dir) OR one level deeper (e.g. output_dir/<package-name>/package.json, a workspace layout where the verifier opens at the workspace root and one or more app packages live as subdirs). Check both locations. When source is not available (verifying an external site, black-box check), the skill operates in DOM-only mode: selectors are best-effort, and the audit relies on design.md + DOM only.
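The two source locations described above can be probed with a few filesystem checks. This is a minimal sketch, not the skill's actual detection logic: the function name is hypothetical, and skipping hidden directories and node_modules is an added assumption to avoid false positives.

```python
from pathlib import Path
from typing import List


def find_source_roots(output_dir: str) -> List[Path]:
    """Return directories that look like app source roots.

    Checks output_dir itself (package.json + src/) and each immediate
    subdirectory (package.json, mirroring the workspace-layout example).
    An empty result means the run falls back to DOM-only mode.
    """
    root = Path(output_dir)
    found: List[Path] = []
    if (root / "package.json").is_file() and (root / "src").is_dir():
        found.append(root)
    if root.is_dir():
        for child in sorted(root.iterdir()):
            # ASSUMPTION: hidden dirs (e.g. .ui-verification/) and
            # node_modules are never app packages.
            if not child.is_dir() or child.name.startswith(".") or child.name == "node_modules":
                continue
            if (child / "package.json").is_file():
                found.append(child)
    return found
```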
Both layers grow as the app develops:
  • design.md grows via the Scribe (the user expresses design intent in chat → design.md edits)
  • The 5 category files ("5 mds") grow as the Compiler discovers more verifiable surfaces in the app and source
Each layer grows only via its own author path. The 5 mds are NEVER edited to record observations or freeze divergent live-site values (see the hard rules below).

Three names — don't conflate them


| Name | What it identifies |
| --- | --- |
| ui-verification | This skill (the agent's playbook) |
| nova-act-mcp | The MCP server providing browser + verify_* tools |
| .ui-verification/ | The artifact directory at the project root (compiled specs, assertion JSON, reports) |
These are three distinct things despite the overlapping names. The skill does NOT live inside the artifact dir; the artifact dir is at the project root, not inside the skill.

Capabilities


| Capability | Tool | Source File | What It Checks |
| --- | --- | --- | --- |
| Visual Style | verify_visual_style | specs/visual-style.md | Colors, typography, spacing, radii, shadows |
| Components | verify_components | specs/component-rules.md | Component presence, variants, props |
| Accessibility | verify_accessibility | specs/accessibility.md | ARIA roles, landmarks, heading hierarchy |
| Project Rules | verify_project_rules | specs/project-rules.md | Layout structure, spacing system, conventions |
| Platform Conventions | verify_platform_conventions | specs/platform-conventions.md | Navigation patterns, page structure |
| User Flows | act + act_get | flows/<flow-name>.feature | End-to-end user journeys, functional correctness |
Visual rules can be route-scoped: each category file may contain ## Scope: any (default) and ## Scope: route=<glob> sections. See references/spec_authoring.md. Flow scenarios target the URL declared in their # app: metadata; route scoping is per-flow rather than per-rule.
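To make the route scoping concrete, the sketch below groups a category file's lines by their ## Scope: heading and picks the sections that apply to a given route. It is illustrative only: the rule lines are placeholders, the function names are hypothetical, and using fnmatch-style globbing for route=<glob> is an assumption (the authoritative matching rules are in references/spec_authoring.md).

```python
import re
from fnmatch import fnmatch
from typing import Dict, List

SCOPE_HEADING = re.compile(r"^## Scope:\s*(.+?)\s*$")


def split_scopes(category_md: str) -> Dict[str, List[str]]:
    """Group the non-blank lines of a category file by '## Scope:' heading."""
    sections: Dict[str, List[str]] = {}
    current = "any"  # ASSUMPTION: content before any heading counts as 'any'
    for line in category_md.splitlines():
        match = SCOPE_HEADING.match(line)
        if match:
            current = match.group(1)
            sections.setdefault(current, [])
        elif line.strip():
            sections.setdefault(current, []).append(line)
    return sections


def scopes_for_route(sections: Dict[str, List[str]], route: str) -> List[str]:
    """Return the scope names whose rules apply to the given route."""
    applicable = []
    for scope in sections:
        if scope == "any":
            applicable.append(scope)
        elif scope.startswith("route=") and fnmatch(route, scope[len("route="):]):
            # ASSUMPTION: shell-style glob semantics for route patterns.
            applicable.append(scope)
    return applicable
```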

Available MCP Tools (19 total)


Session Management


  • start_browse(url, intent, browser_mode): open a URL, get a session_id. Use browser_mode="local" for verification.
  • session_close(session_id): terminate browser session
  • session_list(): list active sessions

Verification: Visual only (all require session_id and rules JSON; ALWAYS pass output_dir = absolute path to project root)


  • verify_visual_style(session_id, rules, output_dir)
  • verify_components(session_id, rules, output_dir)
  • verify_accessibility(session_id, rules, output_dir)
  • verify_project_rules(session_id, rules, output_dir)
  • verify_platform_conventions(session_id, rules, output_dir)
These are the visual-mode verification tools. For flow verification, act() and act_get() (below) are the primary drivers; verify_* doesn't apply to Gherkin steps.
If output_dir is omitted, the server writes assertions to a /tmp/ temp dir and downstream report/annotation steps can't find them.

Browser Interaction (all require session_id)


  • navigate(session_id, url): go to URL
  • click(session_id, selector): click element
  • scroll(session_id, direction, selector?): scroll up/down
  • hover(session_id, selector): hover element
  • press_key(session_id, key): keyboard input
  • type_text(session_id, selector, text, clear_first?): type into input

Content & Capture (all require session_id)


  • evaluate_js(session_id, script): run JavaScript in page context
  • get_page_content(session_id, format?): page as "text" or "html"
  • screenshot(session_id, destination?): capture viewport to file path

Natural Language (all require session_id)


  • act(session_id, prompt): instruct browser actions (scroll, click, navigate, fill forms). For flow verification, this is the primary driver of Given and When steps. For visual verification, do NOT use act() for CSS checks; use verify_* instead.
  • act_get(session_id, prompt, schema?): structured data extraction or state verification. For flow verification, this is the primary driver of Then and And (after Then) assertions; supplement with evaluate_js for deterministic checks. For visual verification, do NOT use act_get() for CSS checks: perception/reasoning over the page is the agent's job using screenshot + get_page_content, and CSS verdicts come from verify_*.

Artifact Structure


<project_root>/
  visual/design.md                   ← visual source spec (or .ui-verification/design.md)
  .ui-verification/
    .integrity.json                  ← compile-state ledger (visual only — see spec_sync.md)
    specs/                           ← compiled visual category files (INPUT to verify_*)
      visual-style.md                  (clean markdown — integrity tracked in .integrity.json)
      component-rules.md
      accessibility.md
      project-rules.md
      platform-conventions.md
    flows/                           ← flow .feature files (INPUT to act() / act_get())
      <flow-name>.feature
    sessions/                        ← per-session output (MCP-owned)
      <session_id>/
        <category>_assertions.json    (visual assertion JSON, write-once)
    reports/                         ← per-run output (skill-owned)
      <YYYYMMDD-HHmmssZ>/            ← UTC run-timestamp (a run can span multiple sessions)
        report.md                    ← combined visual + flow summary
        screenshots/                 ← visual annotated failures
        flow-reports/                ← per-flow reports
          <flow-name>.report.md
        sessions.json                ← manifest of session IDs in this run
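
The layout above implies a few path conventions that are easy to get wrong by hand. The helpers below are a sketch under stated assumptions: the function names are hypothetical, and the category string is passed through as-is because the exact spelling used in <category>_assertions.json is decided by the MCP server.

```python
from datetime import datetime, timezone
from pathlib import Path


def run_timestamp(now: datetime) -> str:
    """Format a run directory name as <YYYYMMDD-HHmmssZ> in UTC.

    Expects a timezone-aware datetime; it is converted to UTC first.
    """
    return now.astimezone(timezone.utc).strftime("%Y%m%d-%H%M%SZ")


def assertion_path(project_root: str, session_id: str, category: str) -> Path:
    """Location of one category's write-once assertion JSON (read-only for the agent)."""
    return (Path(project_root) / ".ui-verification" / "sessions"
            / session_id / f"{category}_assertions.json")


def report_path(project_root: str, stamp: str) -> Path:
    """Location of the combined report for one run."""
    return Path(project_root) / ".ui-verification" / "reports" / stamp / "report.md"
```

Passing an absolute project_root keeps these paths aligned with the output_dir handed to the verify_* tools.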

Hard rules every run obeys


Default mode is "both" unless user narrows scope


When the user says "verify [url]", "run verification on [url]", or any unqualified verification request, the run MUST include BOTH visual and flow verification. Do NOT default to visual-only. Only narrow to one mode when the user explicitly requests it ("check styles only", "run flows only") or when the disambiguation table clearly matches a single-mode pattern.
If no .feature files exist, generate them (see references/flow_generation.md). If no design.md exists, generate it (see references/spec_generation.md). Missing artifacts trigger generation, not scope narrowing.

Audit when the integrity ledger triggers


Before calling any verify_* tool, check the integrity ledger (see § "The integrity ledger covers the clean case" below for the trigger conditions). When the audit runs, reconcile each in-scope rule against the inputs (design.md, app source if accessible, running app DOM). This is a best-effort LLM check, not a substring match, because the Compiler is itself LLM-driven and rules can legitimately encode information that isn't a literal substring of design.md.
For each rule ({Name, Selector, Property, Constraint, Scope}), answer three questions:
  1. Intent traceable — does the rule's claim (what's being asserted: a token value, a property/value pair, an element's presence) correspond to something stated or implied by design.md, OR a component definition / theme token in source code, OR an idiom present in the running app?
  2. Constraint reconciles — does the constraint value match what design.md assigns to this element/property combination, OR what source code's theme/token files assign, OR what the running app's component renders at rest? Constraints lifted from the live site WITHOUT a design.md or source backing are contamination.
  3. Selector plausible — does the selector target the element that design.md (or source) describes? Selectors can come from the app (more specific than design.md alone could specify), but the target must match the described element.
Classify each rule as:
  • PASS — all three questions reconcile against at least one input
  • ORPHAN — the rule's claim has no source. design.md doesn't make this assertion; source doesn't define this assignment; the only "evidence" is what the live site happens to render. This is the contamination case.
  • DIVERGENT — the claim IS in design.md (token defined, component referenced) but the rule's constraint contradicts design.md's assignment. E.g. design.md says component X uses token Y, but the rule asserts component X has the value of token Z.
Skip ORPHAN and DIVERGENT rules in the current run; surface them in the report's Audit Findings section (see references/verification_report.md). Verifying them would either pass (silently confirming contamination) or fail (without the right reason). Continue verifying the PASS rules. The user resolves contamination on their own time with three options: drop the rule, upstream the claim into design.md and recompile, or recompile from scratch.
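Once the audit has labeled each rule, the skip-and-surface step is mechanical. The sketch below assumes the labels come from the LLM audit pass described above; the AuditedRule type and function name are hypothetical and do not perform the classification themselves.

```python
from dataclasses import dataclass
from typing import List, Tuple

SKIP_VERDICTS = {"ORPHAN", "DIVERGENT"}


@dataclass(frozen=True)
class AuditedRule:
    name: str
    verdict: str  # "PASS", "ORPHAN", or "DIVERGENT", as decided by the audit
    reason: str = ""


def partition_rules(audited: List[AuditedRule]) -> Tuple[List[str], List[AuditedRule]]:
    """Split audited rules into names to verify and findings to report.

    PASS rules go on to verify_*; ORPHAN and DIVERGENT rules are skipped
    this run and listed under Audit Findings instead.
    """
    to_verify: List[str] = []
    findings: List[AuditedRule] = []
    for rule in audited:
        if rule.verdict == "PASS":
            to_verify.append(rule.name)
        elif rule.verdict in SKIP_VERDICTS:
            findings.append(rule)
        else:
            raise ValueError(f"unknown audit verdict: {rule.verdict!r}")
    return to_verify, findings
```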
When to write the integrity ledger. Single rule: write it when the category files on disk equal what the Compiler would emit from the current design.md right now.
  • Compile finished cleanly, no skipped rules → write.
  • Selector repair or constraint syntax fix completed → write (those repairs ARE what the Compiler would emit now that the original was known to fail).
  • Audit skipped any rules (ORPHAN/DIVERGENT) → don't write. Those rules are still in the file but they're NOT what a fresh compile would emit. Leave the ledger missing/stale so the next run re-audits.
  • Verify-only run, files unchanged → no-op; don't touch the existing ledger.
The origin of selectors (DOM observation in heuristic mode, source code in source-aware mode) does NOT determine ledger eligibility. As long as rules' claims and constraints trace to design.md (the audit verifies this), the ledger reflects a valid Compiler-approved state. See references/spec_sync.md Compilation step 7 for the full case table.
The integrity ledger (.integrity.json) covers the clean case. If the ledger says all hashes match (design.md and every category file), the file state is provably what the Compiler last wrote, and no audit is needed. See references/spec_sync.md § Integrity Ledger.
Audit runs when:
  • Any category file's hash mismatches the ledger (file edited outside Compiler — prior buggy run, hand-edit, partial-write)
  • Ledger is missing (no integrity baseline, run conservatively)
  • User explicitly requests re-audit (manual correctness check; hashes can match even when the rules are wrong, if a Compiler bug wrote bad rules and then updated its own hash)
Skipping the audit when hashes match is a real efficiency improvement for repeat runs. But periodic manual re-audit ("re-audit visual-style") is recommended after any large compile or after suspicious changes.
Audit cost. This is one LLM reasoning pass per scoped category file (or per rule batch — agent's choice). Not free, but bounded: proportional to the rules being verified, no MCP calls. The same kind of reasoning the Compiler used to write the rules; the audit just checks "would I write this rule if I compiled fresh now?"
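The audit trigger conditions above reduce to a hash comparison. The following is a minimal sketch, assuming the ledger is a JSON object that maps file path strings to SHA-256 hex digests; that schema and the function names are hypothetical, and the actual .integrity.json format is defined in references/spec_sync.md.

```python
import hashlib
import json
from pathlib import Path
from typing import Iterable


def sha256_of(path: Path) -> str:
    """Hex SHA-256 digest of a file's bytes."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def audit_needed(ledger_path: Path, tracked: Iterable[Path],
                 user_requested: bool = False) -> bool:
    """True when the audit should run before any verify_* call.

    ASSUMPTION: ledger = {"<file path>": "<sha256 hex>", ...}.
    """
    if user_requested:
        return True  # manual correctness check overrides matching hashes
    if not ledger_path.is_file():
        return True  # no integrity baseline: run conservatively
    recorded = json.loads(ledger_path.read_text(encoding="utf-8"))
    for path in tracked:
        if not path.is_file() or recorded.get(str(path)) != sha256_of(path):
            return True  # edited outside the Compiler, or never recorded
    return False
```

Here tracked would be the category files in scope for the run.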

Assertion JSON is immutable


Files at <output_dir>/.ui-verification/sessions/<session_id>/*_assertions.json are write-once OUTPUT of verify_*. NEVER edit them. No exceptions.
The JSON records what verify_* saw against the live DOM. Don't rewrite values, change pass/fail, add scope, "annotate" findings, or add commentary. If a field seems missing (e.g. scope), the report layer joins it in from the source it came from (e.g. the category file); assertion JSON itself stays exactly as the MCP server wrote it.
If you find yourself opening assertion JSON to fix something, stop: that's the report's job. The agent reads the JSON; the JSON does not change after verify_* writes it.

The 5 category files are reflections of design.md


The 5 compiled .ui-verification/specs/*.md files are derived from design.md. They are NOT a scratch pad, working memory, or place to record observations.
Verification mode (design.md exists):
Edit a category file ONLY when:
  • design.md changed (or chat became a design.md edit) → recompile the affected rules
  • Selector repair: an existing rule's selector returned "selector not found" and you found a working replacement (selector update only — name/property/constraint stay)
Do NOT edit category files to:
  • Capture an observation about the live site (that's the report's job)
  • Add a rule that "documents a divergence" with a constraint that matches the divergent live-site value (this silently encodes site bugs as truth and prevents future detection)
  • Make a failing rule pass by relaxing the constraint
  • Record findings, notes, or context
For partial / scoped verification, pick existing rules from the right category files — don't author new ones unless they're traceable back to a design.md claim that was missed during the prior compile (which is a Compiler bug to surface, not a routine action).
Generation mode (cold-compile from a live site, no design.md yet):
The above rule is RELAXED during generation, because the 5 mds are being seeded for the first time. Generation observes the running app and writes both design.md and the 5 mds in one pass. The constraints in the 5 mds at end-of-generation match the observed DOM values; that is the reverse-engineering contract, not contamination.
The "no recording observations" guard kicks in after generation completes and the user has reviewed design.md. From that point forward, the verification-mode rules above apply: edits go through design.md + recompile, never directly to the 5 mds.
See references/spec_generation.md § Phase 5 for the generation-mode rules. Source code, when accessible during generation, informs names (token names, component names) but NOT values; the DOM is authoritative for values. There is no "source vs DOM divergence" during generation: the DOM is the cascade-resolved outcome of all source CSS, and any apparent disagreement is between one source file the agent read and the same source compiled by the browser.

Each run is independent


Do NOT read prior assertion JSON or report.md from earlier sessions/<session_id>/ or reports/<run-timestamp>/ directories. The only state carried across runs is the compiled specs/*.md files plus the .integrity.json ledger (re-compile is skipped if all ledger hashes match current files). Prior assertions and reports are historical artifacts; they don't inform the current run.
If you find yourself reading a prior session's assertions to "compare," stop: that's cross-session warm-start, which is deferred. Run fresh, write a fresh report.

Flow files at flows/ are the only flow input


The .feature files at <output_dir>/.ui-verification/flows/ are the only input to flow verification. Never compose flows ad-hoc from chat input mid-run; never modify .feature files mid-run. If the user wants to change a scenario, the change goes through the Scribe (see references/flow_sync.md) before the next run.

Flow runs are non-deterministic


Do NOT carry forward prior flow session results across runs. Nova Act re-interprets steps each run, network timing varies, the live UI shifts. Carrying forward "passed" verdicts would mask real flakiness or environmental drift. Every flow runs every time. Flow-side regressions surface via the per-flow status table in the combined report, not a warm-start mechanism.

Every run produces a report


This rule has no exceptions. Whether the user asks to verify a whole site or a single line of design.md, the run isn't done until:
  1. Rules persist on disk: every rule passed to verify_* at verification time (step 6) must already exist in a category file under the right ## Scope: section. (Compile-time selector validation is a separate use of verify_*; see verification.md step 4.)
  2. Scope is joined at report-time, not stamped onto assertions: the report reads BOTH the assertion JSON (for verdicts) and the category file (for scope) and joins them on rule name. Assertion JSON stays exactly as the MCP server wrote it. See references/verification_report.md for the join.
  3. A report is written to <output_dir>/.ui-verification/reports/<run-timestamp>/report.md, with the failure table and any annotated screenshots. See references/verification_report.md for format. Even an all-pass run produces a report.
  4. The user-facing summary links the report, not the assertion JSON. The JSON is intermediate output; the report is the deliverable.
A "quick check" of one or two claims is still a verification run. The same four rules apply.
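The report-time join in requirement 2 can be sketched as a lookup on rule name that leaves the assertion records untouched. The field names ("name", "passed") and the "unknown" fallback are assumptions for illustration; the real assertion schema is whatever the MCP server writes, and the real join is specified in references/verification_report.md.

```python
from typing import Dict, List


def join_scope(assertions: List[dict], rule_scopes: Dict[str, str]) -> List[dict]:
    """Build report rows by joining assertions with scope on rule name.

    ASSUMPTION: each assertion dict carries a 'name' key matching the rule
    name in the category file. Inputs are copied, never mutated, so the
    assertion data stays exactly as it was read from disk.
    """
    rows = []
    for assertion in assertions:
        row = dict(assertion)  # shallow copy; the original record is untouched
        row["scope"] = rule_scopes.get(assertion["name"], "unknown")
        rows.append(row)
    return rows
```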

Workflow


For visual verification tasks, load references/verification.md. For flow verification tasks, load references/flow_verification.md. Both reference docs have a complete decision flow for their mode.
| User intent | Reference |
| --- | --- |
| Verify a live site against a design spec (visual) | references/verification.md |
| Run user flows against a live site | references/flow_verification.md |
| Generate spec from live site (no design.md exists) | references/spec_generation.md |
| Generate flows from a live site (no .feature files exist) | references/flow_generation.md |
| Compile design.md → category files; sync chat edits | references/spec_sync.md |
| Sync user intent → .feature files | references/flow_sync.md |
| Set up MCP server + browser session | references/setup.md |
| Write/edit design spec files | references/spec_authoring.md |
| Write .feature files | references/flow_authoring.md |
| Generate verification report (visual + flow) | references/verification_report.md |
| Annotate failures visually on the page | references/annotate_failures.md |
| Constraint syntax reference | references/constraint_reference.md |
| Per-category translation patterns | references/verify_visual_style.md, references/verify_components.md, references/verify_accessibility.md, references/verify_project_rules.md, references/verify_platform_conventions.md |
| Cross-session warm-start (deferred, not in scope) | references/warm_start.md |
All references live at ./references/<name>.md relative to this SKILL.md file. The absolute path depends on where the skill is installed:
  • Global install: ~/.<agent>/skills/ui-verification/references/<name>.md
  • Workspace install: <project_root>/.<agent>/skills/ui-verification/references/<name>.md
To resolve references, use the directory containing this SKILL.md as the base, NOT the workspace root. If your skill loader's progressive disclosure hasn't surfaced them mid-session, read them directly with the Read tool using the appropriate absolute path; never search the filesystem with find.
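The resolution rule above amounts to one path join against the SKILL.md directory. A minimal sketch, with a hypothetical function name:

```python
from pathlib import Path


def reference_path(skill_md: str, name: str) -> Path:
    """Resolve references/<name>.md against the directory holding SKILL.md.

    The base is the skill's own directory, not the workspace root, so the
    result is the same for global and workspace installs.
    """
    return Path(skill_md).parent / "references" / f"{name}.md"
```

For example, given the path of a globally installed SKILL.md, reference_path(path, "setup") points at references/setup.md next to it, with no filesystem search involved.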

Disambiguation: visual vs flow vs both

Match the user's request to the right mode:

| Phrase pattern | Mode | Action |
| --- | --- | --- |
| "verify design", "check styles", "match the spec", "is it on-brand" | Visual only | Load `references/verification.md` |
| "run flows", "test the user journey", "verify login works" | Flow only | Load `references/flow_verification.md` |
| "verify [url]", "run verification on [url]" with no further qualifier | Both | Visual first, then flow, into one combined report |
| User names a specific `.feature` file or flow ID | Flow only | Load `references/flow_verification.md` and run only that flow |
| User selects text from `design.md` or names a category | Visual only | Load `references/verification.md` and use the partial-selection flow |

When in doubt, ask the user once: "Run visual verification, flow verification, or both?" Don't guess at scope when the request is genuinely ambiguous.
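
The routing in the table above can be sketched as a small keyword matcher. This is only an illustration, not how the skill itself decides: the `Mode` enum, the keyword tuples, and `pick_mode` are hypothetical names, and the keyword lists are simply lifted from the phrase patterns in the table. Anything the sketch cannot place cleanly falls through to asking the user, in line with the "ask once" rule.

```python
from enum import Enum


class Mode(Enum):
    VISUAL = "visual"
    FLOW = "flow"
    BOTH = "both"
    ASK = "ask"  # request is ambiguous: ask the user once


# Hypothetical keyword lists, taken from the phrase patterns in the table.
VISUAL_PHRASES = ("verify design", "check styles", "match the spec", "on-brand", "design.md")
FLOW_PHRASES = ("run flows", "user journey", "verify login works", ".feature")
BOTH_PREFIXES = ("verify ", "run verification on ")


def pick_mode(request: str) -> Mode:
    """Map a user request to a verification mode, or ASK when unclear."""
    text = request.lower().strip()
    visual = any(phrase in text for phrase in VISUAL_PHRASES)
    flow = any(phrase in text for phrase in FLOW_PHRASES)

    if visual and flow:
        return Mode.ASK  # mixed signals: do not guess at scope
    if visual:
        return Mode.VISUAL
    if flow:
        return Mode.FLOW
    # A bare "verify <url>" with no further qualifier runs both modes.
    if text.startswith(BOTH_PREFIXES) and ("http://" in text or "https://" in text):
        return Mode.BOTH
    return Mode.ASK
```

For example, `pick_mode("verify https://example.com")` yields `Mode.BOTH`, while a request that names a `.feature` file yields `Mode.FLOW`. A real agent would weigh context rather than substrings; the sketch only makes the precedence explicit.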

Where this skill lives

This skill is at `<output_dir>/.<agent>/skills/ui-verification/` for workspace-local installs, or at `~/.<agent>/skills/ui-verification/` for global installs; wherever the skill loader picked it up from is its installed location. Do NOT search the filesystem for it: no `find ~/.<agent>`, no `find /`. Activation is the runtime's job; if you've reached this SKILL.md, the runtime already knows where you are.

Resolving the references directory

The `references/` folder is always co-located with this SKILL.md file, not with the workspace or output directory. Use the path that the runtime used to load this file as the base:

| Install type | SKILL.md location | References at |
| --- | --- | --- |
| Global | `~/.<agent>/skills/ui-verification/SKILL.md` | `~/.<agent>/skills/ui-verification/references/` |
| Workspace | `<project>/.<agent>/skills/ui-verification/SKILL.md` | `<project>/.<agent>/skills/ui-verification/references/` |

When reading a reference, construct the absolute path from the skill's install location. Example for a global install:
~/.<agent>/skills/ui-verification/references/verification.md
~/.<agent>/skills/ui-verification/references/spec_sync.md
Do NOT assume references are at `<output_dir>/.<agent>/skills/ui-verification/references/` when the skill was loaded from the global location; the workspace may not have a copy.
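
The rule above (base every reference path on the directory that holds SKILL.md) can be sketched in a few lines. This is a minimal illustration, assuming the runtime has already told you the path it loaded SKILL.md from; `resolve_reference` is a hypothetical helper, not part of the skill or the MCP server.

```python
from pathlib import Path


def resolve_reference(skill_md_path: str, name: str) -> Path:
    """Return the absolute path of references/<name>.md next to SKILL.md.

    skill_md_path is the path the runtime used to load SKILL.md (assumed known).
    The workspace root and output directory are deliberately never consulted.
    """
    skill_dir = Path(skill_md_path).expanduser().resolve().parent
    return skill_dir / "references" / f"{name}.md"
```

Calling it with the global SKILL.md location and the name `spec_sync` produces a path ending in `skills/ui-verification/references/spec_sync.md`. Because the base is derived from the loaded file, the same call works unchanged for a workspace install.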
If you're a fresh agent on a new turn and you don't immediately have a tool from `nova-act-mcp` available, the MCP server may still be starting. Wait for the runtime to surface it on the next user turn rather than searching the filesystem to "find" the skill yourself. The skill is already loaded; the tools are not always synchronously available with skill activation.

Don't search for tool implementations

Never use `find` to locate the MCP server source code, the constraint engine source, or any other tool implementation. The behavior of `verify_*`, the constraint syntax, and the selector matching algorithm are all documented in `references/constraint_reference.md` and the per-category deep-dives. If a constraint or property behaves unexpectedly during a run, read the reference, not the implementation. The references are the agent-facing documentation of record; reaching for `find` to spelunk the engine is a sign the reference needs an update, which the user can address. In-session, work from documented behavior.