ui-verification
Overview
UI Verification covers two parallel modes against a live web app:
- Visual verification: checks whether the page matches its design specification. It translates design claims into CSS rule checks and runs them against the live DOM via `getComputedStyle()`. Deterministic; the browser's computed styles are the source of truth.
- Flow verification: checks whether user journeys complete correctly. It executes Gherkin scenarios from `.feature` files via Nova Act's `act()` (actions) and `act_get()` (assertions). Non-deterministic; results vary run-to-run with network timing and live UI shifts.
Both modes share the skill, the MCP server, and the browser session. A single run can produce both kinds of output, combined into one report. Or either mode can run alone; the other section is omitted from the combined report.
Each run produces:
- Structured artifacts — per-category JSON for visual; per-flow execution data for flows
- Annotated screenshots — red bounding boxes highlighting visual failures on the page
- Verification report — markdown combining a visual summary, a flow summary (per-flow status table), and links to per-flow detail reports
The verify_* tools are deterministic — no vision model, browser's computed styles are the source of truth. The compile and audit passes for visual are LLM-driven (best-effort), reconciling design intent against the app's actual structure. Flow steps are interpreted by Nova Act each run; flow runs are inherently non-deterministic.
Reconciliation inputs
Visual verification. The 5 compiled category files (`.ui-verification/specs/*.md`) are reconciled from up to three inputs:
| Input | What it provides | Required? |
|---|---|---|
| `design.md` | Free-form design intent: tokens, prose, design language, component definitions | Required |
| The running app | Live DOM observed through the browser session | Required for verification (informs selectors and validates rules) |
| App source code | Components, theme/tokens, CSS files (implementation truth) | Optional. When accessible, makes selectors deterministic and divergence classification more precise |
Flow verification. Flows are authored or generated as `.feature` files at `.ui-verification/flows/`:
| Input | What it provides | Required? |
|---|---|---|
| `.feature` files | Gherkin scenarios with metadata header (flow ID, type, app URL, optional auth and cleanup) | Required for flow verification |
| The running app | Target of the scenarios; Nova Act executes against the live URL | Required |
| Auth credentials | Provided when a flow's metadata header declares auth | Conditional |
When source code is accessible (the user is a developer working on their own app), the agent should detect it and use it during compile, audit, and spec generation. Source can live at the project root (`package.json` + `src/` at `output_dir`) OR one level deeper (e.g. `output_dir/<package-name>/package.json`, a workspace layout where the verifier opens at the workspace root and one or more app packages live as subdirs). Sniff both. When source is not available (verifying an external site, black-box check), the skill operates in DOM-only mode: selectors are best-effort, and the audit relies on design.md + DOM only.
The 5 category files grow as the app develops:
- design.md grows via Scribe (user expressing design intent in chat → design.md edits)
- 5 mds grow as the Compiler discovers more verifiable surfaces in the app and source
Both layers can grow, but only via their own author paths. The 5 mds are NEVER edited to record observations or freeze divergent live-site values — see hard rules below.
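The "sniff both" source check described above can be sketched roughly as follows. This is a minimal illustration, not the skill's actual detection logic: the function name `find_source_roots` is hypothetical, and it treats the presence of `package.json` alone as the signal (it does not also require `src/`).

```python
from pathlib import Path


def find_source_roots(output_dir: Path) -> list[Path]:
    """Return directories that look like app source roots.

    Checks output_dir itself first, then each immediate subdirectory,
    for a package.json file. An empty result suggests DOM-only mode.
    (Hypothetical helper; package.json as the only signal is an assumption.)
    """
    candidates = [output_dir]
    candidates += sorted(p for p in output_dir.iterdir() if p.is_dir())
    return [d for d in candidates if (d / "package.json").is_file()]
```

An empty list would correspond to the black-box case, where selectors stay best-effort.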
Three names — don't conflate them
| Name | What it identifies |
|---|---|
| `ui-verification` | This skill (the agent's playbook) |
| `nova-act-mcp` | The MCP server providing browser + verify_* tools |
| `.ui-verification/` | The artifact directory at the project root (compiled specs, assertion JSON, reports) |
These are unrelated despite words overlapping. The skill does NOT live inside the artifact dir; the artifact dir is at the project root, not inside the skill.
Capabilities
| Capability | Tool | Source File | What It Checks |
|---|---|---|---|
| Visual Style | `verify_visual_style` | `visual-style.md` | Colors, typography, spacing, radii, shadows |
| Components | `verify_components` | `component-rules.md` | Component presence, variants, props |
| Accessibility | `verify_accessibility` | `accessibility.md` | ARIA roles, landmarks, heading hierarchy |
| Project Rules | `verify_project_rules` | `project-rules.md` | Layout structure, spacing system, conventions |
| Platform Conventions | `verify_platform_conventions` | `platform-conventions.md` | Navigation patterns, page structure |
| User Flows | `act()` / `act_get()` | `flows/*.feature` | End-to-end user journeys, functional correctness |
Visual rules can be route-scoped: each category file may contain `## Scope: any` (default) and `## Scope: route=<glob>` sections. See `references/spec_authoring.md`. Flow scenarios target the URL declared in their `# app:` metadata; route scoping is per-flow rather than per-rule.
Available MCP Tools (19 total)
Session Management
- `start_browse(url, intent, browser_mode)`: open a URL, get a `session_id`. Use `browser_mode="local"` for verification.
- `session_close(session_id)`: terminate browser session
- `session_list()`: list active sessions
Verification: Visual only (all require `session_id` and `rules` JSON; ALWAYS pass `output_dir` = absolute path to project root)
- `verify_visual_style(session_id, rules, output_dir)`
- `verify_components(session_id, rules, output_dir)`
- `verify_accessibility(session_id, rules, output_dir)`
- `verify_project_rules(session_id, rules, output_dir)`
- `verify_platform_conventions(session_id, rules, output_dir)`
These are the visual-mode verification tools. For flow verification, `act()` and `act_get()` (below) are the primary drivers; `verify_*` doesn't apply to Gherkin steps.
If `output_dir` is omitted, the server writes assertions to a temp dir under `/tmp/` and downstream report/annotation steps can't find them.
Browser Interaction (all require `session_id`)
- `navigate(session_id, url)`: go to URL
- `click(session_id, selector)`: click element
- `scroll(session_id, direction, selector?)`: scroll up/down
- `hover(session_id, selector)`: hover element
- `press_key(session_id, key)`: keyboard input
- `type_text(session_id, selector, text, clear_first?)`: type into input
Content & Capture (all require `session_id`)
- `evaluate_js(session_id, script)`: run JavaScript in page context
- `get_page_content(session_id, format?)`: page as `"text"` or `"html"`
- `screenshot(session_id, destination?)`: capture viewport to file path
Natural Language (all require `session_id`)
- `act(session_id, prompt)`: instruct browser actions (scroll, click, navigate, fill forms). For flow verification, this is the primary driver of `Given` and `When` steps. For visual verification, do NOT use `act()` for CSS checks; use `verify_*` instead.
- `act_get(session_id, prompt, schema?)`: structured data extraction or state verification. For flow verification, this is the primary driver of `Then` and `And` (after `Then`) assertions; supplement with `evaluate_js` for deterministic checks. For visual verification, do NOT use `act_get()` for CSS checks: perception/reasoning over the page is the agent's job using `screenshot` + `get_page_content`, and CSS verdicts come from `verify_*`.
Artifact Structure
    <project_root>/
      visual/design.md                     ← visual source spec (or .ui-verification/design.md)
      .ui-verification/
        .integrity.json                    ← compile-state ledger (visual only, see spec_sync.md)
        specs/                             ← compiled visual category files (INPUT to verify_*)
          visual-style.md                  (clean markdown; integrity tracked in .integrity.json)
          component-rules.md
          accessibility.md
          project-rules.md
          platform-conventions.md
        flows/                             ← flow .feature files (INPUT to act() / act_get())
          <flow-name>.feature
        sessions/                          ← per-session output (MCP-owned)
          <session_id>/
            <category>_assertions.json     (visual assertion JSON, write-once)
        reports/                           ← per-run output (skill-owned)
          <YYYYMMDD-HHmmssZ>/              ← UTC run-timestamp (a run can span multiple sessions)
            report.md                      ← combined visual + flow summary
            screenshots/                   ← visual annotated failures
            flow-reports/                  ← per-flow reports
              <flow-name>.report.md
            sessions.json                  ← manifest of session IDs in this run
Hard rules every run obeys
Default mode is "both" unless user narrows scope
When the user says "verify [url]", "run verification on [url]", or any unqualified verification request, the run MUST include BOTH visual and flow verification. Do NOT default to visual-only. Only narrow to one mode when the user explicitly requests it ("check styles only", "run flows only") or when the disambiguation table clearly matches a single-mode pattern.
If no `.feature` files exist, generate them (see `references/flow_generation.md`). If no `design.md` exists, generate it (see `references/spec_generation.md`). Missing artifacts trigger generation, not scope narrowing.
Audit when the integrity ledger triggers
Before calling any `verify_*` tool, check the integrity ledger (see § "The integrity ledger covers the clean case" below for the trigger conditions). When the audit runs, reconcile each in-scope rule against the inputs (design.md, app source if accessible, running app DOM). This is a best-effort LLM check, not a substring match, because the Compiler is itself LLM-driven and rules can legitimately encode information that isn't a literal substring of design.md.
For each rule (`{Name, Selector, Property, Constraint, Scope}`), answer three questions:
- Intent traceable: does the rule's claim (what's being asserted: a token value, a property/value pair, an element's presence) correspond to something stated or implied by design.md, OR a component definition / theme token in source code, OR an idiom present in the running app?
- Constraint reconciles: does the constraint value match what design.md assigns to this element/property combination, OR what source code's theme/token files assign, OR what the running app's component renders at rest? Constraints lifted from the live site WITHOUT a design.md or source backing are contamination.
- Selector plausible: does the selector target the element that design.md (or source) describes? Selectors can come from the app (more specific than design.md alone could specify), but the target must match the described element.
Classify each rule as:
- PASS — all three questions reconcile against at least one input
- ORPHAN — the rule's claim has no source. design.md doesn't make this assertion; source doesn't define this assignment; the only "evidence" is what the live site happens to render. This is the contamination case.
- DIVERGENT — the claim IS in design.md (token defined, component referenced) but the rule's constraint contradicts design.md's assignment. E.g. design.md says component X uses token Y, but the rule asserts component X has the value of token Z.
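As a rough illustration, the three answers could be mapped to a verdict like this. The answers themselves come from the LLM audit pass; the names `AuditAnswers` and `classify` are hypothetical, and the handling of a rule whose selector alone fails is an assumption, since the categories above do not name that case.

```python
from dataclasses import dataclass
from enum import Enum


class Verdict(Enum):
    PASS = "PASS"
    ORPHAN = "ORPHAN"
    DIVERGENT = "DIVERGENT"


@dataclass(frozen=True)
class AuditAnswers:
    """Answers to the three audit questions for one rule (hypothetical shape)."""
    intent_traceable: bool
    constraint_reconciles: bool
    selector_plausible: bool


def classify(answers: AuditAnswers) -> Verdict:
    # No input backs the claim at all: the contamination case.
    if not answers.intent_traceable:
        return Verdict.ORPHAN
    # The claim is backed, but the constraint contradicts its source.
    if not answers.constraint_reconciles:
        return Verdict.DIVERGENT
    # ASSUMPTION: a selector aimed at the wrong element is held back as
    # DIVERGENT here; the text above does not assign this case a category.
    if not answers.selector_plausible:
        return Verdict.DIVERGENT
    return Verdict.PASS
```

Only rules that come back as `Verdict.PASS` would go on to the `verify_*` calls.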
Skip ORPHAN and DIVERGENT rules in the current run; surface them in the report's Audit Findings section (see `references/verification_report.md`). Verifying them would either pass (silently confirming contamination) or fail (without the right reason). Continue verifying the PASS rules. The user resolves contamination on their own time with three options: drop the rule, upstream the claim into design.md and recompile, or recompile from scratch.
When to write the integrity ledger. Single rule: write it when the category files on disk equal what the Compiler would emit from the current `design.md` right now.
- Compile finished cleanly, no skipped rules → write.
- Selector repair or constraint syntax fix completed → write (those repairs ARE what the Compiler would emit now that the original was known to fail).
- Audit skipped any rules (ORPHAN/DIVERGENT) → don't write. Those rules are still in the file but they're NOT what a fresh compile would emit. Leave the ledger missing/stale so the next run re-audits.
- Verify-only run, files unchanged → no-op; don't touch the existing ledger.
The origin of selectors (DOM observation in heuristic mode, source code in source-aware mode) does NOT determine ledger eligibility. As long as rules' claims and constraints trace to design.md (the audit verifies this), the ledger reflects a valid Compiler-approved state. See `references/spec_sync.md` Compilation step 7 for the full case table.
The integrity ledger (`.integrity.json`) covers the clean case. If the ledger says all hashes match (`design.md` and every category file), the file state is provably what the Compiler last wrote, and no audit is needed. See `references/spec_sync.md` § Integrity Ledger.
Audit runs when:
- Any category file's hash mismatches the ledger (file edited outside Compiler — prior buggy run, hand-edit, partial-write)
- Ledger is missing (no integrity baseline, run conservatively)
- User explicitly requests re-audit (manual correctness check; hashes can be stale even when valid if a Compiler bug wrote bad rules and updated its own hash)
Skip-audit when hashes match is a real efficiency improvement for repeat runs. But periodic manual re-audit ("re-audit visual-style") is recommended after any large compile or after suspicious changes.
Audit cost. This is one LLM reasoning pass per scoped category file (or per rule batch — agent's choice). Not free, but bounded: proportional to the rules being verified, no MCP calls. The same kind of reasoning the Compiler used to write the rules; the audit just checks "would I write this rule if I compiled fresh now?"
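The trigger conditions above amount to a hash comparison, sketched below. The real `.integrity.json` schema is defined in `references/spec_sync.md` and is not reproduced here; this sketch assumes, purely for illustration, a flat JSON object mapping file names to SHA-256 hex digests. The names `sha256_of` and `audit_needed` are hypothetical.

```python
import hashlib
import json
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Hex SHA-256 digest of a file's bytes."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def audit_needed(ledger_path: Path, tracked: list[Path],
                 user_requested: bool = False) -> bool:
    """Decide whether the audit pass should run before verification.

    ASSUMPTION: the ledger is a flat {"<file name>": "<sha256 hex>"} map.
    """
    if user_requested:
        return True                      # explicit re-audit request
    if not ledger_path.is_file():
        return True                      # no integrity baseline
    ledger = json.loads(ledger_path.read_text())
    for f in tracked:
        # A missing file or a hash mismatch means the file state is not
        # provably what the Compiler last wrote.
        if not f.is_file() or ledger.get(f.name) != sha256_of(f):
            return True
    return False
```

When this returns `False`, the run can go straight to verification; otherwise the per-rule audit comes first.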
Assertion JSON is immutable
Files at `<output_dir>/.ui-verification/sessions/<session_id>/*_assertions.json` are write-once OUTPUT of `verify_*`. NEVER edit them. No exceptions.
The JSON records what `verify_*` saw against the live DOM. Don't rewrite values, change pass/fail, add scope, "annotate" findings, or add commentary. If a field seems missing (e.g. scope), the report layer joins it in from the source it came from (e.g. the category file); assertion JSON itself stays exactly as the MCP server wrote it.
If you find yourself opening assertion JSON to fix something, stop: that's the report's job. The agent reads the JSON; the JSON does not change after `verify_*` writes it.
The 5 category files are reflections of design.md
The 5 compiled `.ui-verification/specs/*.md` files are derived from `design.md`. They are NOT a scratch pad, working memory, or place to record observations.
Verification mode (design.md exists):
Edit a category file ONLY when:
- `design.md` changed (or chat became a design.md edit) → recompile the affected rules
- Selector repair: an existing rule's selector returned "selector not found" and you found a working replacement (selector update only; name/property/constraint stay)
Do NOT edit category files to:
- Capture an observation about the live site (that's the report's job)
- Add a rule that "documents a divergence" with a constraint that matches the divergent live-site value (this silently encodes site bugs as truth and prevents future detection)
- Make a failing rule pass by relaxing the constraint
- Record findings, notes, or context
For partial / scoped verification, pick existing rules from the right category files — don't author new ones unless they're traceable back to a design.md claim that was missed during the prior compile (which is a Compiler bug to surface, not a routine action).
Generation mode (cold-compile from a live site, no `design.md` yet):
The above rule is RELAXED during generation, because the 5 mds are being seeded for the first time. Generation observes the running app and writes both `design.md` and the 5 mds in one pass. The constraints in the 5 mds at end-of-generation match the observed DOM values; that is the reverse-engineering contract, not contamination.
The "no recording observations" guard kicks in after generation completes and the user has reviewed `design.md`. From that point forward, the verification-mode rules above apply: edits go through `design.md` + recompile, never directly to the 5 mds.
See `references/spec_generation.md` § Phase 5 for the generation-mode rules. Source code, when accessible during generation, informs names (token names, component names) but NOT values; the DOM is authoritative for values. There is no "source vs DOM divergence" during generation: the DOM is the cascade-resolved outcome of all source CSS, and any apparent disagreement is between one source file the agent read and the same source compiled by the browser.
Each run is independent
Do NOT read prior assertion JSON or `report.md` from earlier `sessions/<session_id>/` or `reports/<run-timestamp>/` directories. The only state carried across runs is the compiled `specs/*.md` files plus the `.integrity.json` ledger (re-compile is skipped if all ledger hashes match current files). Prior assertions and reports are historical artifacts; they don't inform the current run.
If you find yourself reading a prior session's assertions to "compare," stop: that's cross-session warm-start, which is deferred. Run fresh, write a fresh report.
Flow files at `flows/` are the only flow input
The `.feature` files at `<output_dir>/.ui-verification/flows/` are the only input to flow verification. Never compose flows ad-hoc from chat input mid-run; never modify `.feature` files mid-run. If the user wants to change a scenario, the change goes through the Scribe (see `references/flow_sync.md`) before the next run.
Flow runs are non-deterministic
Do NOT carry forward prior flow session results across runs. Nova Act re-interprets steps each run, network timing varies, the live UI shifts. Carrying forward "passed" verdicts would mask real flakiness or environmental drift. Every flow runs every time. Flow-side regressions surface via the per-flow status table in the combined report, not a warm-start mechanism.
Every run produces a report
This rule has no exceptions. Whether the user asks to verify a whole site or a single line of `design.md`, the run isn't done until:
- Rules persist on disk: every rule passed to verify_* at verification time (step 6) must already exist in a category file under the right `## Scope:` section. (Compile-time selector validation is a separate use of verify_*; see verification.md step 4.)
- Scope is joined at report-time, not stamped onto assertions: the report reads BOTH the assertion JSON (for verdicts) and the category file (for scope) and joins them on rule name. Assertion JSON stays exactly as the MCP server wrote it. See `references/verification_report.md` for the join.
- A report is written: `<output_dir>/.ui-verification/reports/<run-timestamp>/report.md`, with the failure table and any annotated screenshots. See `references/verification_report.md` for format. Even an all-pass run produces a report.
- The user-facing summary links the report, not the assertion JSON. The JSON is intermediate output; the report is the deliverable.
A "quick check" of one or two claims is still a verification run. The same four rules apply.
Workflow
For visual verification tasks, load `references/verification.md`. For flow verification tasks, load `references/flow_verification.md`. Both reference docs have a complete decision flow for their mode.
| User intent | Reference |
|---|---|
| Verify a live site against a design spec (visual) | `references/verification.md` |
| Run user flows against a live site | `references/flow_verification.md` |
| Generate spec from live site (no design.md exists) | `references/spec_generation.md` |
| Generate flows from a live site (no .feature files exist) | `references/flow_generation.md` |
| Compile design.md → category files; sync chat edits | `references/spec_sync.md` |
| Sync user intent → `.feature` files | `references/flow_sync.md` |
| Set up MCP server + browser session | |
| Write/edit design spec files | `references/spec_authoring.md` |
| Write `.feature` files | |
| Generate verification report (visual + flow) | `references/verification_report.md` |
| Annotate failures visually on the page | |
| Constraint syntax reference | `references/constraint_reference.md` |
| Per-category translation patterns | |
| Cross-session warm-start (deferred, not in scope) | |
All references live at `./references/<name>.md` relative to this SKILL.md file. The absolute path depends on where the skill is installed:
- Global install: `~/.<agent>/skills/ui-verification/references/<name>.md`
- Workspace install: `<project_root>/.<agent>/skills/ui-verification/references/<name>.md`
To resolve references, use the directory containing this SKILL.md as the base, NOT the workspace root. If your skill loader's progressive disclosure hasn't surfaced them mid-session, read them directly with the Read tool using the appropriate absolute path; never search the filesystem with `find`.
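Resolving a reference relative to SKILL.md, rather than the workspace root, is plain path arithmetic. A minimal sketch, assuming the runtime exposes the path it loaded SKILL.md from; the helper name `resolve_reference` is hypothetical:

```python
from pathlib import Path


def resolve_reference(skill_md: Path, name: str) -> Path:
    """Absolute path of references/<name>.md next to the loaded SKILL.md.

    No filesystem search is involved: the base is the directory that
    contains SKILL.md, wherever the skill was installed.
    """
    base = skill_md.expanduser().resolve().parent
    return base / "references" / f"{name}.md"
```

The same call works for global and workspace installs, because only the location of SKILL.md matters.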
Disambiguation: visual vs flow vs both
Match the user's request to the right mode:
| Phrase pattern | Mode | Action |
|---|---|---|
| "verify design", "check styles", "match the spec", "is it on-brand" | Visual only | Load `references/verification.md` |
| "run flows", "test the user journey", "verify login works" | Flow only | Load `references/flow_verification.md` |
| "verify [url]", "run verification on [url]" with no further qualifier | Both | Visual first, then flow, into one combined report |
| User names a specific `.feature` file | Flow only | Load `references/flow_verification.md` |
| User selects text from `design.md` | Visual only | Load `references/verification.md` |
When in doubt, ask the user once: "Run visual verification, flow verification, or both?" Don't guess at scope when the request is genuinely ambiguous.
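The table can be read as a small decision procedure. The keyword matcher below is only a toy illustration of that reading: in practice the agent interprets the request with judgment, and the phrase lists, the `pick_mode` name, and the string return values are all assumptions.

```python
VISUAL_HINTS = ("verify design", "check styles", "match the spec", "on-brand")
FLOW_HINTS = ("run flows", "user journey", "verify login works", ".feature")
GENERIC_HINTS = ("verify", "run verification")


def pick_mode(request: str) -> str:
    """Map a user request to "visual", "flow", "both", or "ask"."""
    text = request.lower()
    visual = any(hint in text for hint in VISUAL_HINTS)
    flow = any(hint in text for hint in FLOW_HINTS)
    if visual and flow:
        return "both"
    if visual:
        return "visual"
    if flow:
        return "flow"
    # An unqualified verification request defaults to both modes.
    if any(hint in text for hint in GENERIC_HINTS):
        return "both"
    # Nothing recognizable: ask the user once rather than guess.
    return "ask"
```

For example, `pick_mode("verify https://example.com")` falls through to the unqualified case and selects both modes.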
Where this skill lives
This skill is at `<output_dir>/.<agent>/skills/ui-verification/` for workspace-local installs, OR at `~/.<agent>/skills/ui-verification/` for global installs; wherever the skill loader picked it up from is its installed location. Do NOT search the filesystem for it. No `find ~/.<agent>`, no `find /`. Activation is the runtime's job; if you've reached this SKILL.md, the runtime already knows where you are.
Resolving the references directory
The `references/` folder is always co-located with this SKILL.md file, not with the workspace or output directory. Use the path that the runtime used to load this file as the base:
| Install type | SKILL.md location | References at |
|---|---|---|
| Global | `~/.<agent>/skills/ui-verification/SKILL.md` | `~/.<agent>/skills/ui-verification/references/` |
| Workspace | `<output_dir>/.<agent>/skills/ui-verification/SKILL.md` | `<output_dir>/.<agent>/skills/ui-verification/references/` |
When reading a reference, construct the absolute path from the skill's install location. Example for a global install:
    ~/.<agent>/skills/ui-verification/references/verification.md
    ~/.<agent>/skills/ui-verification/references/spec_sync.md
Do NOT assume references are at `<output_dir>/.<agent>/skills/ui-verification/references/` when the skill was loaded from the global location; the workspace may not have a copy.
If you're a fresh agent on a new turn and you don't immediately have a tool from `nova-act-mcp` available, the MCP server may still be starting. Wait for the runtime to surface it on the next user turn rather than searching the filesystem to "find" the skill yourself. The skill is already loaded; the tools are not always synchronously available with skill activation.
Don't search for tool implementations
Never run `find` to locate the MCP server source code, the constraint engine source, or any other tool implementation. The behavior of `verify_*`, the constraint syntax, the selector matching algorithm: all of this is documented in `references/constraint_reference.md` and the per-category deep-dives. If a constraint or property behaves unexpectedly during a run, read the reference, not the implementation. The references are the agent-facing documentation of record; reaching for `find` to spelunk the engine is a sign the reference needs an update, which the user can address. In-session, work from documented behavior.