Testing Guidelines
Testing Doctrine
Two types of tests are preferred:
- True integration tests: use real runtimes, real filesystems, real network calls. No mocks, stubs, or fakes. These prove the system works end-to-end.
- Unit tests on pure/isolated logic: test pure functions or well-isolated modules where inputs and outputs are clear. No mocks are needed because the code has no external dependencies.
Unit tests are colocated with the source they test as `.test.ts[x]` files.
Integration tests are located in `tests/`, with these primary harnesses:
- `tests/ipc`: tests that rely on the IPC and are focused on ensuring backend behavior.
- `tests/ui`: frontend integration tests that use the real IPC and happy-dom full-app rendering.
- `tests/e2e`: end-to-end tests using Playwright, needed to verify browser behavior that can't be easily tested with happy-dom.
Additionally, we have stories in `src/browser/stories` that are primarily used for human visual
verification of UI changes.
Mocking
Avoid mock-heavy tests that verify implementation details rather than behavior.
If you need mocks to test something, consider whether the code should be restructured to be more testable.
There is at least one exception to this rule: we have a `mockAiRouter` that can be used to simulate
LLM responses. Broadly, the use of LLMs in tests follows these rules:
- Use a real LLM for tests that verify our integration with the LLM provider
  - E.g., asserting that we correctly identify a context exceeded error
- Use a `mockAiRouter` for tests that verify behavior around the LLM logic
  - E.g., asserting that messages queue correctly following an LLM response
- Do not use a real LLM or `mockAiRouter` for logic that in no way touches agentic behavior.
  - E.g., a test that shows that opening a Terminal window works
Avoid tautological tests (simple mappings, identical copies of implementation); focus on invariants and boundary failures.
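As an illustration of testing invariants and boundaries rather than restating the implementation, here is a minimal sketch. The `truncateMiddle` helper and the `check` function are hypothetical and exist only for this example; in the repo the same assertions would live in a colocated `.test.ts` file and use the test runner's own matchers.

```typescript
// Hypothetical pure helper (not from the repo): shorten text to at most
// maxLen characters, keeping the start and end and marking the cut with "...".
function truncateMiddle(text: string, maxLen: number): string {
  if (maxLen < 0) throw new RangeError("maxLen must be non-negative");
  if (text.length <= maxLen) return text;
  if (maxLen <= 3) return text.slice(0, maxLen); // no room for the marker
  const keep = maxLen - 3;
  const head = Math.ceil(keep / 2);
  const tail = keep - head;
  return text.slice(0, head) + "..." + text.slice(text.length - tail);
}

// Stand-in for a test runner assertion, so the sketch has no dependencies.
function check(condition: boolean, message: string): void {
  if (!condition) throw new Error(message);
}

// Invariants: the result never exceeds maxLen, and short input is unchanged.
for (const maxLen of [0, 1, 3, 4, 5, 10, 100]) {
  for (const input of ["", "a", "abcdef", "x".repeat(50)]) {
    const out = truncateMiddle(input, maxLen);
    check(out.length <= maxLen, `too long for maxLen=${maxLen}`);
    check(input.length > maxLen || out === input, "short input was modified");
  }
}

// Boundary failure: invalid limits are rejected instead of silently accepted.
let rejected = false;
try {
  truncateMiddle("abcdef", -1);
} catch (error) {
  rejected = error instanceof RangeError;
}
check(rejected, "negative maxLen should throw RangeError");
```

A tautological version of this test would recompute the expected string with the same slicing logic; the checks above stay valid even if the implementation is rewritten.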
When To Test
Ideally, all new features and bug fixes are well tested. Do not implement a feature or fix without a
robust testing strategy.
When fixing bugs, always start with the test (practice TDD). Reproduce the bug in the test, then fix
the production code, then verify the test passes.
Test Runtime
All tests in `tests/` are run under `bun x jest` with `TEST_INTEGRATION=1` set.
Otherwise, tests that live in `src/` run under `bun test` (generally these are unit tests).
Runtime & Checks
- Never kill the running Mux process; rely on the following for local validation:
  - `make typecheck`
  - `make static-check` (includes typecheck, lint, fmt-check, and docs link validation)
  - targeted test invocations (e.g. `bun x jest tests/ipc/sendMessage.test.ts -t "pattern"`)
- Only wait on CI to pass when local, targeted checks pass.
- Prefer surgical test invocations over running full suites.
- Keep utils pure or parameterize external effects for easier testing.
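The last point, parameterizing external effects, can be sketched as follows. The `formatAge` util and the `Clock` type are hypothetical names chosen for this example; the idea is only that the clock is passed in, so a test can fix the time without mocks.

```typescript
// Hypothetical util (not from the repo): describes how long ago a timestamp
// occurred. The clock is a parameter, so tests never depend on the real time.
type Clock = () => number;

function formatAge(timestampMs: number, now: Clock = Date.now): string {
  // Clamp to zero so timestamps slightly in the future do not go negative.
  const elapsedSec = Math.max(0, Math.floor((now() - timestampMs) / 1000));
  if (elapsedSec < 60) return `${elapsedSec}s ago`;
  if (elapsedSec < 3600) return `${Math.floor(elapsedSec / 60)}m ago`;
  return `${Math.floor(elapsedSec / 3600)}h ago`;
}

// In a test, pass a fixed clock instead of mocking Date.
const fixedNow: Clock = () => 1_000_000;
const justUnderAMinute = formatAge(1_000_000 - 59_000, fixedNow);
const exactlyAMinute = formatAge(1_000_000 - 60_000, fixedNow);
```

Production callers omit the second argument and get the real clock; tests supply a constant and can assert on exact boundaries such as the 59 to 60 second transition.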
Coverage Expectations
- Prefer fixes that simplify existing code; such simplifications often do not need new tests.
- When adding complexity, add or extend tests. If coverage requires new infrastructure, propose the harness and then add the tests there.
- When asked for TDD, write real repo tests (no `/tmp` scripts) and commit them.
- Pull complex logic into easily tested utils. Target broad coverage with minimal cases that prove the feature matters.
Storybook
- Ensure all UI changes are captured by at least one story.
  - Many changes will already be captured by an existing story.
- Only use full-app stories (`App.*.stories.tsx`). Do not add isolated component stories, even for small UI changes (they are not used/accepted in this repo).
- Use play functions with `@storybook/test` utilities (`within`, `userEvent`, `waitFor`) to interact with the UI and set up the desired visual state. Do not add props to production components solely for storybook convenience.
- Keep story data deterministic: avoid `Math.random()`, `Date.now()`, or other non-deterministic values in story setup. Pass explicit values when ordering or timing matters for visual stability.
- Scroll stabilization: After async operations that change element sizes (Shiki highlighting, Mermaid rendering, tool expansion), wait for `useAutoScroll`'s ResizeObserver RAF to complete. Use double-RAF: `await new Promise(r => requestAnimationFrame(() => requestAnimationFrame(r)))`.
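The double-RAF wait from the scroll stabilization rule can be wrapped in a small helper. This is a sketch under assumptions: `waitForDoubleRaf` and `RequestFrame` are hypothetical names, and the frame scheduler is passed in as a parameter so the helper can also be exercised outside a browser.

```typescript
// Structural type matching the shape of requestAnimationFrame.
type RequestFrame = (callback: (time: number) => void) => number;

// Resolves after two animation frames have fired: the first lets a pending
// frame callback run, the second fires after that callback's work is done.
function waitForDoubleRaf(requestFrame: RequestFrame): Promise<void> {
  return new Promise<void>((resolve) => {
    requestFrame(() => {
      requestFrame(() => resolve());
    });
  });
}

// In a story's play function this would be used roughly as:
//   await waitForDoubleRaf((cb) => requestAnimationFrame(cb));
```

Passing the scheduler explicitly keeps the helper free of global browser state, in line with the guideline to parameterize external effects.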
UI Tests (`tests/ui`)
- Tests in `tests/ui` must render the full app via `AppLoader` and drive interactions from the user's perspective (clicking, typing, navigating).
- Use the `renderReviewPanel()` helper or similar patterns that render `<AppLoader client={apiClient} />`.
- Never test isolated components or utility functions here; those belong as unit tests beside the implementation (`*.test.ts`).
- Never call backend APIs directly (e.g., `env.orpc.workspace.remove()`) to trigger actions that you're testing; always simulate the user action (click the delete button, etc.). Calling the API bypasses frontend logic like navigation, state updates, and error handling, which is often where bugs hide.
  - Backend API calls are fine for setup/teardown or to avoid expensive operations.
  - Consider moving the test to `tests/ipc` if backend logic needs granular testing.
- Never bypass the UI in these tests; e.g. do not call `updatePersistedState` to change UI state. Go through the UI to trigger the desired behavior.
- These tests require `TEST_INTEGRATION=1`; use the `shouldRunIntegrationTests()` guard.
- Only call `validateApiKeys()` in tests that actually make AI API calls. Pure UI interaction tests (clicking buttons, selecting items) don't need API keys.
Happy-dom Limitations
- Radix Portal content renders in `document.body`; happy-dom doesn't place it under `view.container`, so queries scoped to the app root will miss dialog/popover content.
- Workaround: For portal-based UI (Dialog/Popover/Tooltip), query `view.container.ownerDocument.body` (or `within(document.body)`) and drive interactions there. Prefer `userEvent` for typing to ensure controlled inputs update.
- If portal content still doesn't appear due to missing browser APIs, prefer conditional rendering (`{isOpen && <div>...}`) like `AgentModePicker`, or fall back to `tests/e2e` (~2min startup time).
Test Helper Conventions
- Query elements within `view.container` (not `document.body`) when using non-portal components.
- Use `waitFor()` with explicit error messages to aid debugging: `if (!el) throw new Error("Element not found")`.
- Name helpers after user actions: `openBaseSelectorDropdown()`, `selectSuggestion()`, not implementation details.
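A lookup that fails with a descriptive message, as recommended above, might look like the sketch below. `requireElement` and `QueryRoot` are hypothetical names; `QueryRoot` is a minimal structural stand-in for a DOM container such as `view.container`, used here so the example runs without a DOM.

```typescript
// Minimal structural stand-in for a DOM container (assumption for this sketch).
interface QueryRoot {
  querySelector(selector: string): unknown;
}

// Returns the matching element or throws an error that says what was being
// looked for, so a failing wait reports more than "Element not found".
function requireElement<T>(root: QueryRoot, selector: string, description: string): T {
  const el = root.querySelector(selector);
  if (!el) {
    throw new Error(`${description} not found (selector: ${selector})`);
  }
  return el as T;
}

// Intended use inside a retrying wait in a UI test, roughly:
//   const button = await waitFor(() =>
//     requireElement<HTMLElement>(view.container, "button.send", "Send button"));
```

The description argument is what shows up in the failure output, which makes timeouts on slow CI runners easier to diagnose.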
IPC Tests (`tests/ipc`)
Strive to test the backend entirely via IPC interactions. Avoid directly asserting or modifying
backend state here.
Exceptions include:
- Building a large history to test otherwise expensive operations (e.g. long-context handling)
- Testing logic where an observable side-effect is a part of the API contract, e.g. `createProject` creating a project directory if it doesn't already exist.
Determinism
Strive to minimize raciness of tests. They run in a variety of environments, including bogged-down CI runners.
Prefer explicit synchronization over arbitrary sleeps.
When explicit synchronization is not feasible, use patterns such as `waitFor`, which can complete quickly in the common case.
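To show why a polling wait is preferable to a fixed sleep, here is a minimal sketch of the pattern. `waitForCondition` and `WaitOptions` are hypothetical names, not the `waitFor` from any particular library; the defaults are arbitrary values picked for the example.

```typescript
interface WaitOptions {
  timeoutMs?: number;
  intervalMs?: number;
}

// Polls `probe` until it returns a truthy value, then returns that value.
// Returns on the first successful probe, so the common case is fast, while
// slow environments get the full timeout before the wait fails.
async function waitForCondition<T>(
  probe: () => T | undefined | null | false,
  { timeoutMs = 5000, intervalMs = 20 }: WaitOptions = {},
): Promise<T> {
  const deadline = Date.now() + timeoutMs;
  for (;;) {
    const value = probe();
    if (value) return value;
    if (Date.now() >= deadline) {
      throw new Error(`Condition not met within ${timeoutMs}ms`);
    }
    await new Promise<void>((resolve) => setTimeout(resolve, intervalMs));
  }
}
```

A fixed `sleep(2000)` always costs two seconds and still fails on a runner that needs three; the polling form costs one interval when the condition is already true and tolerates slow runners up to the timeout.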
Exceptions
In some cases, due to infrastructure or performance constraints, we may opt to diverge from these guidelines.
In such cases, ensure the test code (or production code which lacks tests) is well commented with the rationale
behind the exception.