validate-ui

Archon Web UI — Comprehensive E2E Validation

Run exhaustive end-to-end browser automation tests and codebase review of the Archon Web UI. The goal: determine whether Archon is doing the best it possibly can to solve the problem of managing parallel agents, executing custom workflows, and providing full visibility into agent work.

Optional focus argument: `$ARGUMENTS` (e.g., "workflows", "chat", "projects"). If empty, run ALL sections.
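
A minimal sketch of how the focus argument could select suites. The keyword-to-suite mapping and the `select_suites` helper are assumptions made for illustration; the only behavior taken from this document is that an empty argument runs everything.

```bash
# Map the optional focus argument to the test suites to run.
# NOTE: the keyword-to-suite mapping is hypothetical, not defined by Archon.
select_suites() {
  local focus="${1:-}"
  case "$focus" in
    "")        echo "1 2 3 4 5 6 7 8 9" ;;  # empty focus: run ALL suites
    dashboard) echo "1" ;;
    projects)  echo "2 6" ;;
    chat)      echo "3 4" ;;
    workflows) echo "5" ;;
    *)         echo "unknown focus: $focus" >&2; return 1 ;;
  esac
}
```

For example, `select_suites "$ARGUMENTS"` could feed a loop that runs only the listed suites.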

Phase 0: Environment Setup

0.1 Kill Old Archon Processes

```bash
# Kill any running Archon dev servers (backend + frontend)
pkill -f "bun.*dev:server" 2>/dev/null || true
pkill -f "bun.*dev:web" 2>/dev/null || true
pkill -f "bun.*packages/server" 2>/dev/null || true
pkill -f "bun.*packages/web" 2>/dev/null || true
pkill -f "vite.*5173" 2>/dev/null || true

# Kill any leftover processes on our ports
lsof -ti:3090 | xargs kill -9 2>/dev/null || true
lsof -ti:5173 | xargs kill -9 2>/dev/null || true

# Wait for ports to free up
sleep 2

# Verify ports are free
! lsof -i:3090 && ! lsof -i:5173 && echo "Ports 3090 and 5173 are free" || echo "WARNING: Ports still in use"
```

0.2 Install agent-browser (if needed)

```bash
# Check if agent-browser is available
which agent-browser 2>/dev/null || npx agent-browser --version 2>/dev/null

# If not installed globally, install it:
npm install -g agent-browser && agent-browser install

# On WSL2/Linux, use --with-deps to get Chromium system dependencies:
agent-browser install --with-deps

# IMPORTANT: Do NOT use bunx. Bun skips postinstall scripts that agent-browser needs.
# Use npx or global npm install.
```

0.3 Start Archon Backend + Frontend

Start both services. Backend must be up before frontend SSE connections work.

```bash
# From the repo root: /path/to/archon

# Start backend (port 3090)
cd /path/to/archon && bun run dev:server &
sleep 5  # Wait for server initialization + DB

# Verify backend is healthy
curl -s http://localhost:3090/api/health | head -c 200

# Start frontend (port 5173)
cd /path/to/archon && bun run dev:web &
sleep 5  # Wait for Vite dev server

# Verify frontend is serving
curl -s http://localhost:5173 | head -c 200
```


**URLs:**
- Frontend: `http://localhost:5173`
- Backend API: `http://localhost:3090/api`
- SSE streams: `http://localhost:3090/api/stream/{conversationId}` (bypasses Vite proxy in dev)
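
The fixed `sleep 5` waits above can be flaky on slow machines. One possible alternative is to poll until each service answers. This is a sketch only: `wait_until` is a hypothetical helper, and the 30-second timeout is an arbitrary choice.

```bash
# Retry a command once per second until it succeeds or the timeout expires.
# wait_until is a hypothetical helper, not part of Archon or agent-browser.
wait_until() {
  local timeout="$1"; shift
  local deadline=$((SECONDS + timeout))
  until "$@" >/dev/null 2>&1; do
    if [ "$SECONDS" -ge "$deadline" ]; then
      return 1  # gave up: the command never succeeded in time
    fi
    sleep 1
  done
}

# Example usage once the servers have been started:
# wait_until 30 curl -sf http://localhost:3090/api/health || echo "backend not healthy"
# wait_until 30 curl -sf http://localhost:5173 || echo "frontend not serving"
```

Polling the backend first preserves the ordering requirement that the backend is up before frontend SSE connections are attempted.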

0.4 Seed Test Data (if needed)

Check if there are existing codebases and conversations. If empty, create test data:

```bash
# Check existing codebases
curl -s http://localhost:3090/api/codebases | python3 -m json.tool 2>/dev/null || curl -s http://localhost:3090/api/codebases

# Register the current repo as a codebase (if none exist)
curl -s -X POST http://localhost:3090/api/codebases \
  -H "Content-Type: application/json" \
  -d '{"path": "/path/to/archon"}'

# Create a test conversation
curl -s -X POST http://localhost:3090/api/conversations \
  -H "Content-Type: application/json" \
  -d '{}' | python3 -m json.tool 2>/dev/null
```

---

Phase 1: Browser Automation — End-to-End Testing

Use the `agent-browser` CLI for all browser interactions. Follow the snapshot-refs workflow:

- `agent-browser open <url>`: navigate
- `agent-browser snapshot -i`: get interactive elements with refs
- Interact using refs (click, fill, etc.)
- Re-snapshot after navigation or DOM changes

Take screenshots at each major test point: `agent-browser screenshot /tmp/archon-test-{name}.png`
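
The workflow above can be wrapped in a small helper so every test point follows the same sequence. This is a sketch under assumptions: `capture_page` and the overridable `AB` variable are hypothetical, and refs differ per page, so the interaction step is left as a comment.

```bash
# Runner is overridable so the sequence can be dry-run, e.g. AB=echo.
AB="${AB:-agent-browser}"

# Open a page, list interactive elements with refs, and take a named
# screenshot. capture_page is a hypothetical helper; the path follows the
# /tmp/archon-test-{name}.png convention.
capture_page() {
  local url="$1" name="$2"
  "$AB" open "$url" || return 1
  "$AB" snapshot -i || return 1
  # Interact here using a ref taken from the snapshot output, then
  # re-snapshot, because refs change after navigation or DOM updates.
  "$AB" screenshot "/tmp/archon-test-${name}.png"
}
```

Running `AB=echo capture_page http://localhost:5173 dashboard` prints the three commands' arguments without driving a browser, which is a cheap way to check the sequence.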

Test Suite 1: Dashboard (Route: `/`)

1.1 Initial Load

- Open `http://localhost:5173`
- Verify dashboard renders: stats cards (Running Workflows, Conversations, System Status)
- Check system health indicator shows "Healthy" (green)
- Screenshot the full dashboard

1.2 Stats Accuracy

- Compare "Running Workflows" count against `GET /api/workflows/runs?status=running`
- Compare "Conversations" count against `GET /api/conversations`
- Verify numbers update after creating new data

1.3 Recent Items

- Verify "Recent Conversations" list shows up to 10 items
- Verify "Recent Workflow Runs" list shows up to 10 items
- Click a conversation and verify navigation to `/chat/{id}`
- Click a workflow run and verify navigation to `/workflows/runs/{id}`
- Use the browser back button and verify return to the dashboard

1.4 Empty State

- If no conversations/runs exist: verify the empty state with "New Chat" CTA renders
- Click "New Chat" from the empty state and verify navigation to `/chat`

Test Suite 2: Project Management

2.1 Add Project (GitHub URL)

- Click the `+` button next to "Projects" in the sidebar
- Fill in a GitHub URL (e.g., `https://github.com/anthropics/claude-code`)
- Submit and verify the project appears in the sidebar
- Verify the project is auto-selected

2.2 Add Project (Local Path)

- Click `+` again
- Fill in a local path (e.g., `/path/to/archon`)
- Submit and verify the project appears
- Verify deduplication: if the path was already registered, it should not create a duplicate

2.3 Select/Deselect Project

- Click a project in the sidebar and verify it becomes selected (highlighted)
- Verify the sidebar content switches to the `ProjectDetail` view (shows project name, repo URL, conversations scoped to project, workflow runs)
- Click "All Projects" and verify the sidebar switches to `AllConversationsView` (all conversations, no project filter)
- Verify `localStorage` persists the selection across page refresh

2.4 Delete Project

- Hover over a project and verify the trash icon appears
- Click trash and verify the confirmation dialog appears
- Confirm deletion and verify the project is removed from the list
- Verify conversations and runs associated with the project are handled gracefully

2.5 Project Selector in Collapsible

- When a project is selected, verify the collapsible header shows the project name
- Click the chevron to expand and verify other projects are listed
- Switch projects via the collapsible and verify the view updates

Test Suite 3: Chat Interface

3.1 New Chat (No Project)

- Click "New Chat" in sidebar (with no project selected)
- Verify empty chat interface renders with message input
- Type a message and send
- Verify: user message appears right-aligned, assistant "thinking" dots appear
- Verify: conversation is created and URL updates to `/chat/{conversationId}`
- Verify: conversation appears in sidebar

3.2 New Chat (With Project)

- Select a project first
- Click "New Chat"
- Send a message (e.g., `/status`)
- Verify: conversation is scoped to the selected project
- Verify: project context (cwd, codebase) is attached

3.3 Slash Commands

- Send `/status` and verify the response shows session status
- Send `/help` and verify help text renders in markdown
- Send `/commands` and verify the command list renders
- Send `/getcwd` and verify the working directory is shown
- Verify: commands execute instantly (no "thinking" animation needed)

3.4 Message Rendering

- Send a message that triggers a markdown response from the AI
- Verify: code blocks render with syntax highlighting
- Verify: tables render properly in assistant messages
- Verify: links open in new tabs (`target="_blank"`)
- Verify: blockquotes render with left border
- Verify: inline code renders with monospace font
- Send a very long message and verify no layout overflow

3.5 Streaming & Real-time Updates

- Send a message that triggers an AI response
- Verify: blinking cursor appears during streaming
- Verify: text appears incrementally (not all at once)
- Verify: lock indicator shows "Agent is working..."
- Verify: lock indicator hides when response completes
- Verify: message `isStreaming` flag clears on completion

3.6 Tool Call Cards

- Send a message that triggers tool usage (e.g., a code question in a project context)
- Verify: tool call cards appear below the assistant message
- Verify: card shows tool name and input summary
- Click to expand a tool card and verify full input JSON and output render
- Verify: running tools show spinner animation and primary border
- Verify: completed tools show duration badge
- Test "Show N more lines" for long tool outputs

3.7 Error Handling

- Trigger an error condition (e.g., send a message with no AI credentials configured)
- Verify: error card renders with AlertCircle icon
- Verify: error classification badge shows (transient/fatal)
- Verify: suggested actions are listed
- Verify: the chat remains functional after an error

3.8 Queue Position

- If possible, trigger multiple concurrent messages to the same conversation
- Verify: queue position indicator appears ("Position N in queue")
- Verify: the lock indicator updates when the queue advances

3.9 Auto-scroll Behavior

- Scroll up during a streaming response
- Verify: auto-scroll stops (respects user scroll position)
- Verify: "Jump to bottom" button appears
- Click "Jump to bottom" and verify scroll snaps to the latest message
- Scroll back to bottom manually and verify auto-scroll resumes

3.10 Conversation Navigation

- Create multiple conversations
- Click between them in the sidebar
- Verify: each conversation loads its own message history
- Verify: messages are not leaked between conversations
- Verify: the correct conversation is highlighted in the sidebar

Test Suite 4: Conversation Management

4.1 Rename Conversation

- Hover over a conversation in the sidebar and verify the pencil icon appears
- Click pencil and verify the inline edit input appears
- Type a new title and press Enter
- Verify: title updates in sidebar and in the chat header
- Press Escape during rename and verify it cancels without saving

4.2 Delete Conversation

- Hover over a conversation and verify the trash icon appears
- Click trash and verify the confirmation dialog appears
- Confirm deletion and verify the conversation is removed
- If the deleted conversation was active: verify redirect to `/`
- Verify: soft-delete (conversation is hidden, not destroyed)

4.3 Auto-title

- Create a new conversation and send a non-command message
- Wait 2-3 seconds
- Verify: the conversation title updates automatically based on the first message
- Verify: title is truncated to ~80 characters

4.4 Search

- Type in the sidebar search bar
- Verify: conversations are filtered by title match
- Clear search and verify all conversations reappear
- Press the `/` key and verify the search input focuses
- Press Escape and verify search clears

Test Suite 5: Workflow Management

5.1 Workflow List Page (`/workflows`)

- Navigate to `/workflows`
- Verify: "Available Workflows" tab shows all discovered workflows
- Verify: each workflow card shows name and description
- Verify: "Recent Runs" tab shows recent workflow runs
- Verify: running workflows show a pulsing dot on the "Recent Runs" tab label

5.2 Invoke Workflow from Workflows Page

- Click on a workflow card (e.g., `archon-assist`)
- Verify: inline run panel expands with project selector and message input
- Select a project from the dropdown
- Type a message and click "Run"
- Verify: conversation is created and navigation goes to `/chat/{conversationId}`
- Verify: workflow execution begins (messages appear from the AI)

5.3 Invoke Workflow from Sidebar (WorkflowInvoker)

- Select a project in the sidebar
- Verify: workflow dropdown appears in the `ProjectDetail` view
- Select a workflow from the dropdown
- Type a message and submit
- Verify: new conversation created, navigation to chat, workflow runs

5.4 Workflow Router (Agent Orchestrator)

- In a project chat, send a natural language message (e.g., "Help me understand the authentication flow")
- Verify: the router detects the intent and routes to the appropriate workflow
- Verify: workflow dispatch status message appears (e.g., "Dispatching workflow: archon-assist (background)")
- Verify: `WorkflowDispatchInline` badge appears with spinner
- Verify: clicking the dispatch badge navigates to the workflow run or worker conversation

5.5 Workflow Progress in Chat

- While a workflow is running, verify `WorkflowProgressCard` appears in the chat
- Verify: compact mode shows workflow name, step count, elapsed time
- Verify: elapsed timer updates every second
- Click "Open Full View" and verify navigation to `/workflows/runs/{runId}`
- Verify: returning to chat still shows the progress card

5.6 Workflow Execution Page (`/workflows/runs/:runId`)

- Navigate to an active or completed workflow run
- Verify: header shows workflow name, status, and elapsed time
- Verify: step progress panel (left side) shows all steps with status icons
- Click different steps and verify the log panel (right side) updates
- Verify: "Chat" link back to parent conversation works
- For dispatched workflows: verify `WorkflowLogs` renders the worker conversation messages

5.7 Parallel Agent Steps

- Run a workflow with parallel agents (e.g., `archon-comprehensive-pr-review` has 5 parallel agents)
- Verify: `ParallelBlockView` renders showing parent step and nested agent list
- Verify: each agent shows its own status (pending/running/completed/failed)
- Verify: overall block status derives correctly (any failed = failed, any running = running, all complete = complete)
- Verify: progress counter shows `(completed/total agents)`
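
To know what overall status to expect for a given mix of agent statuses, the derivation rule in 5.7 can be written out as a small reference function. This is a sketch, not the `ParallelBlockView` implementation: `derive_block_status` is a hypothetical helper, and reporting "pending" when nothing has failed, is running, or everything has completed is an assumption.

```bash
# Derive a parallel block's status from its agents' statuses.
# Precedence follows the stated rule: failed > running > completed.
# Falling back to "pending" for every other mix is an assumption.
derive_block_status() {
  local status has_running=0 all_completed=1
  [ "$#" -gt 0 ] || { echo "pending"; return; }
  for status in "$@"; do
    case "$status" in
      failed)    echo "failed"; return ;;
      running)   has_running=1; all_completed=0 ;;
      completed) ;;
      *)         all_completed=0 ;;
    esac
  done
  if [ "$has_running" -eq 1 ]; then
    echo "running"
  elif [ "$all_completed" -eq 1 ]; then
    echo "completed"
  else
    echo "pending"
  fi
}
```

For instance, `derive_block_status completed running pending` yields `running`, which is what the block header should show at that moment.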

5.8 Loop Iterations

- Run a loop workflow (e.g., `archon-test-loop` or `archon-ralph-fresh`)
- Verify: `LoopIterationView` renders with iteration counter
- Verify: progress bar fills proportionally (current/max)
- Verify: each iteration shows status
- Verify: completion signal (`<promise>COMPLETE</promise>`) ends the loop
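
A quick way to compute the expected fill for the progress bar check is integer arithmetic on current/max. This sketch assumes the bar is capped at 100% and shows 0% when the maximum is not positive; `progress_percent` is a hypothetical helper, not the `LoopIterationView` code.

```bash
# Integer fill percentage for a loop progress bar.
# Capping at 100 and the zero-max guard are assumptions.
progress_percent() {
  local current="$1" max="$2" pct
  if [ "$max" -le 0 ]; then
    echo 0
    return
  fi
  pct=$(( current * 100 / max ))
  if [ "$pct" -gt 100 ]; then pct=100; fi
  if [ "$pct" -lt 0 ]; then pct=0; fi
  echo "$pct"
}
```

With 3 of 10 iterations done, `progress_percent 3 10` gives 30, so the bar should be roughly a third full.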

5.9 Workflow Artifacts

- After a workflow that produces artifacts (PR URLs, commits, branches) completes:
- Verify: `ArtifactSummary` component renders at the bottom
- Verify: URLs are clickable links opening in new tabs
- Verify: artifact type icons are correct (PR, Commit, Branch, File)

5.10 Workflow Stale Detection

- During a running workflow, if the SSE connection drops briefly:
- Verify: `stale` indicator appears on the workflow card
- Verify: polling fallback kicks in (checks every 15 seconds)
- Verify: stale state clears when fresh data arrives

5.11 Cancel Workflow

- While a workflow is running, look for a "Cancel" button
- If present: click and verify the workflow status changes to failed/cancelled
- If not present: note this as a UX gap

Test Suite 6: Project-Scoped Views

6.1 Project Detail: Conversations

- Select a project
- Verify: only conversations scoped to that project appear
- Create a new chat within the project
- Verify: the new conversation appears in the filtered list
- Verify: conversations from other projects are NOT shown

6.2 Project Detail: Workflow Runs

- Verify: workflow runs scoped to the selected project appear
- Verify: runs are sorted by priority: failed > running > completed
- Click a run and verify navigation to `/workflows/runs/{id}`
- Verify: conversation status dots show on conversations with active runs
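
To produce the expected ordering for the sort check in 6.2, a list of runs can be ranked by status. This is a sketch: the tab-separated `status<TAB>run_id` input format and the `sort_runs_by_priority` helper are assumptions, and ties keep their input order.

```bash
# Sort "status<TAB>run_id" lines: failed first, then running, then completed.
# The input format is hypothetical; unknown statuses sort last.
sort_runs_by_priority() {
  local status run_id rank
  while IFS=$'\t' read -r status run_id; do
    case "$status" in
      failed)    rank=0 ;;
      running)   rank=1 ;;
      completed) rank=2 ;;
      *)         rank=3 ;;
    esac
    printf '%s\t%s\t%s\n' "$rank" "$status" "$run_id"
  done | sort -s -t $'\t' -k1,1n | cut -f2-
}
```

Feeding it the statuses and IDs read from the API gives the order the sidebar list should match.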

6.3 Cross-Project Navigation

- Start a workflow in Project A
- Switch to Project B in the sidebar
- Verify: Project A's workflow is not visible in Project B's view
- Switch back to Project A and verify the workflow run is still visible
- Click "All Projects" and verify you can see conversations from all projects

Test Suite 7: SSE & Real-time Infrastructure

7.1 SSE Connection

- Open browser DevTools Network tab (via `agent-browser eval` or console)
- Verify: EventSource connection to `/api/stream/{conversationId}` is established
- Verify: heartbeat events arrive every ~30 seconds
- Verify: connection state is OPEN (readyState 1)
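
The stream can also be observed outside the browser by piping it through a small counter. In the SSE wire format a message consists of one or more `data:` lines terminated by a blank line. This sketch counts such messages; `count_sse_messages` is a hypothetical helper, and whether Archon sends heartbeats as `data:` messages or as `:` comment lines is not stated here, so heartbeats may or may not be counted.

```bash
# Count SSE messages on stdin: a message has at least one "data:" line
# and ends at a blank line. Comment lines (starting with ":") are ignored.
count_sse_messages() {
  awk 'BEGIN { n = 0; in_msg = 0 }
       /^data:/ { in_msg = 1 }
       /^$/     { if (in_msg) n++; in_msg = 0 }
       END      { if (in_msg) n++; print n }'
}

# Example usage (CONVERSATION_ID is a placeholder you must set):
# curl -sN --max-time 65 "http://localhost:3090/api/stream/$CONVERSATION_ID" | count_sse_messages
```

A 65-second window is chosen only so that at least two ~30-second heartbeats would fit if they are sent as data messages.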

7.2 SSE Reconnection

- Kill the backend server temporarily
- Verify: the UI shows a disconnected state (grey dot in header)
- Restart the backend
- Verify: SSE reconnects automatically
- Verify: the connection indicator turns green again
- Verify: buffered messages are delivered on reconnect

7.3 Multiple Tabs

- Open the same conversation in two browser tabs (use `agent-browser --session` for parallel)
- Send a message from tab 1
- Check which tabs stream the response
- Note: the web adapter replaces old streams on new connections (stream registry replacement), so only the latest tab is expected to get live SSE

Test Suite 8: UI/UX Quality Audit

8.1 Visual Hierarchy & Dark Theme

- Screenshot the full app at different states
- Verify: text hierarchy (primary/secondary/tertiary) is readable
- Verify: interactive elements have clear hover states
- Verify: accent colors (blue-purple) are used consistently
- Verify: success (green), warning (amber), error (red) colors are correct
- Verify: borders and dividers create clear visual separation

8.2 Loading States

- Observe loading states when:
  - Dashboard is loading
  - Conversation messages are loading
  - Workflows list is loading
  - Workflow runs are fetching
- Verify: all loading states show appropriate feedback (spinners, skeletons, or text)
- Verify: no blank/flash-of-unstyled-content moments

8.3 Empty States

- Check empty states for:
  - No conversations (dashboard + sidebar)
  - No projects registered
  - No workflows available
  - No workflow runs
  - No messages in a conversation
- Verify: each empty state has a helpful message and CTA

8.4 Responsiveness

Set viewport to different sizes:

```bash
agent-browser set viewport 1920 1080  # Desktop
agent-browser set viewport 1366 768   # Laptop
agent-browser set viewport 1024 768   # Tablet landscape
agent-browser set viewport 768 1024   # Tablet portrait
agent-browser set viewport 375 812    # Mobile
```

At each size: screenshot and check for layout breakage, overflow, truncation
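
The per-size screenshot step can be looped rather than repeated by hand. A minimal sketch, assuming a hypothetical `screenshot_viewports` helper and a file-name pattern derived from the `/tmp/archon-test-{name}.png` convention; the `AB` variable only exists so the loop can be dry-run.

```bash
# Runner is overridable so the loop can be dry-run, e.g. AB=echo.
AB="${AB:-agent-browser}"

# Set each viewport size in turn and screenshot the current page.
# The viewport-<size> file naming is an assumption.
screenshot_viewports() {
  local size width height
  for size in 1920x1080 1366x768 1024x768 768x1024 375x812; do
    width="${size%x*}"    # text before the "x"
    height="${size#*x}"   # text after the "x"
    "$AB" set viewport "$width" "$height" || return 1
    "$AB" screenshot "/tmp/archon-test-viewport-${size}.png" || return 1
  done
}
```

Call it after navigating to the page under test; the five resulting files can then be reviewed side by side for overflow or truncation.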

8.5 Sidebar Resize

- Drag the sidebar resize handle
- Verify: sidebar width changes smoothly (240-400px range)
- Verify: width persists in localStorage across refresh
- Verify: content reflows properly at different sidebar widths
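
When dragging past either end of the range, the expected width is the nearest bound. The sketch below states that expectation as code; `clamp_sidebar_width` is a hypothetical helper, and clamping (rather than rejecting the drag) is an assumption about how the 240-400px range is enforced.

```bash
# Expected sidebar width after a drag, clamped to the 240-400px range.
# Clamping behavior is an assumption about the UI.
clamp_sidebar_width() {
  local width="$1" min=240 max=400
  if [ "$width" -lt "$min" ]; then
    echo "$min"
  elif [ "$width" -gt "$max" ]; then
    echo "$max"
  else
    echo "$width"
  fi
}
```

Comparing the value stored in localStorage after each drag against this expectation makes the range check mechanical.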

8.6 Keyboard Navigation

- Press `/` and verify search focuses
- Press `Escape` and verify search clears
- Press `Enter` in message input and verify it sends the message
- Press `Shift+Enter` and verify it inserts a newline (does NOT send)
- Tab through interactive elements and verify focus order is logical

8.7 Copy/Clipboard

- Click the working directory path in the chat header
- Verify: path copies to clipboard
- Verify: visual feedback (tooltip or flash) indicates copy succeeded

8.8 External Links

- Click "Open in IDE" button (VSCode link)
- Verify: `vscode://file/...` URL is constructed correctly
- Click links in assistant messages and verify they open in new tabs

Test Suite 9: Edge Cases & Stress Tests

9.1 Rapid Message Sending

- Send multiple messages in quick succession (before previous responses complete)
- Verify: messages are queued properly (no duplicate or lost messages)
- Verify: lock indicator shows queue position
- Verify: responses arrive in order

9.2 Long Content

- Send a message that produces very long output (e.g., "List all files in the project")
- Verify: markdown renders without layout overflow
- Verify: code blocks have horizontal scroll
- Verify: `WorkflowResultCard` truncation works (500 chars / 8 lines with "Show more")
- Verify: tool call output truncation works (20 lines shown, expandable)
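
To decide whether a given result should show "Show more", the limits can be checked directly on the text. This sketch reads "500 chars / 8 lines" as "either limit exceeded", which is an assumption; `needs_truncation` is a hypothetical helper, not the `WorkflowResultCard` logic.

```bash
# Succeed when text exceeds the character or line limit, i.e. when a
# "Show more" control would be expected. Defaults: 500 chars, 8 lines.
# Treating the two limits as "either one" is an assumption.
needs_truncation() {
  local text="$1" max_chars="${2:-500}" max_lines="${3:-8}"
  local line_count
  line_count=$(( $(printf '%s\n' "$text" | wc -l) ))
  [ "${#text}" -gt "$max_chars" ] || [ "$line_count" -gt "$max_lines" ]
}
```

Passing `20` as the third argument and a very large second argument adapts the same check to the 20-line tool output limit.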

9.3 Special Characters

- Send messages with special characters: `<script>alert('xss')</script>`, markdown chars `*_[]()`, emoji
- Verify: no XSS vulnerability (HTML is escaped)
- Verify: markdown renders correctly
- Verify: emoji displays properly
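
One way to check escaping is to compute what the payload looks like when rendered as inert text and search the page's serialized HTML for that form. The sketch below escapes only `&`, `<`, and `>`, since those are the characters escaped when DOM text nodes are serialized; `html_escape` is a hypothetical helper, and how you obtain the page HTML is left to the test.

```bash
# Escape the characters that HTML serialization escapes inside text nodes.
# "&" must be replaced first so already-produced entities are not re-escaped.
html_escape() {
  sed -e 's/&/\&amp;/g' -e 's/</\&lt;/g' -e 's/>/\&gt;/g'
}

# Example usage:
# printf '%s' "<script>alert('xss')</script>" | html_escape
```

If the escaped form appears in the message markup and no `<script>` element was created inside the message bubble, the payload was treated as text.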

9.4 Browser Refresh During Streaming

- While AI is streaming a response, refresh the page
- Verify: on reload, historical messages are loaded from the API
- Verify: any in-progress response is not lost (persisted segments appear)
- Verify: SSE reconnects and picks up new events

9.5 Concurrent Workflows

- Launch 2-3 workflows simultaneously (different projects or same project)
- Verify: each workflow tracks independently
- Verify: workflow progress cards in respective chats are correct
- Verify: no cross-contamination of events between workflows

9.6 Network Latency

- Add artificial network latency if possible
- Verify: UI remains responsive during slow responses
- Verify: loading indicators appear for slow API calls
- Verify: no timeout errors in normal usage

Phase 2: Codebase Review

Read the source code of every component and module listed below. For each, evaluate:

- **Correctness**: Are there logic bugs, race conditions, or broken state transitions?
- **UX quality**: Does the component provide good feedback, handle edge cases, feel polished?
- **Performance**: Are there unnecessary re-renders, missing memoization, or expensive operations?
- **Accessibility**: Are interactive elements properly labeled? Keyboard navigable?
- **Error handling**: Are errors caught, displayed, and recoverable?

Frontend Files to Review

File	Focus Areas
`packages/web/src/App.tsx`	Route config, error boundary, QueryClient settings
`packages/web/src/components/chat/ChatInterface.tsx`	SSE handler correctness, message state management, new-chat flow, workflow dispatch handling
`packages/web/src/components/chat/MessageList.tsx`	Auto-scroll, WorkflowDispatchInline polling, WorkflowResultCard truncation
`packages/web/src/components/chat/MessageBubble.tsx`	Markdown rendering, streaming cursor, thinking dots
`packages/web/src/components/chat/ToolCallCard.tsx`	Expand/collapse, running state animation, output truncation
`packages/web/src/components/chat/WorkflowProgressCard.tsx`	Timer accuracy, compact vs full mode, stale indicator
`packages/web/src/components/chat/LockIndicator.tsx`	Show/hide transitions, queue position display
`packages/web/src/components/chat/MessageInput.tsx`	Enter vs Shift+Enter, auto-resize, disabled state
`packages/web/src/components/layout/Sidebar.tsx`	Resize drag, project add flow, search, new chat, localStorage persistence
`packages/web/src/components/sidebar/ProjectDetail.tsx`	Scoped queries, conversation status dots, workflow run sorting
`packages/web/src/components/sidebar/WorkflowInvoker.tsx`	Workflow fetch, create conversation + run flow, error handling
`packages/web/src/components/sidebar/AllConversationsView.tsx`	Search filtering, codebase map construction, "New Chat"
`packages/web/src/components/sidebar/ProjectSelector.tsx`	Delete confirmation, "All Projects" button
`packages/web/src/components/conversations/ConversationItem.tsx`	Rename inline edit, delete flow, active state highlighting
`packages/web/src/components/workflows/WorkflowList.tsx`	Two-tab layout, inline run panel, running indicator pulse
`packages/web/src/components/workflows/WorkflowExecution.tsx`	Initial data reconstruction from events, live SSE overlay, worker vs parent flows
`packages/web/src/components/workflows/WorkflowLogs.tsx`	Read-only chat view, SSE handlers, message filtering by timestamp
`packages/web/src/components/workflows/StepProgress.tsx`	Step list rendering, parallel block delegation, active step highlight
`packages/web/src/components/workflows/StepLogs.tsx`	Virtual scrolling, auto-scroll, metadata header
`packages/web/src/components/workflows/ArtifactSummary.tsx`	Artifact type icons, URL links, path display
`packages/web/src/components/workflows/LoopIterationView.tsx`	Progress bar math, max iteration capping
`packages/web/src/components/workflows/ParallelBlockView.tsx`	Overall status derivation, nested agent list
`packages/web/src/hooks/useSSE.ts`	Text batching (50ms flush), reconnection, handler ref stability
`packages/web/src/hooks/useWorkflowStatus.ts`	Workflow state map, polling fallback (15s), stale detection
`packages/web/src/hooks/useAutoScroll.ts`	Scroll threshold (50px), user scroll-up detection
`packages/web/src/lib/api.ts`	SSE_BASE_URL calculation, error handling, 404 swallowing
`packages/web/src/lib/types.ts`	SSEEvent union completeness, ChatMessage fields, WorkflowState shape
`packages/web/src/lib/message-cache.ts`	Cache key correctness, memory management
`packages/web/src/contexts/ProjectContext.tsx`	Stale project ID cleanup, codebase polling interval
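
When reviewing `useAutoScroll.ts`, it can help to have a reference for what a 50px scroll threshold check typically looks like. The following is a minimal sketch under assumptions, not the hook's actual code: the `ScrollMetrics` interface and the `isNearBottom` name are hypothetical, and only the 50px value comes from the table above.

```typescript
// Hypothetical helper for illustration; the real useAutoScroll.ts may differ.
interface ScrollMetrics {
  scrollTop: number;    // pixels scrolled from the top
  scrollHeight: number; // total scrollable content height
  clientHeight: number; // visible viewport height
}

// 50px is the threshold named in the review table.
const SCROLL_THRESHOLD_PX = 50;

// Returns true when the viewport is within `threshold` pixels of the bottom,
// which is the usual condition for continuing to auto-scroll.
function isNearBottom(
  metrics: ScrollMetrics,
  threshold: number = SCROLL_THRESHOLD_PX
): boolean {
  const distanceFromBottom =
    metrics.scrollHeight - metrics.scrollTop - metrics.clientHeight;
  return distanceFromBottom <= threshold;
}
```

If the hook's logic does not reduce to something like this, check how it distinguishes a user scrolling up from content growing underneath the viewport.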

Backend Files to Review

File	Focus Areas
`packages/server/src/routes/api.ts`	Endpoint correctness, CORS, SSE heartbeat loop, workflow run endpoint, codebase deduplication
`packages/server/src/adapters/web.ts`	sendMessage category filtering, structured event handling, lock event flushing
`packages/server/src/adapters/web/persistence.ts`	Segment splitting logic, tool call duration tracking, flush timing, 50-segment cap
`packages/server/src/adapters/web/transport.ts`	Stream replacement race condition fix, buffer limits (100 msg / 200 conv), zombie reaper
`packages/server/src/adapters/web/workflow-bridge.ts`	Event mapping completeness, bridge subscription lifecycle, parent forwarding
`packages/core/src/orchestrator/orchestrator.ts`	Router prompt construction, background dispatch fire-and-forget, isolation resolution
`packages/core/src/workflows/executor.ts`	Stale workflow detection (15min), step session continuity, parallel Promise.all, loop completion signal
`packages/core/src/workflows/router.ts`	Case-insensitive matching, multiline regex, fallback behavior
`packages/core/src/workflows/event-emitter.ts`	Listener error isolation, max listener cap, run registration lifecycle
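
"Listener error isolation" in the last row means that one throwing subscriber should not prevent the others from receiving an event. The sketch below shows the general pattern to compare against; the `IsolatedEmitter` class and its members are hypothetical and are not taken from `event-emitter.ts`.

```typescript
// Hypothetical emitter for illustration; not Archon's actual implementation.
type Listener<E> = (event: E) => void;

class IsolatedEmitter<E> {
  private listeners = new Set<Listener<E>>();
  // Errors thrown by listeners are collected instead of propagating.
  readonly errors: unknown[] = [];

  // Registers a listener and returns an unsubscribe function.
  subscribe(listener: Listener<E>): () => void {
    this.listeners.add(listener);
    return () => {
      this.listeners.delete(listener);
    };
  }

  // Delivers the event to every listener; returns how many listeners threw.
  emit(event: E): number {
    let failures = 0;
    // Copy first so listeners that unsubscribe mid-emit do not affect iteration.
    for (const listener of Array.from(this.listeners)) {
      try {
        listener(event);
      } catch (err) {
        failures += 1;
        this.errors.push(err);
      }
    }
    return failures;
  }
}
```

During review, confirm that the real emitter wraps each listener call individually and that unsubscribing releases the reference, since a leaked subscription defeats any listener cap.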

Review Checklist

For every file reviewed, note findings in these categories:

Bugs — Logic errors, race conditions, state inconsistencies, crashes
UX Issues — Missing feedback, confusing interactions, unclear states, dead ends
Performance — Unnecessary re-renders, missing React.memo/useMemo/useCallback, expensive computations in render
Accessibility — Missing ARIA labels, focus management gaps, screen reader issues
Error Handling — Unhandled promise rejections, missing try/catch, silent failures
Code Quality — Dead code, TODOs, inconsistent patterns, missing types
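
One way to keep findings consistent across files is to record them in a fixed shape and group them by category before writing the report. This is only a sketch: the `Finding` interface and `groupByCategory` function are hypothetical, and the category names are copied from the checklist above.

```typescript
// Categories copied from the review checklist; everything else is illustrative.
const CATEGORIES = [
  "Bugs",
  "UX Issues",
  "Performance",
  "Accessibility",
  "Error Handling",
  "Code Quality",
] as const;

type Category = (typeof CATEGORIES)[number];

interface Finding {
  file: string;      // path of the reviewed file
  category: Category;
  note: string;      // short description of the finding
}

// Groups findings by category, keeping an empty list for categories with none
// so the report can state explicitly that nothing was found.
function groupByCategory(findings: Finding[]): Record<Category, Finding[]> {
  const grouped = {} as Record<Category, Finding[]>;
  for (const category of CATEGORIES) {
    grouped[category] = [];
  }
  for (const finding of findings) {
    grouped[finding.category].push(finding);
  }
  return grouped;
}
```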

Phase 3: Report

After completing all tests and reviews, produce a structured report:

Report Format

Archon Web UI Validation Report

Date: {date}
Tester: Claude Code (agent-browser + codebase review)
Archon Version: {git commit hash}
Screenshots: /tmp/archon-test-*.png

Executive Summary

{2-3 sentences: overall quality assessment, critical issues count, UX rating}

Critical Bugs (P0)

{Bugs that break core functionality or lose data}

Major Issues (P1)

{Issues that significantly degrade the experience}

Minor Issues (P2)

{Polish items, edge cases, visual inconsistencies}

UX Recommendations

{Suggestions for improving the user experience, including "could be better" items as well as bugs}

Accessibility Findings

{Keyboard nav gaps, ARIA issues, contrast problems}

Performance Observations

{Slow renders, unnecessary work, optimization opportunities}

Codebase Quality Notes

{Dead code, inconsistencies, architectural concerns}

What's Working Well

{Positive findings: features that are solid, patterns that are good}

Detailed Test Results

Dashboard Tests

Test	Status	Notes
1.1 Initial Load	PASS/FAIL	...
...

Project Management Tests

...

Chat Interface Tests

...

Workflow Management Tests

...

Key Question to Answer

Is Archon currently doing the best it possibly can to solve the problem of managing a lot of agents in parallel and executing custom workflows with full visibility?

Specifically evaluate:

Can users easily see what all their agents are doing at a glance?
Is workflow status visible and understandable without clicking through multiple pages?
Can users quickly navigate between the orchestrator chat, individual workflow runs, and task logs?
Is the experience of kicking off a workflow through the router intuitive?
Are parallel agents presented clearly with their individual status?
Does the UI surface errors and issues prominently enough?
Is the overall information architecture logical for someone managing 5-10 concurrent agents?

Execution Notes

Run all `agent-browser` commands via the Bash tool
Use `npx agent-browser` if not installed globally
After each navigation, re-snapshot (`agent-browser snapshot -i`) to get fresh refs
Take screenshots liberally and save them to `/tmp/archon-test-{section}-{name}.png`
If a test fails, document it immediately and continue to the next test
Use `agent-browser wait --load networkidle` after actions that trigger API calls
For SSE testing, use `agent-browser eval` to check EventSource state
Remember: WSL2 headless mode works fine, so no display server is needed
Close the browser session when done: `agent-browser close`
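
The screenshot naming convention in the notes above is easier to keep consistent with a small path builder. The helper below is a hypothetical sketch: only the `/tmp/archon-test-{section}-{name}.png` pattern comes from the notes, while the function name and the lowercase, hyphenated slug rule are assumptions.

```typescript
// Hypothetical helper; only the path pattern is taken from the notes above.
function screenshotPath(section: string, name: string): string {
  // Assumed slug rule: lowercase, non-alphanumeric runs become single hyphens,
  // and leading or trailing hyphens are trimmed.
  const slug = (value: string): string =>
    value
      .trim()
      .toLowerCase()
      .replace(/[^a-z0-9]+/g, "-")
      .replace(/^-+|-+$/g, "");
  return `/tmp/archon-test-${slug(section)}-${slug(name)}.png`;
}
```

Consistent names keep the `/tmp/archon-test-*.png` glob referenced in the report header matching every screenshot taken during the run.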
