dispatching-parallel-agents

Dispatching Parallel Agents

Overview

When you have multiple unrelated failures (different test files, different subsystems, different bugs), investigating them sequentially wastes time. Each investigation is independent and can happen in parallel.
Core principle: Dispatch one agent per independent problem domain. Let them work concurrently.

When to Use

dot
digraph when_to_use {
    "Multiple failures?" [shape=diamond];
    "Are they independent?" [shape=diamond];
    "Single agent investigates all" [shape=box];
    "One agent per problem domain" [shape=box];
    "Can they work in parallel?" [shape=diamond];
    "Sequential agents" [shape=box];
    "Parallel dispatch" [shape=box];

    "Multiple failures?" -> "Are they independent?" [label="yes"];
    "Are they independent?" -> "Single agent investigates all" [label="no - related"];
    "Are they independent?" -> "One agent per problem domain" [label="yes"];
    "One agent per problem domain" -> "Can they work in parallel?";
    "Can they work in parallel?" -> "Parallel dispatch" [label="yes"];
    "Can they work in parallel?" -> "Sequential agents" [label="no - shared state"];
}
Use when:
  • 3+ test files failing with different root causes
  • Multiple subsystems broken independently
  • Each problem can be understood without context from others
  • No shared state between investigations
Don't use when:
  • Failures are related (fixing one might fix others)
  • You need to understand the full system state
  • Agents would interfere with each other
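The decision flow above can be sketched as a small function. This is only an illustration: the `Failure` shape, the `touches` field, and the `related` flag are assumptions made for the example, and deciding whether failures are related still takes human (or agent) judgment.

```typescript
// Hypothetical description of one failure; the field names are assumptions.
interface Failure {
  file: string;      // test file where the failure shows up
  domain: string;    // problem domain the failure belongs to
  touches: string[]; // files or resources a fix is likely to modify
}

type Strategy = "single-agent" | "sequential-agents" | "parallel-dispatch";

// Mirrors the flowchart: related failures stay with one agent, independent
// failures that share state run sequentially, and the rest run in parallel.
function chooseStrategy(failures: Failure[], related: boolean): Strategy {
  if (failures.length < 2 || related) return "single-agent";

  // Remember which domain first claimed each resource.
  const owner: Record<string, string> = Object.create(null);
  for (const failure of failures) {
    for (const resource of failure.touches) {
      const claimedBy: string | undefined = owner[resource];
      if (claimedBy !== undefined && claimedBy !== failure.domain) {
        return "sequential-agents"; // two domains would edit the same resource
      }
      owner[resource] = failure.domain;
    }
  }
  return "parallel-dispatch";
}
```

Keying shared state by domain rather than by file means several failures inside one domain can touch the same resource without forcing sequential work.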

The Pattern

1. Identify Independent Domains

Group failures by what's broken:
  • File A tests: Tool approval flow
  • File B tests: Batch completion behavior
  • File C tests: Abort functionality
Each domain is independent - fixing tool approval doesn't affect abort tests.
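As a first pass at this grouping, failing tests can be bucketed by file. The sketch below assumes a hypothetical `FailingTest` record; treating one file as one domain is only a starting heuristic, and each group still needs to be checked for real independence.

```typescript
// Hypothetical record for one failing test; the field names are assumptions.
interface FailingTest {
  file: string;
  name: string;
}

// Groups failing test names by file, treating each file as a candidate domain.
function groupByFile(failures: FailingTest[]): Record<string, string[]> {
  const groups: Record<string, string[]> = Object.create(null);
  for (const failure of failures) {
    const existing: string[] | undefined = groups[failure.file];
    if (existing) {
      existing.push(failure.name);
    } else {
      groups[failure.file] = [failure.name];
    }
  }
  return groups;
}
```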

2. Create Focused Agent Tasks

Each agent gets:
  • Specific scope: One test file or subsystem
  • Clear goal: Make these tests pass
  • Constraints: Don't change other code
  • Expected output: Summary of what you found and fixed
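One way to keep these four parts from being forgotten is to make them fields of a task record and render the prompt from it. The `AgentTask` type and `buildPrompt` helper below are hypothetical names for illustration, not part of any agent framework.

```typescript
// Hypothetical task record with the four parts listed above.
interface AgentTask {
  scope: string;          // one test file or subsystem
  goal: string;           // what "done" means
  constraints: string[];  // what the agent must not do
  expectedOutput: string; // what the agent should report back
}

// Renders a task as a plain-text prompt, one section per part.
function buildPrompt(task: AgentTask): string {
  const lines = [
    `Scope: ${task.scope}`,
    `Goal: ${task.goal}`,
    "Constraints:",
    ...task.constraints.map((constraint) => `- ${constraint}`),
    `Return: ${task.expectedOutput}`,
  ];
  return lines.join("\n");
}
```

Because every field is required, a task with a missing constraint list or output description fails type checking before any agent is dispatched.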

3. Dispatch in Parallel

typescript
// In Claude Code / AI environment
Task("Fix agent-tool-abort.test.ts failures")
Task("Fix batch-completion-behavior.test.ts failures")
Task("Fix tool-approval-race-conditions.test.ts failures")
// All three run concurrently
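The `Task(...)` lines above stand for agent tool calls rather than an ordinary TypeScript function. If you drive agents from your own code, the same idea might look like the sketch below. It assumes a hypothetical `runAgent` callback that takes a prompt and resolves with the agent's summary; nothing here is a real Claude Code API.

```typescript
// Hypothetical agent runner: takes a prompt, resolves with the agent's summary.
type RunAgent = (prompt: string) => Promise<string>;

interface AgentResult {
  prompt: string;
  ok: boolean;
  summary: string; // the agent's summary, or the error text on failure
}

async function dispatchParallel(
  prompts: string[],
  runAgent: RunAgent
): Promise<AgentResult[]> {
  // Start every agent before awaiting any of them, so they run concurrently.
  const pending = prompts.map(async (prompt): Promise<AgentResult> => {
    try {
      return { prompt, ok: true, summary: await runAgent(prompt) };
    } catch (error) {
      // One failing agent should not discard the other agents' results.
      return { prompt, ok: false, summary: String(error) };
    }
  });
  return Promise.all(pending);
}
```

Catching errors per agent means a single crashed investigation still leaves the other summaries available for review.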

4. Review and Integrate

When agents return:
  • Read each summary
  • Verify fixes don't conflict
  • Run full test suite
  • Integrate all changes

Agent Prompt Structure

Good agent prompts are:
  1. Focused - One clear problem domain
  2. Self-contained - All context needed to understand the problem
  3. Specific about output - What should the agent return?
markdown
Fix the 3 failing tests in src/agents/agent-tool-abort.test.ts:

1. "should abort tool with partial output capture" - expects 'interrupted at' in message
2. "should handle mixed completed and aborted tools" - fast tool aborted instead of completed
3. "should properly track pendingToolCount" - expects 3 results but gets 0

These are timing/race condition issues. Your task:

1. Read the test file and understand what each test verifies
2. Identify root cause - timing issues or actual bugs?
3. Fix by:
   - Replacing arbitrary timeouts with event-based waiting
   - Fixing bugs in abort implementation if found
   - Adjusting test expectations if testing changed behavior

Do NOT just increase timeouts - find the real issue.

Return: Summary of what you found and what you fixed.

Common Mistakes

❌ Too broad: "Fix all the tests" - agent gets lost ✅ Specific: "Fix agent-tool-abort.test.ts" - focused scope
❌ No context: "Fix the race condition" - agent doesn't know where ✅ Context: Paste the error messages and test names
❌ No constraints: Agent might refactor everything ✅ Constraints: "Do NOT change production code" or "Fix tests only"
❌ Vague output: "Fix it" - you don't know what changed ✅ Specific: "Return summary of root cause and changes"

When NOT to Use

  • Related failures: Fixing one might fix others - investigate together first
  • Need full context: Understanding requires seeing entire system
  • Exploratory debugging: You don't know what's broken yet
  • Shared state: Agents would interfere (editing same files, using same resources)

Real Example from Session

Scenario: 6 test failures across 3 files after major refactoring
Failures:
  • agent-tool-abort.test.ts: 3 failures (timing issues)
  • batch-completion-behavior.test.ts: 2 failures (tools not executing)
  • tool-approval-race-conditions.test.ts: 1 failure (execution count = 0)
Decision: Independent domains. Abort logic, batch completion, and tool-approval race conditions are separate from one another.
Dispatch:
Agent 1 → Fix agent-tool-abort.test.ts
Agent 2 → Fix batch-completion-behavior.test.ts
Agent 3 → Fix tool-approval-race-conditions.test.ts
Results:
  • Agent 1: Replaced timeouts with event-based waiting
  • Agent 2: Fixed event structure bug (threadId in wrong place)
  • Agent 3: Added wait for async tool execution to complete
Integration: All fixes independent, no conflicts, full suite green
Time saved: 3 problems solved in parallel instead of one after another

Key Benefits

  1. Parallelization - Multiple investigations happen simultaneously
  2. Focus - Each agent has a narrow scope and less context to track
  3. Independence - Agents don't interfere with each other
  4. Speed - 3 problems solved in roughly the time of the slowest one

Verification

After agents return:
  1. Review each summary - Understand what changed
  2. Check for conflicts - Did agents edit the same code?
  3. Run the full suite - Verify all fixes work together
  4. Spot check - Agents can make systematic errors, so inspect a sample of each agent's changes yourself
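The conflict check in step 2 can be partly automated if each agent reports the files it changed. The sketch below assumes a hypothetical `AgentReport` shape. It only flags files edited by more than one agent, so it complements the full test run rather than replacing it.

```typescript
// Hypothetical report shape: the files each agent says it changed.
interface AgentReport {
  agent: string;
  changedFiles: string[];
}

// Returns every file touched by more than one agent, with the agents involved.
function findConflicts(reports: AgentReport[]): Record<string, string[]> {
  const editors: Record<string, string[]> = Object.create(null);
  for (const report of reports) {
    for (const file of report.changedFiles) {
      const agents: string[] | undefined = editors[file];
      if (!agents) {
        editors[file] = [report.agent];
      } else if (agents.indexOf(report.agent) === -1) {
        agents.push(report.agent);
      }
    }
  }

  const conflicts: Record<string, string[]> = Object.create(null);
  for (const file of Object.keys(editors)) {
    if (editors[file].length > 1) conflicts[file] = editors[file];
  }
  return conflicts;
}
```

An empty result means no two agents reported the same file. Agents can still conflict through behavior, which is why the full suite is run afterwards.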

Real-World Impact

From a debugging session (2025-10-03):
  • 6 failures across 3 files
  • 3 agents dispatched in parallel
  • All investigations completed concurrently
  • All fixes integrated successfully
  • Zero conflicts between agent changes