agentic-quality-engineering

Agentic Quality Engineering

<default_to_action>
When implementing agentic QE or coordinating agents:
  1. SPAWN appropriate agent(s) for the task using the Task tool with an agent type
  2. CONFIGURE agent coordination (hierarchical/mesh/sequential)
  3. EXECUTE with PACT principles: Proactive analysis, Autonomous operation, Collaborative feedback, Targeted risk focus
  4. VALIDATE results through quality gates before deployment
  5. LEARN from outcomes: store patterns in the aqe/learning/* namespace
Quick Agent Selection:
  • Test generation needed → qe-test-generator
  • Coverage gaps → qe-coverage-analyzer
  • Quality decision → qe-quality-gate
  • Security scan → qe-security-scanner
  • Performance test → qe-performance-tester
  • Full pipeline → qe-fleet-commander
Critical Success Factors:
  • Agents amplify human expertise; they do not replace it
  • Keep a human in the loop for critical decisions
  • Measure: bugs caught, time saved, coverage improved
</default_to_action>

Quick Reference Card

When to Use

  • Designing autonomous testing systems
  • Scaling QE with intelligent agents
  • Implementing multi-agent coordination
  • Building CI/CD quality pipelines

PACT Principles

| Principle | Agent Behavior | Human Role |
| --- | --- | --- |
| Proactive | Analyze pre-merge, predict risk | Set guardrails |
| Autonomous | Execute tests, fix flaky tests | Review critical |
| Collaborative | Multi-agent coordination | Provide context |
| Targeted | Risk-based prioritization | Define risk areas |

19-Agent Fleet

| Category | Agents | Primary Use |
| --- | --- | --- |
| Core Testing (5) | test-generator, test-executor, coverage-analyzer, quality-gate, quality-analyzer | Daily testing |
| Performance/Security (2) | performance-tester, security-scanner | Non-functional |
| Strategic (3) | requirements-validator, production-intelligence, fleet-commander | Planning |
| Advanced (4) | regression-risk-analyzer, test-data-architect, api-contract-validator, flaky-test-hunter | Specialized |
| Visual/Chaos (2) | visual-tester, chaos-engineer | Edge cases |
| Deployment (1) | deployment-readiness | Release |
| Analysis (1) | code-complexity | Maintainability |

Coordination Patterns

Hierarchical: fleet-commander → [generators] → [executors] → quality-gate
Mesh: test-gen ↔ coverage ↔ quality (peer decisions)
Sequential: risk-analyzer → test-gen → executor → coverage → gate
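
To make the difference between the sequential and mesh patterns concrete, here is a minimal sketch. The helper names (`nextInSequence`, `meshPeers`) are hypothetical and not part of any agentic-qe API; only the agent short names are taken from the patterns above.

```typescript
// Illustrative only: these helpers are assumptions, not agentic-qe functions.
const SEQUENTIAL: string[] = ["risk-analyzer", "test-gen", "executor", "coverage", "gate"];
const MESH: string[] = ["test-gen", "coverage", "quality"];

// Sequential: each agent hands off to exactly one successor (none at the end).
function nextInSequence(agent: string): string | null {
  const index = SEQUENTIAL.indexOf(agent);
  if (index === -1) throw new Error(`unknown agent: ${agent}`);
  return index + 1 < SEQUENTIAL.length ? SEQUENTIAL[index + 1] : null;
}

// Mesh: every agent can exchange results with every other peer.
function meshPeers(agent: string): string[] {
  if (MESH.indexOf(agent) === -1) throw new Error(`unknown agent: ${agent}`);
  return MESH.filter((peer) => peer !== agent);
}
```

In the sequential case a failure anywhere stops the chain, while in the mesh case any peer can react to another peer's result, which is why the source labels mesh decisions as "peer decisions".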

Success Criteria

✅ 10x deployment frequency with same/better quality
✅ Coverage gaps detected in real-time
✅ Bugs caught pre-production
❌ Agents acting without human oversight on critical decisions
❌ Deploying all 19 agents at once (start with 1-2)

Core Concepts

QE Evolution

| Stage | Approach | Limitation |
| --- | --- | --- |
| Traditional | Manual everything | Human bottleneck |
| Automation | Scripts + fixed scenarios | Needs orchestration |
| Agentic | AI agents + human judgment | Requires trust-building |

Core Premise: Agents amplify human expertise for 10x scale.

Key Capabilities

1. Intelligent Test Generation
typescript
// Agent analyzes code change, generates targeted tests
const tests = await qeTestGenerator.generate(prDiff);
// → Happy path, edge cases, error handling tests
2. Pattern Detection - Scan logs, find anomalies, correlate errors
3. Adaptive Strategy - Adjust test focus based on risk signals
4. Root Cause Analysis - Link failures to code changes, suggest fixes

Agent Coordination

Memory Namespaces

aqe/test-plan/*     - Test planning decisions
aqe/coverage/*      - Coverage analysis results
aqe/quality/*       - Quality metrics and gates
aqe/learning/*      - Patterns and Q-values
aqe/coordination/*  - Cross-agent state
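
One way to keep keys inside these namespaces is a small key builder. The sketch below is an assumption for illustration: `memoryKey` and `namespaceOf` are hypothetical helpers, and only the five namespace strings come from the list above.

```typescript
// Hypothetical helpers; the namespace strings are the ones listed above.
const NAMESPACES = [
  "aqe/test-plan",
  "aqe/coverage",
  "aqe/quality",
  "aqe/learning",
  "aqe/coordination",
] as const;
type Namespace = (typeof NAMESPACES)[number];

// Build "namespace/part1/part2" and reject empty or slash-containing parts.
function memoryKey(namespace: Namespace, ...parts: string[]): string {
  if (parts.length === 0) throw new Error("at least one key part is required");
  for (const part of parts) {
    if (part === "" || part.indexOf("/") !== -1) {
      throw new Error(`invalid key part: "${part}"`);
    }
  }
  return [namespace as string].concat(parts).join("/");
}

// Return the namespace a key belongs to, or null if it is outside aqe/*.
function namespaceOf(key: string): Namespace | null {
  for (const ns of NAMESPACES) {
    if (key.indexOf(ns + "/") === 0) return ns;
  }
  return null;
}
```

For example, `memoryKey("aqe/test-plan", "pr-123")` yields the same key shape used in the store examples in this document.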

Memory Operations (MCP Tools)

CRITICAL: Always use mcp__agentic-qe__memory_store with persist: true for learnings.
1. Store data to persistent memory:
javascript
// Store test plan decisions (persisted to .agentic-qe/memory.db)
mcp__agentic-qe__memory_store({
  key: "aqe/test-plan/pr-123",
  namespace: "aqe/test-plan",
  value: {
    prNumber: 123,
    riskLevel: "medium",
    requiredCoverage: 85,
    testTypes: ["unit", "integration"],
    estimatedTime: 1800
  },
  persist: true,  // ⚠️ REQUIRED for cross-session persistence
  ttl: 604800     // 7 days (0 = permanent)
})
2. Retrieve prior learnings before task:
javascript
// Query patterns before starting test generation
const priorData = await mcp__agentic-qe__memory_retrieve({
  key: "aqe/learning/patterns/test-generation/*",
  namespace: "aqe/learning",
  includeMetadata: true
})

// Use patterns to guide current task
if (priorData.success) {
  console.log(`Loaded ${priorData.patterns.length} prior patterns`);
}
3. Store coverage analysis results:
javascript
mcp__agentic-qe__memory_store({
  key: "aqe/coverage/auth-module",
  namespace: "aqe/coverage",
  value: {
    moduleId: "auth-module",
    currentCoverage: 78,
    gaps: ["error-handling", "edge-cases"],
    suggestedTests: 12,
    priority: "high"
  },
  persist: true,
  ttl: 1209600  // 14 days
})
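
The `ttl` values above are plain seconds (604800 for 7 days, 1209600 for 14 days). A small conversion helper avoids hand-computed constants; `ttlFromDays` is a hypothetical name used here for illustration, not a tool parameter or library function.

```typescript
// Hypothetical helper: converts a retention period in days to a ttl in seconds.
const SECONDS_PER_DAY = 24 * 60 * 60; // 86400

function ttlFromDays(days: number): number {
  if (!isFinite(days) || days < 0) {
    throw new RangeError(`days must be a non-negative number, got ${days}`);
  }
  // 0 stays 0, which the store example above documents as "permanent".
  return Math.round(days * SECONDS_PER_DAY);
}
```

With this, `ttl: ttlFromDays(7)` and `ttl: ttlFromDays(14)` reproduce the two values used in the examples.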

Three-Phase Memory Protocol

For coordinated multi-agent tasks, use the STATUS → PROGRESS → COMPLETE pattern:
javascript
// PHASE 1: STATUS - Task starting
mcp__agentic-qe__memory_store({
  key: "aqe/coordination/task-123/status",
  namespace: "aqe/coordination",
  value: { status: "running", agent: "qe-test-generator", startTime: Date.now() },
  persist: true
})

// PHASE 2: PROGRESS - Intermediate updates
mcp__agentic-qe__memory_store({
  key: "aqe/coordination/task-123/progress",
  namespace: "aqe/coordination",
  value: { progress: 50, action: "generating-unit-tests", testsGenerated: 25 },
  persist: true
})

// PHASE 3: COMPLETE - Task finished
mcp__agentic-qe__memory_store({
  key: "aqe/coordination/task-123/complete",
  namespace: "aqe/coordination",
  value: {
    status: "complete",
    result: "success",
    testsGenerated: 47,
    coverageAchieved: 92.3,
    duration: 15000
  },
  persist: true
})
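
The three calls above share the same key layout and namespace, so the arguments can be generated rather than hand-written. The sketch below is illustrative: `phaseEntry` and `canTransition` are hypothetical helpers, and the rule that a task may go straight from STATUS to COMPLETE (and may post PROGRESS more than once) is an assumption, not something the source specifies.

```typescript
// Hypothetical helpers for the STATUS → PROGRESS → COMPLETE pattern.
type Phase = "status" | "progress" | "complete";

interface StoreArgs {
  key: string;
  namespace: string;
  value: Record<string, unknown>;
  persist: boolean;
}

// Build the argument object for one phase of a coordinated task.
function phaseEntry(taskId: string, phase: Phase, value: Record<string, unknown>): StoreArgs {
  return {
    key: `aqe/coordination/${taskId}/${phase}`,
    namespace: "aqe/coordination",
    value,
    persist: true,
  };
}

// Assumed ordering rule: start with status, allow repeated progress, end with complete.
function canTransition(from: Phase | null, to: Phase): boolean {
  if (from === null) return to === "status";
  if (from === "complete") return false;
  return to === "progress" || to === "complete";
}
```

The object returned by `phaseEntry("task-123", "status", {...})` has the same shape as the argument passed to the store call in Phase 1 above.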

Blackboard Events

| Event | Trigger | Subscribers |
| --- | --- | --- |
| test:generated | New tests created | executor, coverage |
| coverage:gap | Gap detected | test-generator |
| quality:decision | Gate evaluated | fleet-commander |
| security:finding | Vulnerability found | quality-gate |
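
The event table can be read as a publish/subscribe routing map. The following is a minimal in-memory sketch of that idea; the `Blackboard` class is a hypothetical stand-in written for this example and does not reflect how agentic-qe implements its blackboard. Only the event names and subscriber lists come from the table.

```typescript
// Illustrative in-memory blackboard; not the agentic-qe implementation.
type EventName = "test:generated" | "coverage:gap" | "quality:decision" | "security:finding";
type Handler = (payload: unknown) => void;

// Event → subscriber mapping copied from the table above.
const SUBSCRIBERS: Record<EventName, string[]> = {
  "test:generated": ["executor", "coverage"],
  "coverage:gap": ["test-generator"],
  "quality:decision": ["fleet-commander"],
  "security:finding": ["quality-gate"],
};

class Blackboard {
  private handlers: Record<string, Handler[]> = {};

  subscribe(event: EventName, handler: Handler): void {
    (this.handlers[event] = this.handlers[event] || []).push(handler);
  }

  // Deliver the payload to every handler and return how many were notified.
  publish(event: EventName, payload: unknown): number {
    const list = this.handlers[event] || [];
    for (const handler of list) handler(payload);
    return list.length;
  }
}

// Wire one recording handler per subscriber, then publish a single event.
const board = new Blackboard();
const delivered: string[] = [];
for (const event of Object.keys(SUBSCRIBERS) as EventName[]) {
  for (const agent of SUBSCRIBERS[event]) {
    board.subscribe(event, () => { delivered.push(`${agent} <- ${event}`); });
  }
}
const notified = board.publish("test:generated", { count: 3 });
```

Publishing `test:generated` reaches only the two agents subscribed to it; the other subscribers stay idle until their own event fires.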

Example: PR Quality Pipeline

typescript
// 1. Risk analysis
const risks = await Task("Analyze PR", prDiff, "qe-regression-risk-analyzer");

// 2. Generate tests for risks
const tests = await Task("Generate tests", risks, "qe-test-generator");

// 3. Execute + analyze
const results = await Task("Run tests", tests, "qe-test-executor");
const coverage = await Task("Check coverage", results, "qe-coverage-analyzer");

// 4. Quality decision
const decision = await Task("Evaluate", {results, coverage}, "qe-quality-gate");
// → GO/NO-GO with rationale
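
What a "GO/NO-GO with rationale" result might look like can be sketched with a simple rule-based gate. This is an assumption for illustration: the real qe-quality-gate agent's inputs and logic are not described in this document, and `evaluateGate`, `GateInput`, and `GateDecision` are hypothetical names.

```typescript
// Hypothetical rule-based gate; the actual qe-quality-gate logic is not documented here.
interface GateInput {
  failedTests: number; // number of failing tests in the run
  coverage: number;    // achieved coverage in percent
}

interface GateDecision {
  decision: "GO" | "NO-GO";
  rationale: string[];
}

function evaluateGate(input: GateInput, requiredCoverage: number): GateDecision {
  const rationale: string[] = [];
  if (input.failedTests > 0) {
    rationale.push(`${input.failedTests} test(s) failed`);
  }
  if (input.coverage < requiredCoverage) {
    rationale.push(`coverage ${input.coverage}% is below required ${requiredCoverage}%`);
  }
  if (rationale.length === 0) {
    return { decision: "GO", rationale: ["all tests passed and coverage meets the threshold"] };
  }
  return { decision: "NO-GO", rationale };
}
```

Returning every failed rule, not just the first, gives the human reviewer the full picture when a critical decision needs their sign-off.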

Implementation Phases

| Phase | Duration | Goal | Agent(s) |
| --- | --- | --- | --- |
| Experiment | Weeks 1-4 | Validate one use case | 1 agent |
| Integrate | Months 2-3 | CI/CD pipeline | 3-4 agents |
| Scale | Months 4-6 | Multiple use cases | 8+ agents |
| Evolve | Ongoing | Continuous learning | Full fleet |

Phase 1 Example

bash
# Week 1: Deploy single agent
aqe agent spawn qe-test-generator

# Weeks 2-3: Generate tests for 10 PRs
# Track: bugs found, test quality, review time

# Week 4: Measure impact
aqe agent metrics qe-test-generator
# → Tests: 150, Bugs: 12, Time saved: 8h

---

Limitations & Strengths

Agents Excel At

  • Volume: Scan thousands of logs in seconds
  • Patterns: Find correlations humans miss
  • Tireless: 24/7 testing and monitoring
  • Speed: Instant code change analysis

Agents Need Humans For

  • Business context and priorities
  • Ethical judgment and trade-offs
  • Creative exploration ("what if" scenarios)
  • Domain expertise (healthcare, finance, legal)

Best Practices

| Do | Don't |
| --- | --- |
| Start with one agent, one use case | Deploy the whole fleet at once |
| Build feedback loops early | Deploy and forget |
| Have a human review agent output | Auto-merge without review |
| Measure bugs caught, time saved | Track vanity metrics (test count) |
| Build trust gradually | Give full autonomy immediately |

Trust Progression

Month 1: Agent suggests → Human decides
Month 2: Agent acts → Human reviews after
Month 3: Agent autonomous on low-risk
Month 4: Agent handles critical with oversight
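
The progression above can be expressed as a policy function that tells an agent how much autonomy it has for a given task. This is a sketch under assumptions: `trustMode` and its mode names are hypothetical, and the behavior for critical tasks in month 3 (falling back to act-then-review) is a guess, since the source only covers low-risk work for that month.

```typescript
// Hypothetical policy derived from the month-by-month progression above.
type Risk = "low" | "critical";
type Mode = "suggest-only" | "act-then-review" | "autonomous" | "act-with-oversight";

function trustMode(month: number, risk: Risk): Mode {
  if (month < 1) throw new RangeError(`month must be >= 1, got ${month}`);
  if (month === 1) return "suggest-only";    // Month 1: agent suggests, human decides
  if (month === 2) return "act-then-review"; // Month 2: agent acts, human reviews after
  if (risk === "low") return "autonomous";   // Month 3+: autonomous on low-risk work
  // Assumption: critical work stays in review mode until month 4.
  return month >= 4 ? "act-with-oversight" : "act-then-review";
}
```

Keeping the policy in one function makes it easy to slow the progression down if the measured results do not yet justify more autonomy.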

Agent Coordination Hints

yaml
coordination:
  topology: hierarchical
  commander: qe-fleet-commander
  memory_namespace: aqe/coordination
  blackboard_topic: qe-fleet

preload_skills:
  - agentic-quality-engineering  # Always (this skill)
  - risk-based-testing           # For prioritization
  - quality-metrics              # For measurement

agent_assignments:
  qe-test-generator: [api-testing-patterns, tdd-london-chicago]
  qe-coverage-analyzer: [quality-metrics, risk-based-testing]
  qe-security-scanner: [security-testing, risk-based-testing]
  qe-performance-tester: [performance-testing]
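
One plausible reading of these hints is that each agent loads the `preload_skills` first and then its own `agent_assignments`. The sketch below encodes that reading; it is an assumption about how the hints are consumed, and `skillsFor` and the `CoordinationHints` type are hypothetical names.

```typescript
// Hypothetical model of the YAML above; the merge semantics are an assumption.
interface CoordinationHints {
  preloadSkills: string[];
  agentAssignments: Record<string, string[]>;
}

const hints: CoordinationHints = {
  preloadSkills: ["agentic-quality-engineering", "risk-based-testing", "quality-metrics"],
  agentAssignments: {
    "qe-test-generator": ["api-testing-patterns", "tdd-london-chicago"],
    "qe-coverage-analyzer": ["quality-metrics", "risk-based-testing"],
    "qe-security-scanner": ["security-testing", "risk-based-testing"],
    "qe-performance-tester": ["performance-testing"],
  },
};

// Preloaded skills first, then agent-specific ones, without duplicates.
function skillsFor(agent: string, h: CoordinationHints): string[] {
  const assigned = h.agentAssignments[agent] || [];
  const merged: string[] = [];
  for (const skill of h.preloadSkills.concat(assigned)) {
    if (merged.indexOf(skill) === -1) merged.push(skill);
  }
  return merged;
}
```

Under this reading, qe-coverage-analyzer ends up with only the three preloaded skills, because both of its assigned skills are already preloaded.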

Related Skills

  • holistic-testing-pact - PACT principles deep dive
  • risk-based-testing - Prioritize agent focus
  • quality-metrics - Measure agent effectiveness
  • api-testing-patterns, security-testing, performance-testing - Specialized testing

Resources

  • Agent definitions: .claude/agents/
  • CLI: aqe agent --help
  • Fleet status: aqe fleet status

Success Metric: Deploy 10x more frequently with same or better quality through intelligent agent collaboration.