diagnose

MANDATORY PREPARATION

Invoke {{command_prefix}}agent-workflow — it contains workflow principles, anti-patterns, and the Context Gathering Protocol. Follow the protocol before proceeding — if no workflow context exists yet, you MUST run {{command_prefix}}teach-maestro first.

Perform a systematic diagnostic scan across 5 dimensions. For each dimension, score 1-5 and provide specific findings.

Dimension 1: Prompt Quality (1-5)

Evaluate:
  • Structure (4-zone pattern: role, context, instructions, output)
  • Output schema definition (explicit vs. implicit)
  • Instruction clarity (specific vs. vague)
  • Edge case handling (addressed vs. ignored)
  • Anti-patterns present (wall of text, contradictions, implicit format)

Dimension 2: Context Efficiency (1-5)

Evaluate:
  • Context budget allocation (planned vs. ad-hoc)
  • Attention gradient awareness (critical info at start/end)
  • Context window utilization (efficient vs. wasteful)
  • State management (explicit vs. implicit)
  • Memory strategy (appropriate for conversation length)

Dimension 3: Tool Health (1-5)

Evaluate:
  • Tool count (3-7 ideal, 13+ problematic)
  • Description quality (specific vs. vague)
  • Error handling (graceful vs. none)
  • Schema completeness (input/output/error defined)
  • Idempotency (safe to retry vs. side-effect prone)
  • Scope attribution: Distinguish between project-configured tools (e.g., custom scripts, project MCP servers) and agent-level tools (e.g., built-in IDE tools, global MCP servers). Only flag tool overhead for tools the project can actually control
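The tool-count thresholds and the scope-attribution rule above can be sketched in a few lines. This is only an illustration: the `Tool` record, the `"project"` / `"agent"` scope labels, and the `"review"` bucket for counts the text does not grade (fewer than 3, or 8 to 12) are assumptions made for this example, not part of Maestro.

```python
from dataclasses import dataclass


@dataclass
class Tool:
    name: str
    scope: str  # hypothetical labels: "project" or "agent"


def classify_tool_count(tools: list[Tool]) -> tuple[int, str]:
    """Count only project-configured tools, then bucket the count."""
    # Agent-level tools are outside the project's control, so they are ignored.
    project_count = sum(1 for tool in tools if tool.scope == "project")
    if 3 <= project_count <= 7:
        verdict = "ideal"
    elif project_count >= 13:
        verdict = "problematic"
    else:
        # Counts below 3 or between 8 and 12 are not graded in the text;
        # "review" is a placeholder chosen for this sketch.
        verdict = "review"
    return project_count, verdict


tools = [
    Tool("deploy-script", "project"),
    Tool("project-mcp-server", "project"),
    Tool("lint-script", "project"),
    Tool("ide-search", "agent"),
    Tool("global-mcp-server", "agent"),
]
print(classify_tool_count(tools))  # (3, 'ideal')
```

Filtering by scope before counting keeps the score from penalizing a project for tools it did not configure.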

Dimension 4: Architecture Fitness (1-5)

Evaluate:
  • Topology appropriateness (single vs. multi-agent justified)
  • Agent boundaries (clear vs. overlapping)
  • Handoff protocols (structured vs. ad-hoc)
  • Observability (decisions logged vs. black box)
  • Cost awareness (budgeted vs. unbounded)

Dimension 5: Safety & Reliability (1-5)

Evaluate:
  • Input validation (present vs. absent)
  • Output filtering (PII, content policy) — scope contextually: data flowing between a user's own frontend and backend (e.g., authenticated sessions, internal APIs) is lower risk than data exposed to external services or third-party APIs
  • Cost controls (ceilings set vs. unbounded)
  • Error recovery (fallbacks vs. crash)
  • Evaluation strategy (golden tests vs. "it seems to work")

Diagnostic Report Format

╔══════════════════════════════════════╗
║          MAESTRO DIAGNOSTIC          ║
╠══════════════════════════════════════╣
║ Prompt Quality       ████░  4/5      ║
║ Context Efficiency   ███░░  3/5      ║
║ Tool Health          ██░░░  2/5      ║
║ Architecture         ████░  4/5      ║
║ Safety & Reliability ██░░░  2/5      ║
╠══════════════════════════════════════╣
║ Overall Score:       15/25           ║
╚══════════════════════════════════════╝

CRITICAL FINDINGS:
1. [Most severe issue — immediate action needed]
2. [Second most severe]
3. [Third]

RECOMMENDED ACTIONS:
1. Run /fortify to add error handling (addresses Tool Health + Safety)
2. Run /streamline to reduce tool count (addresses Tool Health)
3. Run /refine for prompt structure improvements (addresses Prompt Quality)
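As a rough sketch of how the score rows and the overall total in this report could be produced, the snippet below renders one line per dimension and sums the scores. The function names, the 21-character label column, and the omission of the box border are choices made for this example only.

```python
def render_bar(score: int, width: int = 5) -> str:
    """Return a bar such as '████░' for a score of 4 out of 5."""
    return "█" * score + "░" * (width - score)


def render_scores(scores: dict[str, int]) -> str:
    """Render one row per dimension plus the overall total."""
    lines = []
    for name, score in scores.items():
        if not 1 <= score <= 5:
            raise ValueError(f"{name}: score must be between 1 and 5, got {score}")
        lines.append(f"{name:<21}{render_bar(score)}  {score}/5")
    total = sum(scores.values())
    lines.append(f"{'Overall Score:':<21}{total}/{5 * len(scores)}")
    return "\n".join(lines)


scores = {
    "Prompt Quality": 4,
    "Context Efficiency": 3,
    "Tool Health": 2,
    "Architecture": 4,
    "Safety & Reliability": 2,
}
print(render_scores(scores))
```

With the example scores from the report, the last line reads `Overall Score:` followed by `15/25`, which matches the total shown in the box.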

Maestro Command Mapping

Every recommended action MUST reference the specific Maestro command that addresses it. Use this mapping:
| Dimension Gap | Maestro Command | When to Recommend |
|---|---|---|
| Prompt structure, clarity, output schema | {{command_prefix}}refine | Score ≤ 4 on Prompt Quality |
| Context budget, attention gradient, memory | {{command_prefix}}streamline | Score ≤ 3 on Context Efficiency |
| Tool errors, missing tools, redundant tools | {{command_prefix}}fortify | Score ≤ 3 on Tool Health |
| Tool count reduction, unused tools | {{command_prefix}}streamline | Tool count > 7 or unused tools found |
| Safety gaps, error recovery, validation | {{command_prefix}}fortify | Score ≤ 3 on Safety & Reliability |
| Test coverage, golden tests, evaluation | {{command_prefix}}guard | No automated tests or evaluation strategy |
| Architecture boundaries, observability | {{command_prefix}}calibrate | Score ≤ 3 on Architecture Fitness |
Do NOT give generic manual actions (e.g., "Add Vitest", "Create a rollback script") without also specifying which Maestro command the user should run to implement it. The recommended action format is:
Run {{command_prefix}}<command> to [specific action] (addresses [Dimension] #[gap number])
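The mapping rules above can be read as a small decision function. The sketch below is one possible encoding under stated assumptions: the function name, its parameters, the dictionary keys, and the default `"/"` prefix (borrowed from the sample report, standing in for `{{command_prefix}}`) are hypothetical and not a Maestro API.

```python
def recommend_commands(
    scores: dict[str, int],
    tool_count: int,
    unused_tools: bool,
    has_tests: bool,
    prefix: str = "/",  # assumption: stands in for {{command_prefix}}
) -> list[str]:
    """Return the commands triggered by the mapping rules, without duplicates."""
    recommended: list[str] = []

    def add(command: str) -> None:
        full = prefix + command
        if full not in recommended:
            recommended.append(full)

    if scores["Prompt Quality"] <= 4:
        add("refine")
    if scores["Context Efficiency"] <= 3:
        add("streamline")
    if scores["Tool Health"] <= 3:
        add("fortify")
    if tool_count > 7 or unused_tools:
        add("streamline")
    if scores["Safety & Reliability"] <= 3:
        add("fortify")
    if not has_tests:
        add("guard")
    if scores["Architecture Fitness"] <= 3:
        add("calibrate")
    return recommended


scores = {
    "Prompt Quality": 4,
    "Context Efficiency": 3,
    "Tool Health": 2,
    "Architecture Fitness": 4,
    "Safety & Reliability": 2,
}
print(recommend_commands(scores, tool_count=9, unused_tools=False, has_tests=False))
# ['/refine', '/streamline', '/fortify', '/guard']
```

Each returned command would still need the specific action and the dimension it addresses filled in, following the recommended action format.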

Scoring Guide

| Score | Meaning | Maestro Action |
|---|---|---|
| 5 | Production-excellent | No action needed |
| 4 | Good with minor gaps | {{command_prefix}}refine for polish |
| 3 | Functional but risky | {{command_prefix}}fortify or {{command_prefix}}streamline for targeted fix |
| 2 | Significant issues | {{command_prefix}}fortify + {{command_prefix}}guard (immediate attention) |
| 1 | Broken or missing | {{command_prefix}}onboard-agent (rebuild required) |
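For illustration, the scoring guide can be held as a lookup table so that every score carries its meaning and suggested commands. The structure and names below are assumptions for this sketch; note that a plain list does not capture the difference between "or" (score 3) and "+" (score 2) in the guide.

```python
# Hypothetical encoding of the scoring guide: score -> (meaning, command names).
SCORING_GUIDE: dict[int, tuple[str, list[str]]] = {
    5: ("Production-excellent", []),
    4: ("Good with minor gaps", ["refine"]),
    3: ("Functional but risky", ["fortify", "streamline"]),  # either one
    2: ("Significant issues", ["fortify", "guard"]),          # both
    1: ("Broken or missing", ["onboard-agent"]),
}


def describe_score(score: int, prefix: str = "/") -> str:
    """Return a one-line summary for a score from 1 to 5."""
    if score not in SCORING_GUIDE:
        raise ValueError(f"score must be between 1 and 5, got {score}")
    meaning, commands = SCORING_GUIDE[score]
    if not commands:
        return f"{score}: {meaning} (no action needed)"
    joined = ", ".join(prefix + command for command in commands)
    return f"{score}: {meaning} ({joined})"


print(describe_score(2))  # 2: Significant issues (/fortify, /guard)
```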

Diagnostic Checklist

  • All 5 dimensions scored with specific evidence
  • Critical findings listed in priority order
  • Each finding includes specific file/component location
  • Recommended actions reference specific Maestro commands (see Command Mapping above)
  • Overall score calculated and report generated

Recommended Next Step

After diagnosis, run the command mapped to your lowest-scoring dimension. For a general improvement sequence, run {{command_prefix}}fortify, then {{command_prefix}}streamline, then {{command_prefix}}refine.
NEVER:
  • Give all 5s unless the workflow is genuinely production-excellent
  • Skip dimensions — score all 5 even if some seem fine
  • Diagnose without reading the actual workflow code/config
  • Recommend changes without specific findings to support them
  • Give generic manual actions without mapping them to a Maestro command