diagnose
MANDATORY PREPARATION
Invoke {{command_prefix}}agent-workflow — it contains workflow principles, anti-patterns, and the Context Gathering Protocol. Follow the protocol before proceeding — if no workflow context exists yet, you MUST run {{command_prefix}}teach-maestro first.
Perform a systematic diagnostic scan across 5 dimensions. For each dimension, score 1-5 and provide specific findings.
Dimension 1: Prompt Quality (1-5)
Evaluate:
- Structure (4-zone pattern: role, context, instructions, output)
- Output schema definition (explicit vs. implicit)
- Instruction clarity (specific vs. vague)
- Edge case handling (addressed vs. ignored)
- Anti-patterns present (wall of text, contradictions, implicit format)
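The 4-zone pattern above can be sketched in code; the zone delimiters and example contents here are illustrative assumptions, not a prescribed template:

```python
# Minimal sketch of the 4-zone prompt pattern (role, context, instructions, output).
# Zone headings and example wording are hypothetical, not required phrasing.
def build_prompt(role: str, context: str, instructions: str, output_schema: str) -> str:
    """Assemble a prompt with each zone explicitly delimited."""
    return "\n\n".join([
        f"# Role\n{role}",
        f"# Context\n{context}",
        f"# Instructions\n{instructions}",
        f"# Output\n{output_schema}",  # an explicit schema beats an implicit format
    ])

prompt = build_prompt(
    role="You are a release-notes summarizer.",
    context="Changelog entries since the last release are provided below.",
    instructions="Summarize user-facing changes; ignore internal refactors.",
    output_schema='Return JSON: {"highlights": [string], "breaking": [string]}',
)
```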
Dimension 2: Context Efficiency (1-5)
Evaluate:
- Context budget allocation (planned vs. ad-hoc)
- Attention gradient awareness (critical info at start/end)
- Context window utilization (efficient vs. wasteful)
- State management (explicit vs. implicit)
- Memory strategy (appropriate for conversation length)
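A planned context budget might be sketched like this; the zone names and token allocations are assumptions for illustration:

```python
# Sketch of a planned context budget. Critical zones sit at the start and end
# of the window to respect the attention gradient; bulk material goes in the middle.
CONTEXT_BUDGET = {
    "system_prompt": 1_000,    # start of window: high attention
    "retrieved_docs": 4_000,   # middle: bulk reference material
    "history_summary": 2_000,  # middle: compressed earlier turns
    "current_task": 1_000,     # end of window: high attention
}

def within_budget(used: dict, budget: dict = CONTEXT_BUDGET) -> bool:
    """True if every zone stays under its planned allocation."""
    return all(used.get(zone, 0) <= cap for zone, cap in budget.items())
```

A planned budget makes overruns detectable (`within_budget({"retrieved_docs": 5_000})` fails) instead of silently crowding out the task statement.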
Dimension 3: Tool Health (1-5)
Evaluate:
- Tool count (3-7 ideal, 13+ problematic)
- Description quality (specific vs. vague)
- Error handling (graceful vs. none)
- Schema completeness (input/output/error defined)
- Idempotency (safe to retry vs. side-effect prone)
- Scope attribution: Distinguish between project-configured tools (e.g., custom scripts, project MCP servers) and agent-level tools (e.g., built-in IDE tools, global MCP servers). Only flag tool overhead for tools the project can actually control
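One way to sketch these checks, assuming a hypothetical dictionary-based tool definition — the field names are illustrative, not any specific framework's API:

```python
# Sketch of a tool definition with the health properties above made explicit.
SEARCH_TOOL = {
    "name": "search_orders",
    "description": "Look up orders by customer email. Returns at most 20 matches.",
    "input_schema": {"type": "object",
                     "properties": {"email": {"type": "string"}},
                     "required": ["email"]},
    "output_schema": {"type": "array", "items": {"type": "object"}},
    "error_schema": {"type": "object",
                     "properties": {"code": {"type": "string"},
                                    "message": {"type": "string"}}},
    "idempotent": True,  # read-only: safe to retry on transient failure
}

def tool_health_flags(tool: dict) -> list[str]:
    """Return findings a diagnostic pass might raise for one tool."""
    flags = []
    if len(tool.get("description", "")) < 20:
        flags.append("vague description")
    for schema in ("input_schema", "output_schema", "error_schema"):
        if schema not in tool:
            flags.append(f"missing {schema}")
    if not tool.get("idempotent", False):
        flags.append("side-effect prone: retries need guarding")
    return flags
```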
Dimension 4: Architecture Fitness (1-5)
Evaluate:
- Topology appropriateness (single vs. multi-agent justified)
- Agent boundaries (clear vs. overlapping)
- Handoff protocols (structured vs. ad-hoc)
- Observability (decisions logged vs. black box)
- Cost awareness (budgeted vs. unbounded)
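A structured handoff might be sketched as a serialized, logged payload; the field names here are assumptions, not a fixed protocol:

```python
# Sketch of a structured agent-to-agent handoff: explicit fields instead of
# ad-hoc free text, with a timestamp so each handoff is observable in logs.
import json
import time

def make_handoff(from_agent: str, to_agent: str, task: str, context: dict) -> str:
    """Serialize a handoff so the receiving agent gets explicit state."""
    payload = {
        "from": from_agent,
        "to": to_agent,
        "task": task,
        "context": context,           # only what the receiver needs, not full history
        "timestamp": time.time(),     # logged for observability, not a black box
    }
    return json.dumps(payload)
```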
Dimension 5: Safety & Reliability (1-5)
Evaluate:
- Input validation (present vs. absent)
- Output filtering (PII, content policy) — scope contextually: data flowing between a user's own frontend and backend (e.g., authenticated sessions, internal APIs) is lower risk than data exposed to external services or third-party APIs
- Cost controls (ceilings set vs. unbounded)
- Error recovery (fallbacks vs. crash)
- Evaluation strategy (golden tests vs. "it seems to work")
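A cost ceiling with graceful degradation can be sketched as follows; the ceiling value and the cached-fallback behavior are illustrative assumptions:

```python
# Sketch of a hard cost ceiling with a fallback path instead of a crash.
class BudgetExceeded(Exception):
    pass

class CostGuard:
    """Track spend against a hard ceiling instead of running unbounded."""
    def __init__(self, ceiling_usd: float):
        self.ceiling = ceiling_usd
        self.spent = 0.0

    def charge(self, cost_usd: float) -> None:
        if self.spent + cost_usd > self.ceiling:
            raise BudgetExceeded(f"would exceed ${self.ceiling:.2f} ceiling")
        self.spent += cost_usd

def run_with_fallback(guard: CostGuard, step_cost: float) -> str:
    """Attempt a paid step; degrade to a cached answer rather than crash."""
    try:
        guard.charge(step_cost)
        return "fresh_result"
    except BudgetExceeded:
        return "cached_fallback"
```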
Diagnostic Report Format
```text
╔══════════════════════════════════════╗
║        MAESTRO DIAGNOSTIC            ║
╠══════════════════════════════════════╣
║ Prompt Quality        ████░ 4/5      ║
║ Context Efficiency    ███░░ 3/5      ║
║ Tool Health           ██░░░ 2/5      ║
║ Architecture          ████░ 4/5      ║
║ Safety & Reliability  ██░░░ 2/5      ║
╠══════════════════════════════════════╣
║ Overall Score: 15/25                 ║
╚══════════════════════════════════════╝

CRITICAL FINDINGS:
1. [Most severe issue — immediate action needed]
2. [Second most severe]
3. [Third]

RECOMMENDED ACTIONS:
1. Run /fortify to add error handling (addresses Tool Health + Safety)
2. Run /streamline to reduce tool count (addresses Tool Health)
3. Run /refine for prompt structure improvements (addresses Prompt Quality)
```
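The score bars and overall total in the report can be produced with a small sketch like this:

```python
# Sketch of rendering the diagnostic report's score bars and overall total.
def bar(score: int, out_of: int = 5) -> str:
    """Render a filled/empty block bar, e.g. 4/5 -> '████░'."""
    return "█" * score + "░" * (out_of - score)

def overall(scores: dict) -> int:
    """Overall score is the plain sum across the five dimensions."""
    return sum(scores.values())

scores = {"Prompt Quality": 4, "Context Efficiency": 3, "Tool Health": 2,
          "Architecture": 4, "Safety & Reliability": 2}
for name, s in scores.items():
    print(f"║ {name:<22}{bar(s)} {s}/5 ║")
print(f"║ Overall Score: {overall(scores)}/25 ║")
```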
Maestro Command Mapping
Every recommended action MUST reference the specific Maestro command that addresses it. Use this mapping:
| Dimension Gap | Maestro Command | When to Recommend |
|---|---|---|
| Prompt structure, clarity, output schema | | Score ≤ 4 on Prompt Quality |
| Context budget, attention gradient, memory | | Score ≤ 3 on Context Efficiency |
| Tool errors, missing tools, redundant tools | | Score ≤ 3 on Tool Health |
| Tool count reduction, unused tools | | Tool count > 7 or unused tools found |
| Safety gaps, error recovery, validation | | Score ≤ 3 on Safety & Reliability |
| Test coverage, golden tests, evaluation | | No automated tests or evaluation strategy |
| Architecture boundaries, observability | | Score ≤ 3 on Architecture Fitness |
Do NOT give generic manual actions (e.g., "Add Vitest", "Create a rollback script") without also specifying which Maestro command the user should run to implement it. The recommended action format is:
Run {{command_prefix}}<command> to [specific action] (addresses [Dimension] #[gap number])
Scoring Guide
| Score | Meaning | Maestro Action |
|---|---|---|
| 5 | Production-excellent | No action needed |
| 4 | Good with minor gaps | |
| 3 | Functional but risky | |
| 2 | Significant issues | |
| 1 | Broken or missing | |
Diagnostic Checklist
- All 5 dimensions scored with specific evidence
- Critical findings listed in priority order
- Each finding includes specific file/component location
- Recommended actions reference specific Maestro commands (see Command Mapping above)
- Overall score calculated and report generated
Recommended Next Step
After diagnosis, run the command mapped to your lowest-scoring dimension. For a general improvement sequence: {{command_prefix}}fortify → {{command_prefix}}streamline → {{command_prefix}}refine.

NEVER:
- Give all 5s unless the workflow is genuinely production-excellent
- Skip dimensions — score all 5 even if some seem fine
- Diagnose without reading the actual workflow code/config
- Recommend changes without specific findings to support them
- Give generic manual actions without mapping them to a Maestro command