/assess backend/app/services/auth.py
/assess our caching strategy
/assess the current database schema
/assess frontend/src/components/Dashboard
AskUserQuestion(
questions=[{
"question": "What dimensions to assess?",
"header": "Dimensions",
"options": [
{"label": "Full assessment (Recommended)", "description": "All dimensions: quality, maintainability, security, performance"},
{"label": "Code quality only", "description": "Readability, complexity, best practices"},
{"label": "Security focus", "description": "Vulnerabilities, attack surface, compliance"},
{"label": "Quick score", "description": "Just give me a 0-10 score with brief notes"}
],
"multiSelect": false
}]
)
ORCHESTKIT_PREFER_TEAMS=1
| Aspect | Task Tool | Agent Teams |
|---|---|---|
| Score calibration | Lead normalizes independently | Assessors discuss disagreements |
| Cross-dimension findings | Lead correlates after completion | Security assessor alerts performance assessor of overlap |
| Cost | ~200K tokens | ~500K tokens |
| Best for | Quick scores, single dimension | Full multi-dimensional assessment |
Fallback: If Agent Teams encounters issues, fall back to the Task tool for the remaining assessment.
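
The mode choice and fallback described above could be sketched roughly as follows. This is only an illustration: the source does not say how `ORCHESTKIT_PREFER_TEAMS` is read, so `select_mode`, `run_assessment`, and the two runner callables are hypothetical names, and treating a `RuntimeError` as "Agent Teams encountered issues" is an assumption.

```python
import os

def select_mode(env=None) -> str:
    """Return 'agent_teams' when ORCHESTKIT_PREFER_TEAMS=1, else 'task_tool' (assumed semantics)."""
    env = os.environ if env is None else env
    return "agent_teams" if env.get("ORCHESTKIT_PREFER_TEAMS") == "1" else "task_tool"

def run_assessment(target, run_with_teams, run_with_task_tool, env=None):
    """Run the preferred mode; fall back to the Task tool if Agent Teams fails."""
    if select_mode(env) == "agent_teams":
        try:
            return run_with_teams(target)
        except RuntimeError:
            # Hypothetical failure signal: switch to the Task tool path.
            return run_with_task_tool(target)
    return run_with_task_tool(target)

def failing_teams(target):
    # Stand-in runner that simulates a failed team formation.
    raise RuntimeError("team formation failed")

def task_tool(target):
    # Stand-in runner for the Task tool path.
    return f"task-tool assessment of {target}"

result = run_assessment("auth.py", failing_teams, task_tool,
                        env={"ORCHESTKIT_PREFER_TEAMS": "1"})
print(result)
```

Passing `env` explicitly keeps the sketch testable without touching the real process environment.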
---
| Question | How It's Answered |
|---|---|
| "Is this good?" | Quality score 0-10 with reasoning |
| "What are the trade-offs?" | Structured pros/cons list |
| "Should we change this?" | Improvement suggestions with effort |
| "What are the alternatives?" | Comparison with scores |
| "Where should we focus?" | Prioritized recommendations |
| Phase | Activities | Output |
|---|---|---|
| 1. Target Understanding | Read code/design, identify scope | Context summary |
| 2. Quality Rating | 6-dimension scoring (0-10) | Scores with reasoning |
| 3. Pros/Cons Analysis | Strengths and weaknesses | Balanced evaluation |
| 4. Alternative Comparison | Score alternatives | Comparison matrix |
| 5. Improvement Suggestions | Actionable recommendations | Prioritized list |
| 6. Effort Estimation | Time and complexity estimates | Effort breakdown |
| 7. Assessment Report | Compile findings | Final report |
---
| Dimension | Weight | What It Measures |
|---|---|---|
| Correctness | 0.20 | Does it work correctly? |
| Maintainability | 0.20 | Easy to understand/modify? |
| Performance | 0.15 | Efficient, no bottlenecks? |
| Security | 0.15 | Follows best practices? |
| Scalability | 0.15 | Handles growth? |
| Testability | 0.15 | Easy to test? |
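
The weights above sum to 1.00, which suggests the composite score is a weighted average of the six dimension scores. A minimal sketch under that assumption; the `composite_score` helper and the sample scores are hypothetical and not part of the skill itself:

```python
# Weights copied from the dimension table above.
WEIGHTS = {
    "correctness": 0.20,
    "maintainability": 0.20,
    "performance": 0.15,
    "security": 0.15,
    "scalability": 0.15,
    "testability": 0.15,
}

def composite_score(scores: dict[str, float]) -> float:
    """Weighted average of 0-10 dimension scores (assumed formula), rounded to one decimal."""
    missing = WEIGHTS.keys() - scores.keys()
    if missing:
        raise ValueError(f"missing dimension scores: {sorted(missing)}")
    for name in WEIGHTS:
        if not 0 <= scores[name] <= 10:
            raise ValueError(f"{name} score must be between 0 and 10")
    total = sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)
    return round(total, 1)

# Illustrative scores only, not results from a real assessment.
example = {
    "correctness": 8,
    "maintainability": 8,
    "performance": 6,
    "security": 8,
    "scalability": 6,
    "testability": 8,
}
print(composite_score(example))
```

Rejecting missing or out-of-range dimensions keeps a partial assessment from silently producing a misleading composite.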
run_in_background=True
TeamCreate(team_name="assess-{target-slug}", description="Assess {target}")
Task(subagent_type="code-quality-reviewer", name="correctness-assessor",
team_name="assess-{target-slug}",
prompt="""Assess CORRECTNESS (0-10) and MAINTAINABILITY (0-10) for: {target}
When you find issues that affect security, message security-assessor.
When you find issues that affect performance, message perf-assessor.
Share your scores with all teammates for calibration — if scores diverge
significantly (>2 points), discuss the disagreement.""")
Task(subagent_type="security-auditor", name="security-assessor",
team_name="assess-{target-slug}",
prompt="""Assess SECURITY (0-10) for: {target}
When correctness-assessor flags security-relevant patterns, investigate deeper.
When you find performance-impacting security measures, message perf-assessor.
Share your score and flag any cross-dimension trade-offs.""")
Task(subagent_type="performance-engineer", name="perf-assessor",
team_name="assess-{target-slug}",
prompt="""Assess PERFORMANCE (0-10) and SCALABILITY (0-10) for: {target}
When security-assessor flags performance trade-offs, evaluate the impact.
When you find testability issues (hard-to-benchmark code), message test-assessor.
Share your scores with reasoning for the composite calculation.""")
Task(subagent_type="test-generator", name="test-assessor",
team_name="assess-{target-slug}",
prompt="""Assess TESTABILITY (0-10) for: {target}
Evaluate test coverage, test quality, and ease of testing.
When other assessors flag dimension-specific concerns, verify test coverage
for those areas. Share your score and any coverage gaps found.""")
SendMessage(type="shutdown_request", recipient="correctness-assessor", content="Assessment complete")
SendMessage(type="shutdown_request", recipient="security-assessor", content="Assessment complete")
SendMessage(type="shutdown_request", recipient="perf-assessor", content="Assessment complete")
SendMessage(type="shutdown_request", recipient="test-assessor", content="Assessment complete")
TeamDelete()
Fallback: If team formation fails, use the standard Phase 2 Task spawns above.
| # | Strength | Impact | Evidence |
|---|---|---|---|
| 1 | [strength] | High/Med/Low | [example] |
| # | Weakness | Severity | Evidence |
|---|---|---|---|
| 1 | [weakness] | High/Med/Low | [example] |
---
| Criteria | Current | Alternative A | Alternative B |
|---|---|---|---|
| Composite | [N.N] | [N.N] | [N.N] |
| Migration Effort | N/A | [1-5] | [1-5] |
| Suggestion | Effort (1-5) | Impact (1-5) | Priority (Impact/Effort) |
|---|---|---|---|
| [action] | [N] | [N] | [ratio] |
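
The priority column is the ratio of impact to effort, so high-impact, low-effort actions rank first. A minimal sketch of that ranking; the `Suggestion` class, the direction of the 1-5 scales, and the sample actions are all illustrative assumptions:

```python
from dataclasses import dataclass

@dataclass
class Suggestion:
    action: str
    effort: int  # 1 (least effort) to 5 (most effort), assumed direction
    impact: int  # 1 (least impact) to 5 (most impact), assumed direction

    @property
    def priority(self) -> float:
        """Impact divided by effort, as in the Priority column."""
        return self.impact / self.effort

def prioritize(suggestions: list[Suggestion]) -> list[Suggestion]:
    """Sort suggestions from highest to lowest priority ratio."""
    return sorted(suggestions, key=lambda s: s.priority, reverse=True)

# Hypothetical suggestions, not output from a real assessment.
samples = [
    Suggestion("Rewrite module", effort=5, impact=5),
    Suggestion("Add input validation", effort=2, impact=4),
    Suggestion("Rename variables", effort=1, impact=1),
]

for s in prioritize(samples):
    print(f"{s.action}: {s.priority:.1f}")
```

Because Python's sort is stable, suggestions with equal ratios keep their original relative order.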
| Timeframe | Tasks | Total |
|---|---|---|
| Quick wins (< 1hr) | [list] | X min |
| Short-term (< 1 day) | [list] | X hrs |
| Medium-term (1-3 days) | [list] | X days |
---
| Score | Grade | Verdict |
|---|---|---|
| 9.0-10.0 | A+ | EXCELLENT |
| 8.0-8.9 | A | GOOD |
| 7.0-7.9 | B | GOOD |
| 6.0-6.9 | C | ADEQUATE |
| 5.0-5.9 | D | NEEDS WORK |
| 0.0-4.9 | F | CRITICAL |
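
The grade bands above can be applied with a simple lookup. A minimal sketch, assuming each band includes its lower bound and extends up to the next one (so a score such as 8.95 falls in the A band); the `grade` function name is hypothetical:

```python
# (lower bound, grade, verdict), copied from the table above, highest band first.
GRADE_BANDS = [
    (9.0, "A+", "EXCELLENT"),
    (8.0, "A", "GOOD"),
    (7.0, "B", "GOOD"),
    (6.0, "C", "ADEQUATE"),
    (5.0, "D", "NEEDS WORK"),
    (0.0, "F", "CRITICAL"),
]

def grade(score: float) -> tuple[str, str]:
    """Map a 0-10 composite score to its (grade, verdict) pair."""
    if not 0.0 <= score <= 10.0:
        raise ValueError("score must be between 0 and 10")
    for lower, letter, verdict in GRADE_BANDS:
        if score >= lower:
            return letter, verdict
    raise AssertionError("unreachable: the 0.0 band matches every valid score")

print(grade(7.4))
```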
| Decision | Choice | Rationale |
|---|---|---|
| 6 dimensions | Comprehensive coverage | All quality aspects without overwhelming |
| 0-10 scale | Industry standard | Easy to understand and compare |
| Parallel assessment | 6 agents | Fast, thorough evaluation |
| Effort/Impact scoring | 1-5 scale | Simple prioritization math |
assess-complexity
verify
code-review-playbook
quality-gates