Agentic Engineering
Use this skill for engineering workflows where AI agents perform most implementation work and humans enforce quality and risk controls.
Operating Principles
Define completion criteria before execution.
Decompose work into agent-sized units.
Route model tiers by task complexity.
Measure with evals and regression checks.
Eval-First Loop
Define capability eval and regression eval.
Run baseline and capture failure signatures.
Execute implementation.
Re-run evals and compare deltas.
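The loop above can be sketched as a comparison between a baseline run and a post-implementation run. This is a minimal illustration, not a prescribed harness: `EvalResult`, `compare_runs`, and the eval case names are hypothetical, and the results are hard-coded where a real setup would call an eval runner.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EvalResult:
    name: str            # eval case identifier (hypothetical naming scheme)
    passed: bool
    signature: str = ""  # short failure signature, empty when the case passed


def compare_runs(baseline, candidate):
    """Classify each eval case by how it changed between two runs."""
    before = {r.name: r for r in baseline}
    delta = {"fixed": [], "regressed": [], "still_failing": []}
    for result in candidate:
        prior = before.get(result.name)
        if prior is None:
            continue  # new case with no baseline to compare against
        if not prior.passed and result.passed:
            delta["fixed"].append(result.name)
        elif prior.passed and not result.passed:
            delta["regressed"].append(result.name)
        elif not prior.passed and not result.passed:
            delta["still_failing"].append(result.name)
    return delta


# Step 2: baseline run, with the failure signature captured.
baseline = [
    EvalResult("capability/parse_dates", False, "ValueError: unknown format"),
    EvalResult("regression/login_flow", True),
]
# Step 4: the same evals re-run after the implementation step.
candidate = [
    EvalResult("capability/parse_dates", True),
    EvalResult("regression/login_flow", True),
]

delta = compare_runs(baseline, candidate)
print(delta)
```

Keeping the capability and regression cases in one result set makes it easy to confirm that a fix did not trade one failure for another.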
Task Decomposition
Apply the 15-minute unit rule:
each unit should be independently verifiable
each unit should have a single dominant risk
each unit should expose a clear done condition
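One way to make these three checks mechanical is to describe each unit as a small record and lint it before handing it to an agent. The sketch below is illustrative only: `WorkUnit`, `check_unit`, and the field names are hypothetical, and it assumes the rule's name refers to roughly 15 minutes of agent work per unit, which the text does not state explicitly.

```python
from dataclasses import dataclass, field


@dataclass
class WorkUnit:
    title: str
    verify_command: str                 # how to check this unit on its own
    dominant_risks: list = field(default_factory=list)
    done_condition: str = ""
    estimate_minutes: int = 15          # assumed meaning of the "15-minute" rule


def check_unit(unit, max_minutes=15):
    """Return a list of problems; an empty list means the unit is well-formed."""
    problems = []
    if not unit.verify_command.strip():
        problems.append("no independent verification step")
    if len(unit.dominant_risks) != 1:
        problems.append(
            f"expected one dominant risk, found {len(unit.dominant_risks)}"
        )
    if not unit.done_condition.strip():
        problems.append("no clear done condition")
    if unit.estimate_minutes > max_minutes:
        problems.append(f"estimate exceeds {max_minutes} minutes; split the unit")
    return problems


good = WorkUnit(
    title="Add retry to fetch_profile",
    verify_command="pytest tests/test_profile.py",
    dominant_risks=["timeout handling"],
    done_condition="tests pass with a simulated timeout",
)
vague = WorkUnit(
    title="Clean up auth",
    verify_command="",
    dominant_risks=["token expiry", "session fixation"],
    estimate_minutes=60,
)

print(check_unit(good))
print(check_unit(vague))
```

A unit that fails any check is usually a sign that it should be split further before an agent starts on it.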
Model Routing
Haiku: classification, boilerplate transforms, narrow edits
Sonnet: implementation and refactors
Opus: architecture, root-cause analysis, multi-file invariants
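The routing table above can be expressed as a plain lookup. This is a sketch under stated assumptions: the task-kind labels and the `route` function are hypothetical, the tier strings are labels rather than API model identifiers, and defaulting unknown kinds to the middle tier is an illustrative choice the text does not specify.

```python
# Task kind -> model tier, mirroring the three routing lines above.
ROUTES = {
    "classification": "haiku",
    "boilerplate_transform": "haiku",
    "narrow_edit": "haiku",
    "implementation": "sonnet",
    "refactor": "sonnet",
    "architecture": "opus",
    "root_cause_analysis": "opus",
    "multi_file_invariant": "opus",
}


def route(task_kind, default="sonnet"):
    """Pick a tier for a task kind; unknown kinds fall back to `default` (assumption)."""
    return ROUTES.get(task_kind, default)


print(route("narrow_edit"))
print(route("root_cause_analysis"))
print(route("unlabeled_task"))
```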
Session Strategy
Continue session for closely coupled units.
Start fresh session after major phase transitions.
Compact after milestone completion, not during active debugging.
Review Focus for AI-Generated Code
Prioritize:
invariants and edge cases
error boundaries
security and auth assumptions
hidden coupling and rollout risk
Do not waste review cycles on style-only disagreements when automated formatters and linters already enforce style.
Cost Discipline
Track per task:
model
token estimate
retries
wall-clock time
success/failure
Escalate to a higher model tier only when the lower tier fails because of a clear reasoning gap.
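The per-task fields and the escalation rule can be captured together in a small record. This is a minimal sketch, not a required schema: `TaskRecord`, `next_tier`, and the `failure_reason` labels such as `"reasoning_gap"` are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass
from typing import Optional

TIERS = ["haiku", "sonnet", "opus"]  # lowest to highest tier


@dataclass
class TaskRecord:
    task_id: str
    model: str
    token_estimate: int
    retries: int
    wall_clock_seconds: float
    succeeded: bool
    failure_reason: Optional[str] = None  # hypothetical label, e.g. "reasoning_gap"


def next_tier(record):
    """Return the tier for the next attempt.

    Escalates one tier only when the task failed with a reasoning gap;
    any other outcome keeps the current tier.
    """
    if record.succeeded or record.failure_reason != "reasoning_gap":
        return record.model
    index = TIERS.index(record.model)
    return TIERS[min(index + 1, len(TIERS) - 1)]


flaky = TaskRecord("T-1", "haiku", 4000, 2, 95.0, False, "flaky_test")
gap = TaskRecord("T-2", "haiku", 6500, 1, 140.0, False, "reasoning_gap")

print(next_tier(flaky))
print(next_tier(gap))
```

Recording the failure reason alongside the cost fields keeps escalation decisions auditable: a retry caused by a flaky test stays on the cheaper tier.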