cost-optimizer
Cost Optimizer
Tracks cumulative LLM costs across DAG execution and makes real-time decisions to stay within budget: downgrade models, skip optional nodes, or stop early.
When to Use
✅ Use for:
- Setting and enforcing cost budgets for DAG executions
- Real-time cost monitoring during execution
- Deciding when to downgrade model tiers mid-execution
- Identifying which nodes are most expensive and why
- Post-execution cost analysis and optimization recommendations
❌ NOT for:
- Choosing which model to use per node (use `llm-router`)
- Provider pricing comparisons (static data, not a skill)
- Billing or invoicing features
Budget Enforcement Process
```mermaid
flowchart TD
    N[Node about to execute] --> C[Check: spent + estimated_node_cost vs budget]
    C --> S{Within budget?}
    S -->|Yes, >20% remaining| E[Execute at planned model tier]
    S -->|Yes, <20% remaining| W[Execute but downgrade to Tier 1 if possible]
    S -->|No| D{Node optional?}
    D -->|Yes| SK[Skip node]
    D -->|No| H{Human gate available?}
    H -->|Yes| A[Ask human: continue over budget?]
    H -->|No| ST[Stop execution, return partial results]
```
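The per-node check in the flowchart can be sketched in Python as below. This is a minimal illustration, not the skill's actual implementation: the `Node` and `Action` types and the `check_node` function are hypothetical names. It assumes that "remaining" is measured after the node's estimated cost is added, and that exactly 20% remaining counts as the downgrade branch, since the flowchart leaves both points open.

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    EXECUTE = "execute at planned model tier"
    EXECUTE_DOWNGRADED = "execute, downgrade to Tier 1 if possible"
    SKIP = "skip node"
    ASK_HUMAN = "ask human: continue over budget?"
    STOP = "stop execution, return partial results"


@dataclass
class Node:
    name: str
    estimated_cost: float  # USD, estimated before execution
    optional: bool = False


def check_node(node: Node, spent: float, budget: float, human_gate: bool) -> Action:
    """Decide what to do with a node that is about to execute."""
    if budget <= 0:
        raise ValueError("budget must be positive")

    projected = spent + node.estimated_cost
    if projected <= budget:
        # Assumption: the 20% threshold applies to what is left after this node.
        remaining_fraction = (budget - projected) / budget
        if remaining_fraction > 0.20:
            return Action.EXECUTE
        return Action.EXECUTE_DOWNGRADED

    # Over budget: optional nodes are skipped, required nodes escalate or stop.
    if node.optional:
        return Action.SKIP
    return Action.ASK_HUMAN if human_gate else Action.STOP
```

For example, `check_node(Node("deep-analysis", 0.16), spent=0.40, budget=0.50, human_gate=False)` returns `Action.STOP`, because the projected spend of $0.56 exceeds the budget and the node is neither optional nor covered by a human gate.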
Budget Tiers
| Budget Remaining | Action |
|---|---|
| >50% | Execute at planned model tier |
| 20-50% | Log warning. Continue at planned tier. |
| 10-20% | Downgrade remaining Tier 2 nodes to Tier 1 (Haiku) |
| 5-10% | Downgrade ALL remaining nodes to Tier 1. Skip optional nodes. |
| <5% | Stop execution unless next node is critical path |
| 0% | Stop. Return partial results with cost breakdown. |
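The table can be expressed as a simple threshold lookup. The sketch below is illustrative only: `budget_action` and its return labels are hypothetical names. The table lists 50%, 20%, and 10% in two rows each, so this version assumes those shared boundaries fall into the lower, more conservative row; exactly 5% stays in the 5-10% row because the row below it is strictly `<5%`.

```python
def budget_action(spent: float, budget: float) -> str:
    """Map the remaining budget percentage to the action in the tier table."""
    if budget <= 0:
        raise ValueError("budget must be positive")

    # Clamp at 0 so overspending is treated the same as an exhausted budget.
    remaining_pct = max(0.0, (budget - spent) / budget * 100)

    if remaining_pct > 50:
        return "execute_planned_tier"
    if remaining_pct > 20:
        return "warn_and_continue"
    if remaining_pct > 10:
        return "downgrade_tier2_to_tier1"
    if remaining_pct >= 5:
        return "downgrade_all_and_skip_optional"
    if remaining_pct > 0:
        return "stop_unless_critical_path"
    return "stop_with_partial_results"
```

With a $0.50 budget, `budget_action(0.42, 0.50)` falls in the 10-20% row and returns `"downgrade_tier2_to_tier1"`.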
Cost Estimation Per Node
Before each node executes, estimate its cost:
```
estimated_cost = (avg_input_tokens × input_price + avg_output_tokens × output_price)
```
Use historical averages for this skill + model combination. If no history, use defaults:
- Tier 1 (Haiku): ~800 input + 400 output = ~$0.001
- Tier 2 (Sonnet): ~2000 input + 1000 output = ~$0.012
- Tier 3 (Opus): ~3000 input + 1500 output = ~$0.16
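A minimal sketch of this estimate, falling back to the defaults when there is no history. The function names and the `history` format (a list of `(input_tokens, output_tokens)` pairs) are assumptions made for illustration, and prices are taken to be per token. The default dollar figures are copied from the list above.

```python
from typing import List, Tuple

# Default per-node estimates in USD, copied from the list above.
DEFAULT_COST = {"haiku": 0.001, "sonnet": 0.012, "opus": 0.16}


def estimate_cost(avg_input_tokens: float, avg_output_tokens: float,
                  input_price: float, output_price: float) -> float:
    """Apply the formula above; prices are assumed to be USD per token."""
    return avg_input_tokens * input_price + avg_output_tokens * output_price


def estimate_node_cost(model: str, history: List[Tuple[int, int]],
                       input_price: float, output_price: float) -> float:
    """Estimate from historical token usage, or use the default for the model."""
    if not history:
        return DEFAULT_COST[model]
    avg_in = sum(inp for inp, _ in history) / len(history)
    avg_out = sum(out for _, out in history) / len(history)
    return estimate_cost(avg_in, avg_out, input_price, output_price)
```

Callers would pass the per-token prices from their provider's current price list; none are hardcoded here because they change over time.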
Post-Execution Report
```yaml
cost_report:
  total_budget: 0.50
  total_spent: 0.37
  budget_remaining: 0.13
  nodes_executed: 8
  nodes_skipped: 1
  nodes_downgraded: 2
  model_breakdown:
    haiku: { calls: 4, cost: 0.004 }
    sonnet: { calls: 3, cost: 0.036 }
    opus: { calls: 1, cost: 0.33 }
  savings_recommendations:
    - "Node 'deep-analysis' used Opus ($0.33) but downstream accepted on first try. Try Sonnet next time. Potential saving: $0.32"
    - "Nodes 'validate-a' and 'validate-b' are sequential but independent. Parallelize to reduce wall-clock time."
```
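One way to assemble the numeric part of such a report from per-node records is sketched below. `NodeRecord` and `build_cost_report` are hypothetical names, and the sketch assumes one model call per executed node. The savings recommendations are left out because they depend on analysis the report format does not specify.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class NodeRecord:
    name: str
    model: str          # e.g. "haiku", "sonnet", "opus"
    cost: float         # USD actually spent on this node
    skipped: bool = False
    downgraded: bool = False


def build_cost_report(total_budget: float, records: List[NodeRecord]) -> Dict:
    """Aggregate per-node records into the cost_report structure."""
    breakdown = defaultdict(lambda: {"calls": 0, "cost": 0.0})
    for rec in records:
        if rec.skipped:
            continue  # skipped nodes made no call and cost nothing
        breakdown[rec.model]["calls"] += 1
        breakdown[rec.model]["cost"] += rec.cost

    total_spent = sum(entry["cost"] for entry in breakdown.values())
    executed = [rec for rec in records if not rec.skipped]
    return {
        "cost_report": {
            "total_budget": total_budget,
            "total_spent": round(total_spent, 4),
            "budget_remaining": round(total_budget - total_spent, 4),
            "nodes_executed": len(executed),
            "nodes_skipped": len(records) - len(executed),
            "nodes_downgraded": sum(1 for rec in executed if rec.downgraded),
            "model_breakdown": {
                model: {"calls": e["calls"], "cost": round(e["cost"], 4)}
                for model, e in breakdown.items()
            },
        }
    }
```

Costs are rounded to four decimal places only in the output, so floating-point noise from summation does not leak into the report.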
Anti-Patterns
No Budget at All
Wrong: Running DAGs without any cost tracking until the API bill arrives.
Right: Every DAG execution has a budget, even if generous. Track spend per node.
Aggressive Downgrading
Wrong: Downgrading Opus nodes to Haiku at 50% budget remaining, causing quality failures that trigger expensive retries.
Right: Only downgrade when the alternative is stopping execution. A failed attempt on a cheaper model followed by a retry on the original tier costs more than running the original tier once.
Ignoring Retries in Cost
Wrong: Budgeting for one attempt per node.
Right: Budget for avg_attempts × cost_per_attempt, where avg_attempts = 1 + avg_retries. A node that needs 3 attempts on Sonnet costs $0.036, not $0.012.
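The arithmetic can be sketched as follows. The function names are hypothetical, and the multiplier is interpreted as the average number of attempts (the first try plus any retries), which is the reading that reproduces the $0.036 figure.

```python
from typing import Iterable, Tuple


def budgeted_node_cost(cost_per_attempt: float, avg_attempts: float) -> float:
    """Budget for every attempt a node is expected to need, not just the first."""
    if avg_attempts < 1:
        raise ValueError("a node that runs needs at least one attempt")
    return avg_attempts * cost_per_attempt


def budgeted_dag_cost(nodes: Iterable[Tuple[float, float]]) -> float:
    """Sum retry-aware budgets over (cost_per_attempt, avg_attempts) pairs."""
    return sum(budgeted_node_cost(cost, attempts) for cost, attempts in nodes)
```

For the Sonnet example, `budgeted_node_cost(0.012, 3)` gives about 0.036, three times the single-attempt estimate.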