cost-optimizer

Cost Optimizer

Tracks cumulative LLM costs across DAG execution and makes real-time decisions to stay within budget: downgrade models, skip optional nodes, or stop early.

When to Use

✅ Use for:

Setting and enforcing cost budgets for DAG executions
Real-time cost monitoring during execution
Deciding when to downgrade model tiers mid-execution
Identifying which nodes are most expensive and why
Post-execution cost analysis and optimization recommendations

❌ NOT for:

Choosing which model to use per node (use `llm-router`)
Provider pricing comparisons (static data, not a skill)
Billing or invoicing features

Budget Enforcement Process

```mermaid
flowchart TD
  N[Node about to execute] --> C[Check: spent + estimated_node_cost vs budget]
  C --> S{Within budget?}
  S -->|Yes, more than 20% remaining| E[Execute at planned model tier]
  S -->|Yes, less than 20% remaining| W[Execute but downgrade to Tier 1 if possible]
  S -->|No| D{Node optional?}
  D -->|Yes| SK[Skip node]
  D -->|No| H{Human gate available?}
  H -->|Yes| A[Ask human: continue over budget?]
  H -->|No| ST[Stop execution, return partial results]
```
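
The decision flow above can be sketched in Python. This is a minimal illustration, not the skill's actual implementation: the names `Node`, `Decision`, and `decide` are hypothetical, "remaining" is assumed to be measured after the node's estimated cost is added, and exactly 20% remaining is assumed to fall on the downgrade side.

```python
from dataclasses import dataclass
from enum import Enum


class Decision(Enum):
    EXECUTE = "execute at planned model tier"
    EXECUTE_DOWNGRADED = "execute, downgraded to Tier 1 if possible"
    SKIP = "skip node"
    ASK_HUMAN = "ask human: continue over budget?"
    STOP = "stop execution, return partial results"


@dataclass
class Node:
    name: str
    estimated_cost: float  # dollars
    optional: bool = False


def decide(node: Node, spent: float, budget: float, human_gate: bool) -> Decision:
    """Pick an action for a node that is about to execute (budget must be > 0)."""
    projected = spent + node.estimated_cost
    if projected <= budget:
        # Assumption: the 20% threshold applies to what is left after this node.
        remaining_fraction = (budget - projected) / budget
        if remaining_fraction > 0.20:
            return Decision.EXECUTE
        return Decision.EXECUTE_DOWNGRADED
    # Over budget: optional nodes are skipped, required ones escalate or stop.
    if node.optional:
        return Decision.SKIP
    return Decision.ASK_HUMAN if human_gate else Decision.STOP
```

For example, `decide(Node("deep-analysis", 0.16), spent=0.45, budget=0.50, human_gate=False)` returns `Decision.STOP`, because the node is required, would exceed the budget, and no human gate is available.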

Budget Tiers

| Budget Remaining | Action |
|---|---|
| >50% | Execute at planned model tier |
| 20-50% | Log warning. Continue at planned tier. |
| 10-20% | Downgrade remaining Tier 2 nodes to Tier 1 (Haiku) |
| 5-10% | Downgrade ALL remaining nodes to Tier 1. Skip optional nodes. |
| <5% | Stop execution unless next node is critical path |
| 0% | Stop. Return partial results with cost breakdown. |
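
As a rough sketch, the tier table can be expressed as a lookup on the fraction of budget remaining. The function name `budget_action` and the returned action labels are hypothetical. The table's ranges share endpoints (20%, 10%, 5%), so this sketch assumes each boundary value belongs to the milder action.

```python
def budget_action(spent: float, budget: float, next_node_critical: bool = False) -> str:
    """Map remaining budget to an action label (budget must be > 0)."""
    remaining = max(budget - spent, 0.0) / budget
    if remaining > 0.50:
        return "execute_planned_tier"
    if remaining >= 0.20:
        return "warn_and_continue"
    if remaining >= 0.10:
        return "downgrade_tier2_to_tier1"
    if remaining >= 0.05:
        return "downgrade_all_and_skip_optional"
    # Below 5%: only a critical-path node may still run, and only if money is left.
    if remaining > 0.0 and next_node_critical:
        return "run_critical_node"
    return "stop_with_partial_results"
```

Calling `budget_action(spent=0.43, budget=0.50)` leaves 14% of the budget and returns `"downgrade_tier2_to_tier1"`.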

Cost Estimation Per Node

Before each node executes, estimate its cost:

estimated_cost = (avg_input_tokens × input_price + avg_output_tokens × output_price)

Use historical token averages for this skill + model combination. If no history exists, fall back to these defaults:

Tier 1 (Haiku): ~800 input + 400 output tokens ≈ $0.001
Tier 2 (Sonnet): ~2000 input + 1000 output tokens ≈ $0.012
Tier 3 (Opus): ~3000 input + 1500 output tokens ≈ $0.16
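
A minimal sketch of this estimate, assuming history is kept as a list of `(input_tokens, output_tokens)` pairs per skill + model combination and that prices are given in dollars per token. The function name, the history shape, and the model keys are assumptions for illustration. The default dollar figures are the ones listed above.

```python
from statistics import mean

# Fallback cost per node in dollars, taken from the defaults listed above.
DEFAULT_COST = {"haiku": 0.001, "sonnet": 0.012, "opus": 0.16}


def estimate_node_cost(model, history, input_price, output_price):
    """Estimate one node's cost in dollars.

    history: list of (input_tokens, output_tokens) from past runs (may be empty).
    input_price / output_price: dollars per token, supplied by the caller.
    """
    if not history:
        return DEFAULT_COST[model]
    avg_input_tokens = mean(i for i, _ in history)
    avg_output_tokens = mean(o for _, o in history)
    return avg_input_tokens * input_price + avg_output_tokens * output_price
```

With no history, `estimate_node_cost("sonnet", [], input_price, output_price)` returns the $0.012 default regardless of the prices passed in. Once history exists, the token averages and the caller's current prices take over.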

Post-Execution Report

```yaml
cost_report:
  total_budget: 0.50
  total_spent: 0.37
  budget_remaining: 0.13
  nodes_executed: 8
  nodes_skipped: 1
  nodes_downgraded: 2
  model_breakdown:
    haiku: { calls: 4, cost: 0.004 }
    sonnet: { calls: 3, cost: 0.036 }
    opus: { calls: 1, cost: 0.33 }
  savings_recommendations:
    - "Node 'deep-analysis' used Opus ($0.33) but downstream accepted on first try. Try Sonnet next time; potential saving: $0.32"
    - "Nodes 'validate-a' and 'validate-b' are sequential but independent. Parallelize to reduce wall-clock time."
```
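
The numeric fields of such a report could be aggregated from per-node records roughly as follows. The record shape (`model`, `cost`, `status`, `downgraded`) and the function name `build_cost_report` are hypothetical. Generating `savings_recommendations` is left out because it depends on downstream acceptance data that this sketch does not model.

```python
from collections import defaultdict


def build_cost_report(total_budget, records):
    """Aggregate per-node records into the report's numeric fields.

    records: list of dicts with keys 'model', 'cost', 'status'
    ('executed' or 'skipped') and an optional boolean 'downgraded'.
    """
    executed = [r for r in records if r["status"] == "executed"]
    breakdown = defaultdict(lambda: {"calls": 0, "cost": 0.0})
    for r in executed:
        breakdown[r["model"]]["calls"] += 1
        breakdown[r["model"]]["cost"] += r["cost"]
    total_spent = sum(r["cost"] for r in executed)
    return {
        "total_budget": total_budget,
        "total_spent": round(total_spent, 4),
        "budget_remaining": round(total_budget - total_spent, 4),
        "nodes_executed": len(executed),
        "nodes_skipped": sum(r["status"] == "skipped" for r in records),
        "nodes_downgraded": sum(bool(r.get("downgraded")) for r in executed),
        "model_breakdown": {
            model: {"calls": v["calls"], "cost": round(v["cost"], 4)}
            for model, v in breakdown.items()
        },
    }
```

Rounding to four decimal places keeps floating-point noise out of the report. A real implementation might track costs with `decimal.Decimal` instead.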

Anti-Patterns

No Budget at All

Wrong: Running DAGs without any cost tracking until the API bill arrives.
Right: Give every DAG execution a budget, even a generous one, and track spend per node.

Aggressive Downgrading

Wrong: Downgrading Opus nodes to Haiku at 50% budget remaining, causing quality failures that trigger expensive retries.
Right: Only downgrade when the alternative is stopping execution. A failed cheap attempt plus its retry can cost more than one run at the original model tier.

Ignoring Retries in Cost

Wrong: Budgeting for one attempt per node.
Right: Budget for (1 + avg_retries) × cost_per_attempt, so the first attempt is counted along with the retries. A node that needs 3 attempts on Sonnet (2 retries) costs $0.036, not $0.012.
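
The retry arithmetic can be sketched as below. This assumes `avg_retries` counts only the attempts after the first one, so total attempts are `1 + avg_retries`. If your history already records total attempts, drop the `1 +`. The function names are illustrative.

```python
def expected_node_cost(cost_per_attempt: float, avg_retries: float) -> float:
    """Expected dollars for one node, counting the first attempt plus retries."""
    return (1 + avg_retries) * cost_per_attempt


def retry_aware_budget(nodes) -> float:
    """Sum expected cost over (cost_per_attempt, avg_retries) pairs for a DAG."""
    return sum(expected_node_cost(cost, retries) for cost, retries in nodes)
```

For a Sonnet node at $0.012 per attempt that averages 2 retries, `expected_node_cost(0.012, 2)` comes to about $0.036, three times the single-attempt figure.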
