token-budget-advisor


Token Budget Advisor (TBA)

Intercept the response flow to offer the user a choice about response depth before Claude answers.

When to Use

  • User wants to control how long or detailed a response is
  • User mentions tokens, budget, depth, or response length
  • User says "short version", "tldr", "brief", "al 25%", "exhaustive", etc.
  • Any time the user wants to choose depth/detail level upfront
Do not trigger when: user already set a level this session (maintain it silently), or the answer is trivially one line.

How It Works

Step 1 — Estimate input tokens

Estimate the prompt's token count mentally, using the repository's canonical context-budget heuristics. Apply the same calibration guidance as context-budget:
  • prose: words × 1.3
  • code-heavy or mixed/code blocks: chars / 4
For mixed content, use the dominant content type and keep the estimate heuristic.
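The Step 1 calibration above can be sketched as a small helper. This is a minimal illustration, not code from the original project; the function name and the `code_heavy` flag are hypothetical, and no real tokenizer is involved:

```python
# Minimal sketch of the Step 1 heuristic. The name `estimate_input_tokens`
# and the `code_heavy` flag are hypothetical, not from the TBA project.

def estimate_input_tokens(text: str, code_heavy: bool = False) -> int:
    """Heuristic token estimate: words * 1.3 for prose, chars / 4 for code."""
    if code_heavy:
        return round(len(text) / 4)           # code-heavy or mixed/code blocks
    return round(len(text.split()) * 1.3)     # prose
```

Either branch is only accurate to roughly ±15%, matching the precision note for this skill.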

Step 2 — Estimate response size by complexity

Classify the prompt, then apply the multiplier range to get the full response window:

| Complexity  | Multiplier range | Example prompts |
|-------------|------------------|-----------------|
| Simple      | 3× – 8×          | "What is X?", yes/no, single fact |
| Medium      | 8× – 20×         | "How does X work?" |
| Medium-High | 10× – 25×        | Code request with context |
| Complex     | 15× – 40×        | Multi-part analysis, comparisons, architecture |
| Creative    | 10× – 30×        | Stories, essays, narrative writing |

Response window = input_tokens × mult_min to input_tokens × mult_max (but don't exceed your model's configured output-token limit).
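Step 2's window arithmetic can be sketched as follows. The multiplier ranges come from the table above; the 8192-token default cap is a placeholder for your model's configured output limit, not a value from the original project:

```python
# Multiplier ranges copied from the Step 2 complexity table.
MULTIPLIERS = {
    "simple": (3, 8),
    "medium": (8, 20),
    "medium-high": (10, 25),
    "complex": (15, 40),
    "creative": (10, 30),
}

def response_window(input_tokens: int, complexity: str,
                    output_limit: int = 8192) -> tuple[int, int]:
    """(min, max) estimated response tokens, capped at the model's output limit."""
    lo, hi = MULTIPLIERS[complexity]
    return (min(input_tokens * lo, output_limit),
            min(input_tokens * hi, output_limit))
```

For example, a ~100-token "How does X work?" prompt (Medium) yields a window of roughly 800 to 2000 tokens.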

Step 3 — Present depth options

Present this block before answering, using the actual estimated numbers:
Analyzing your prompt...

Input: ~[N] tokens  |  Type: [type]  |  Complexity: [level]  |  Language: [lang]

Choose your depth level:

[1] Essential   (25%)  ->  ~[tokens]   Direct answer only, no preamble
[2] Moderate    (50%)  ->  ~[tokens]   Answer + context + 1 example
[3] Detailed    (75%)  ->  ~[tokens]   Full answer with alternatives
[4] Exhaustive (100%)  ->  ~[tokens]   Everything, no limits

Which level? (1-4 or say "25% depth", "50% depth", "75% depth", "100% depth")

Precision: heuristic estimate ~85-90% accuracy (±15%).
Level token estimates (within the response window):
  • 25% → min + (max - min) × 0.25
  • 50% → min + (max - min) × 0.50
  • 75% → min + (max - min) × 0.75
  • 100% → max
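The level formulas reduce to a single linear interpolation inside the window (at 100%, min + (max - min) × 1.0 is exactly max). A small sketch, with an illustrative function name:

```python
def level_estimate(window_min: int, window_max: int, depth_pct: int) -> int:
    """Token estimate for a depth level: 25/50/75/100 interpolate min -> max."""
    fraction = depth_pct / 100   # 100% lands exactly on window_max
    return round(window_min + (window_max - window_min) * fraction)
```

With the 800-2000 window from the earlier example, the four levels come out to about 1100, 1400, 1700, and 2000 tokens.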

Step 4 — Respond at the chosen level

| Level           | Target length       | Include | Omit |
|-----------------|---------------------|---------|------|
| 25% Essential   | 2-4 sentences max   | Direct answer, key conclusion | Context, examples, nuance, alternatives |
| 50% Moderate    | 1-3 paragraphs      | Answer + necessary context + 1 example | Deep analysis, edge cases, references |
| 75% Detailed    | Structured response | Multiple examples, pros/cons, alternatives | Extreme edge cases, exhaustive references |
| 100% Exhaustive | No restriction      | Everything — full analysis, all code, all perspectives | Nothing |

Shortcuts — skip the question

If the user already signals a level, respond at that level immediately without asking:
| What they say | Level |
|---------------|-------|
| "1" / "25% depth" / "short version" / "brief answer" / "tldr" | 25% |
| "2" / "50% depth" / "moderate depth" / "balanced answer" | 50% |
| "3" / "75% depth" / "detailed answer" / "thorough answer" | 75% |
| "4" / "100% depth" / "exhaustive answer" / "full deep dive" | 100% |
If the user set a level earlier in the session, maintain it silently for subsequent responses unless they change it.

Precision note

This skill uses heuristic estimation — no real tokenizer. Accuracy ~85-90%, variance ±15%. Always show the disclaimer.

Examples

Triggers

  • "Give me the short version first."
  • "How many tokens will your answer use?"
  • "Respond at 50% depth."
  • "I want the exhaustive answer, not the summary."
  • "Dame la version corta y luego la detallada."
  • "先给我简短版本的答案。"
  • "你的回答会用多少个token?"
  • "按50%深度响应。"
  • "我要详尽的回答,不要总结。"
  • "Dame la version corta y luego la detallada."

Does Not Trigger

  • "What is a JWT token?"
  • "The checkout flow uses a payment token."
  • "Is this normal?"
  • "Complete the refactor."
  • Follow-up questions after the user already chose a depth for the session
  • "什么是JWT token?"
  • "结账流程使用支付token。"
  • "这正常吗?"
  • "完成重构。"
  • 用户已经为当前会话选择过响应深度后的后续追问

Source

Standalone skill from TBA — Token Budget Advisor for Claude Code. Original project also ships a Python estimator script, but this repository keeps the skill self-contained and heuristic-only.