token-budget-advisor


Token Budget Advisor (TBA)

Intercept the response flow to offer the user a choice about response depth before Claude answers.

When to Use

  • User wants to control how long or detailed a response is
  • User mentions tokens, budget, depth, or response length
  • User says "short version", "tldr", "brief", "al 25%", "exhaustive", etc.
  • Any time the user wants to choose depth/detail level upfront
Do not trigger when: user already set a level this session (maintain it silently), or the answer is trivially one line.

How It Works

Step 1 — Estimate input tokens

Estimate the prompt's token count mentally, using the repository's canonical context-budget heuristics. Apply the same calibration guidance as context-budget:
  • prose: words × 1.3
  • code-heavy or mixed/code blocks: chars / 4
For mixed content, use the dominant content type and keep the estimate heuristic.
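The Step 1 calibration above can be sketched as a small helper. This is a minimal illustration, not code from the original project; the function name and the `code_heavy` flag are hypothetical, and no real tokenizer is involved:

```python
# Minimal sketch of the Step 1 heuristic. The name `estimate_input_tokens`
# and the `code_heavy` flag are hypothetical, not from the TBA project.

def estimate_input_tokens(text: str, code_heavy: bool = False) -> int:
    """Heuristic token estimate: words * 1.3 for prose, chars / 4 for code."""
    if code_heavy:
        return round(len(text) / 4)           # code-heavy or mixed/code blocks
    return round(len(text.split()) * 1.3)     # prose
```

Either branch is only accurate to roughly ±15%, matching the precision note for this skill.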

Step 2 — Estimate response size by complexity

Classify the prompt, then apply the multiplier range to get the full response window:

| Complexity  | Multiplier range | Example prompts |
|-------------|------------------|-----------------|
| Simple      | 3× – 8×          | "What is X?", yes/no, single fact |
| Medium      | 8× – 20×         | "How does X work?" |
| Medium-High | 10× – 25×        | Code request with context |
| Complex     | 15× – 40×        | Multi-part analysis, comparisons, architecture |
| Creative    | 10× – 30×        | Stories, essays, narrative writing |

Response window = input_tokens × mult_min to input_tokens × mult_max (but don't exceed your model's configured output-token limit).
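Step 2's window arithmetic can be sketched as follows. The multiplier ranges come from the table above; the 8192-token default cap is a placeholder for your model's configured output limit, not a value from the original project:

```python
# Multiplier ranges copied from the Step 2 complexity table.
MULTIPLIERS = {
    "simple": (3, 8),
    "medium": (8, 20),
    "medium-high": (10, 25),
    "complex": (15, 40),
    "creative": (10, 30),
}

def response_window(input_tokens: int, complexity: str,
                    output_limit: int = 8192) -> tuple[int, int]:
    """(min, max) estimated response tokens, capped at the model's output limit."""
    lo, hi = MULTIPLIERS[complexity]
    return (min(input_tokens * lo, output_limit),
            min(input_tokens * hi, output_limit))
```

For example, a ~100-token "How does X work?" prompt (Medium) yields a window of roughly 800 to 2000 tokens.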

Step 3 — Present depth options

Present this block before answering, using the actual estimated numbers:
Analyzing your prompt...

Input: ~[N] tokens  |  Type: [type]  |  Complexity: [level]  |  Language: [lang]

Choose your depth level:

[1] Essential   (25%)  ->  ~[tokens]   Direct answer only, no preamble
[2] Moderate    (50%)  ->  ~[tokens]   Answer + context + 1 example
[3] Detailed    (75%)  ->  ~[tokens]   Full answer with alternatives
[4] Exhaustive (100%)  ->  ~[tokens]   Everything, no limits

Which level? (1-4 or say "25% depth", "50% depth", "75% depth", "100% depth")

Precision: heuristic estimate ~85-90% accuracy (±15%).
Level token estimates (within the response window):
  • 25% → min + (max - min) × 0.25
  • 50% → min + (max - min) × 0.50
  • 75% → min + (max - min) × 0.75
  • 100% → max
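The level formulas reduce to a single linear interpolation inside the window (at 100%, min + (max - min) × 1.0 is exactly max). A small sketch, with an illustrative function name:

```python
def level_estimate(window_min: int, window_max: int, depth_pct: int) -> int:
    """Token estimate for a depth level: 25/50/75/100 interpolate min -> max."""
    fraction = depth_pct / 100   # 100% lands exactly on window_max
    return round(window_min + (window_max - window_min) * fraction)
```

With the 800-2000 window from the earlier example, the four levels come out to about 1100, 1400, 1700, and 2000 tokens.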

Step 4 — Respond at the chosen level

| Level           | Target length       | Include | Omit |
|-----------------|---------------------|---------|------|
| 25% Essential   | 2-4 sentences max   | Direct answer, key conclusion | Context, examples, nuance, alternatives |
| 50% Moderate    | 1-3 paragraphs      | Answer + necessary context + 1 example | Deep analysis, edge cases, references |
| 75% Detailed    | Structured response | Multiple examples, pros/cons, alternatives | Extreme edge cases, exhaustive references |
| 100% Exhaustive | No restriction      | Everything — full analysis, all code, all perspectives | Nothing |

Shortcuts — skip the question

If the user already signals a level, respond at that level immediately without asking:
| What they say | Level |
|---------------|-------|
| "1" / "25% depth" / "short version" / "brief answer" / "tldr" | 25% |
| "2" / "50% depth" / "moderate depth" / "balanced answer" | 50% |
| "3" / "75% depth" / "detailed answer" / "thorough answer" | 75% |
| "4" / "100% depth" / "exhaustive answer" / "full deep dive" | 100% |
If the user set a level earlier in the session, maintain it silently for subsequent responses unless they change it.

Precision note

This skill uses heuristic estimation — no real tokenizer. Accuracy ~85-90%, variance ±15%. Always show the disclaimer.

Examples

Triggers

  • "Give me the short version first."
  • "How many tokens will your answer use?"
  • "Respond at 50% depth."
  • "I want the exhaustive answer, not the summary."
  • "Dame la version corta y luego la detallada."
  • "先给我简短版本的答案。"
  • "你的回答会用多少个token?"
  • "按50%深度响应。"
  • "我要详尽的回答,不要总结。"
  • "Dame la version corta y luego la detallada."

Does Not Trigger

  • "What is a JWT token?"
  • "The checkout flow uses a payment token."
  • "Is this normal?"
  • "Complete the refactor."
  • Follow-up questions after the user already chose a depth for the session
  • "什么是JWT token?"
  • "结账流程使用支付token。"
  • "这正常吗?"
  • "完成重构。"
  • 用户已经为当前会话选择过响应深度后的后续追问

Source

Standalone skill from TBA — Token Budget Advisor for Claude Code. Original project also ships a Python estimator script, but this repository keeps the skill self-contained and heuristic-only.