caveman-token-optimizer

Caveman Token Optimizer

Skill by ara.so — Daily 2026 Skills collection.
A Claude Code skill and Codex plugin that makes AI agents respond in compressed caveman-speak — cutting ~65% of output tokens on average (up to 87%) while keeping full technical accuracy. No pleasantries. No filler. Just answer.

What It Does

Caveman mode strips:
  • Pleasantries: "Sure, I'd be happy to help!" → gone
  • Hedging: "It might be worth considering" → gone
  • Articles (a, an, the) → gone
  • Verbose transitions → gone
Caveman keeps:
  • All code blocks (written normally)
  • Technical terms (exact: `useMemo`, `polymorphism`, `middleware`)
  • Error messages (quoted exactly)
  • Git commits and PR descriptions (normal)
Same fix. 75% less word. Brain still big.
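
The strip list above can be pictured as a post-processing pass. A minimal sketch, assuming a small illustrative phrase list (the real skill works through prompt instructions, not regex post-processing, and `strip_filler` is a hypothetical name):

```python
import re

# Illustrative filler patterns -- mirrors the bullets above, not the skill's actual list.
FILLER = [
    r"Sure, I'd be happy to help!?\s*",     # pleasantries
    r"It might be worth considering\s+",    # hedging
    r"\b(?:a|an|the)\s+",                   # articles
]

def strip_filler(text: str) -> str:
    """Remove pleasantries, hedging, and articles from a response."""
    for pattern in FILLER:
        text = re.sub(pattern, "", text, flags=re.IGNORECASE)
    return text.strip()

print(strip_filler("Sure, I'd be happy to help! Wrap the object in useMemo."))
# → "Wrap object in useMemo."
```

Code blocks, error messages, and identifiers are exempt from this kind of stripping, per the keep list above.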

Install

Claude Code (npx)

```bash
npx skills add JuliusBrussee/caveman
```

Claude Code (plugin system)

```bash
claude plugin marketplace add JuliusBrussee/caveman
claude plugin install caveman@caveman
```

Codex

  1. Clone the repo
  2. Open Codex inside the repo
  3. Run `/plugins`
  4. Search for `Caveman`
  5. Install the plugin
Install once. Works in all sessions after that.

Manual / Local

```bash
git clone https://github.com/JuliusBrussee/caveman.git
cd caveman
pip install -e .
```

Usage — Trigger Commands

Claude Code

```
/caveman          # enable default (full) caveman mode
/caveman lite     # professional brevity, grammar intact
/caveman full     # default — drop articles, use fragments
/caveman ultra    # maximum compression, telegraphic
```

Codex

```
$caveman
$caveman lite
$caveman full
$caveman ultra
```

Natural language triggers

Any of these phrases activate caveman mode:
  • "talk like caveman"
  • "caveman mode"
  • "less tokens please"
  • "be concise"
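
Conceptually this is simple substring matching on the user's message. A rough sketch under that assumption (`is_caveman_trigger` is an illustrative name, not the plugin's actual code):

```python
# Activation phrases listed in the bullets above
TRIGGERS = ("talk like caveman", "caveman mode", "less tokens please", "be concise")

def is_caveman_trigger(message: str) -> bool:
    """Return True if the message contains any activation phrase."""
    lowered = message.lower()
    return any(phrase in lowered for phrase in TRIGGERS)

print(is_caveman_trigger("Hey, less tokens please"))   # True
print(is_caveman_trigger("explain everything in detail"))  # False
```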

Disable

```
/caveman off
```

or say: "stop caveman" / "normal mode"

Level sticks until changed or session ends.

Intensity Levels

| Level | Trigger | Style | Example |
| --- | --- | --- | --- |
| Lite | `/caveman lite` | Drop filler, keep grammar | "Component re-renders because inline object prop creates new reference each cycle. Wrap in `useMemo`." |
| Full | `/caveman full` | Drop articles, use fragments | "New object ref each render. Inline prop = new ref = re-render. Wrap in `useMemo`." |
| Ultra | `/caveman ultra` | Telegraphic, abbreviate everything | "Inline obj prop → new ref → re-render. `useMemo`." |

Benchmark Results

Real token counts from the Claude API (reproducible via the `benchmarks/` directory):
| Task | Normal | Caveman | Saved |
| --- | --- | --- | --- |
| React re-render bug | 1180 | 159 | 87% |
| Auth middleware fix | 704 | 121 | 83% |
| PostgreSQL pool setup | 2347 | 380 | 84% |
| Git rebase vs merge | 702 | 292 | 58% |
| Async/await refactor | 387 | 301 | 22% |
| Docker multi-stage build | 1042 | 290 | 72% |
| Average | 1214 | 294 | 65% |
**Important:** Caveman only affects output tokens. Thinking/reasoning tokens are untouched. Caveman make mouth smaller, not brain.
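
The Saved column is just the relative reduction in output tokens; a quick sketch of the arithmetic using the per-task numbers above:

```python
# (task, normal_tokens, caveman_tokens) taken from the benchmark table
results = [
    ("React re-render bug", 1180, 159),
    ("Auth middleware fix", 704, 121),
    ("PostgreSQL pool setup", 2347, 380),
    ("Git rebase vs merge", 702, 292),
    ("Async/await refactor", 387, 301),
    ("Docker multi-stage build", 1042, 290),
]

for task, normal, caveman in results:
    saved = round((normal - caveman) / normal * 100)  # percent of output tokens saved
    print(f"{task}: {saved}% saved")
```

Note the code-heavy tasks (Async/await refactor, Docker build) save the least, since code blocks are emitted verbatim.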

Reproducing Benchmarks

```bash
git clone https://github.com/JuliusBrussee/caveman.git
cd caveman/benchmarks
```

Set your Anthropic API key

```bash
export ANTHROPIC_API_KEY=your_key_here
```

Run benchmark suite

```bash
python run_benchmarks.py
```

Compare normal vs caveman responses

```bash
python compare.py --task react-rerender
python compare.py --task auth-middleware
python compare.py --all
```

Code Examples — What Caveman Mode Changes

Before (normal, 69 tokens)

```
The reason your React component is re-rendering is likely because
you're creating a new object reference on each render cycle. When
you pass an inline object as a prop, React's shallow comparison
sees it as a different object every time, which triggers a
re-render. I'd recommend using useMemo to memoize the object.
```

After (caveman full, 19 tokens)

```
New object ref each render. Inline object prop = new ref = re-render.
Wrap in `useMemo`.
```

Code blocks stay normal — caveman not stupid


Caveman explains in grunt, but code stays clean:

"Token expiry check broken. Fix:"

```python
def verify_token(token: str) -> bool:
    payload = jwt.decode(token, SECRET_KEY, algorithms=["HS256"])
    # Was: payload["exp"] < time.time()
    # Fix:
    return payload["exp"] >= time.time()
```

What Caveman Preserves vs. Removes


Tokens caveman REMOVES (waste):

```python
filler_phrases = [
    "I'd be happy to help you with that",       # 8 tokens gone
    "The reason this is happening is because",  # 7 tokens gone
    "I would recommend that you consider",      # 7 tokens gone
    "Sure, let me take a look at that",         # 8 tokens gone
    "Great question!",                          # 2 tokens gone
    "Certainly!",                               # 1 token gone
]
```

Things caveman KEEPS (substance):

```python
preserved = [
    "code blocks",      # always normal
    "technical_terms",  # exact spelling preserved
    "error_messages",   # quoted verbatim
    "variable_names",   # exact
    "git_commits",      # normal prose
    "pr_descriptions",  # normal prose
]
```

Integration Pattern — Using in a Project

If you want caveman-style compression in your own Claude API calls:

```python
import anthropic

client = anthropic.Anthropic()  # uses ANTHROPIC_API_KEY env var

# Load the caveman SKILL.md as a system prompt addition
with open("path/to/caveman/SKILL.md", "r") as f:
    caveman_skill = f.read()

response = client.messages.create(
    model="claude-opus-4-5",
    max_tokens=1024,
    system=f"{caveman_skill}\n\nRespond in caveman mode: full intensity.",
    messages=[
        {"role": "user", "content": "Why is my React component re-rendering?"}
    ],
)
print(response.content[0].text)
# → "New object ref each render. Inline prop = new ref = re-render. useMemo fix."

print(f"Tokens used: {response.usage.output_tokens}")  # ~19 vs ~69
```

Session Workflow


Start session with caveman

```
/caveman full
```

Ask technical questions normally — agent responds in caveman

```
Why does my Docker build take so long?
→ "Layer cache miss. COPY before RUN npm install. Fix order:"
[code block shown normally]
```

Switch intensity mid-session

```
/caveman lite
```

Turn off for PR description writing

```
/caveman off
Write a PR description for this auth fix
→ [normal, professional prose]
```

Back to caveman

```
/caveman
```

Troubleshooting

**Caveman mode not activating:**

Verify plugin installed

```bash
claude plugin list | grep caveman
```

Reinstall

```bash
claude plugin remove caveman
claude plugin install caveman@caveman
```

**Savings lower than expected:**
- Caveman only compresses *output* tokens — input tokens unchanged
- Tasks with heavy code output (like Docker setup) see less savings since code is preserved verbatim
- Reasoning/thinking tokens not affected — savings show in visible response only
- Ultra mode gets maximum compression; switch if full mode feels verbose

**Need normal mode for specific output:**
```
/caveman off   # for PR descriptions, user-facing docs, formal reports
/caveman       # re-enable after
```

**Benchmarking your own tasks:**
```bash
cd benchmarks/
export ANTHROPIC_API_KEY=your_key_here
python run_benchmarks.py --task "your custom task description"
```


Why It Works

Backed by a March 2026 paper "Brevity Constraints Reverse Performance Hierarchies in Language Models": constraining large models to brief responses improved accuracy by 26 percentage points on certain benchmarks. Verbose not always better.
```
TOKENS SAVED          ████████ 65% avg (up to 87%)
TECHNICAL ACCURACY    ████████ 100%
RESPONSE SPEED        ████████ faster (less to generate)
READABILITY           ████████ better (no wall of text)
```

Key Files

```
caveman/
├── SKILL.md          # the skill definition loaded by Claude Code
├── benchmarks/
│   ├── run_benchmarks.py   # reproduce token count results
│   └── compare.py          # side-by-side comparison tool
├── plugin.json             # Codex plugin manifest
└── README.md
```

Links

One rock. That it. 🪨
