codebase-packager
TLDR Expert
Overview
Achieves high-fidelity codebase comprehension at a fraction of the token cost through semantic layers, structured digests, and advanced context packaging. Combines Repomix for context packing, Gitingest for repository digests, and llm-tldr for graph-based code analysis.
When to use: Reducing prompt overhead for large codebases, onboarding to unfamiliar repositories, mapping cross-file dependencies, creating AI-optimized context bundles.
When NOT to use: Small single-file tasks, final implementation debugging (read the full file), real-time code editing.
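To make the "fraction of the token cost" idea concrete, here is a minimal sketch of a signature-only view of a source file. It uses Python's standard `ast` module (Python 3.9+) as a stand-in for Tree-sitter, so it illustrates the concept rather than how Repomix implements compression. The `extract_signatures` helper and the `SAMPLE` source are hypothetical names introduced for this example.

```python
import ast

# Hypothetical sample source used only for illustration.
SAMPLE = '''
class Cache:
    def get(self, key):
        return self._data.get(key)

def load(path: str) -> str:
    with open(path) as handle:
        return handle.read()
'''

def extract_signatures(source: str) -> list[str]:
    """Return one-line class and function signatures in source order."""
    found = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            keyword = "async def" if isinstance(node, ast.AsyncFunctionDef) else "def"
            signature = f"{keyword} {node.name}({ast.unparse(node.args)})"
            if node.returns is not None:
                signature += f" -> {ast.unparse(node.returns)}"
            found.append((node.lineno, signature))
        elif isinstance(node, ast.ClassDef):
            found.append((node.lineno, f"class {node.name}"))
    # ast.walk is breadth-first, so sort by line number to restore source order.
    return [signature for _, signature in sorted(found)]

signatures = extract_signatures(SAMPLE)
print("\n".join(signatures))
```

The signature view drops every function body, which is where most tokens live. It is enough for mapping a module but, as noted above, not for final debugging.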
Quick Reference
| Pattern | Tool / Command | Key Points |
|---|---|---|
| Context packing | | Package subdirectories into AI-optimized bundles |
| Signatures only | | Compression extracts signatures via Tree-sitter |
| Repository digest | | Prompt-friendly summary for quick onboarding |
| Dependency context | | LLM-ready context for a function with 95% token savings |
| Caller tracing | | Reverse call graph to assess change blast radius |
| Forward call graph | | Build forward call graph across the project |
| Semantic search | | Find logic by meaning when naming is inconsistent |
| Architecture audit | | Detect circular deps, layer violations, dead code |
| Dead code finder | | Find unreachable functions with zero callers |
| File extraction | | Extract AST (functions, classes, imports) from a file |
| Secret scanning | Repomix built-in secretlint | Ensure context bundles contain no keys or PII |
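The caller-tracing and dead-code rows both rest on a reverse call graph. The sketch below shows the idea on a toy graph. It is not llm-tldr's implementation, and the `CALLS` data and the helpers `reverse_graph`, `blast_radius`, and `dead_functions` are hypothetical names introduced for illustration.

```python
from collections import deque

# Hypothetical forward call graph: caller -> functions it calls.
CALLS = {
    "main": ["load_config", "run"],
    "run": ["parse", "save"],
    "parse": [],
    "save": ["serialize"],
    "serialize": [],
    "load_config": ["parse"],
    "legacy_export": ["serialize"],
}

def reverse_graph(calls):
    """Invert the graph: callee -> set of direct callers."""
    callers = {name: set() for name in calls}
    for caller, callees in calls.items():
        for callee in callees:
            callers.setdefault(callee, set()).add(caller)
    return callers

def blast_radius(calls, target):
    """Every function that calls `target` directly or transitively."""
    callers = reverse_graph(calls)
    seen, queue = set(), deque([target])
    while queue:
        current = queue.popleft()
        for caller in callers.get(current, ()):
            if caller not in seen:
                seen.add(caller)
                queue.append(caller)
    return seen

def dead_functions(calls, entry_points):
    """Functions with zero callers that are not declared entry points."""
    callers = reverse_graph(calls)
    return sorted(
        name for name, who in callers.items()
        if not who and name not in entry_points
    )

print(sorted(blast_radius(CALLS, "parse")))
print(dead_functions(CALLS, {"main"}))
```

A change to `parse` touches every function in its blast radius, so those are the callers worth pulling into the context bundle before editing.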
Common Mistakes
| Mistake | Correct Pattern |
|---|---|
| Reading entire large files without checking structure first | Run |
| Using | Use |
| Packing | Configure Repomix ignore-list to exclude generated and vendor directories |
| Assuming semantic search results are exhaustive | Verify top matches against actual source and cross-reference with |
| Running Repomix without compression on large directories | Use |
| Including irrelevant context that dilutes signal quality | Follow top-down priority: index, signatures, core logic, then adjacent context |
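The top-down priority in the last row can be sketched as a small packing routine that fills a token budget layer by layer. This is a minimal illustration under stated assumptions: the `Section` type, the layer names, the `pack` helper, and the four-characters-per-token estimate are all hypothetical and not part of any tool mentioned here.

```python
from dataclasses import dataclass

# Layers in descending priority, following the order given in the table.
PRIORITY = ["index", "signatures", "core_logic", "adjacent"]

@dataclass
class Section:
    layer: str   # one of PRIORITY
    name: str
    text: str

def estimate_tokens(text: str) -> int:
    # Crude heuristic (assumption): roughly four characters per token.
    return max(1, len(text) // 4)

def pack(sections, budget):
    """Select sections in priority order without exceeding the token budget."""
    ordered = sorted(sections, key=lambda s: PRIORITY.index(s.layer))
    chosen, used = [], 0
    for section in ordered:
        cost = estimate_tokens(section.text)
        if used + cost > budget:
            continue  # skip what does not fit; a smaller later section still may
        chosen.append(section)
        used += cost
    return chosen

sections = [
    Section("adjacent", "utils", "x" * 400),
    Section("index", "tree", "x" * 40),
    Section("core_logic", "handler", "x" * 200),
    Section("signatures", "api", "x" * 80),
]
selected = pack(sections, budget=90)
print([s.name for s in selected])
```

With a budget of 90 estimated tokens, the index, signatures, and core logic fit, while the large adjacent-context section is left out rather than diluting the bundle.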
Delegation
- Repository structure discovery: Use the Explore agent to map directory layout and identify key modules before building context bundles
- Multi-step context packing workflow: Use the Task agent to run Gitingest digest, Repomix compression, and llm-tldr indexing in sequence
- Architecture analysis and planning: Use the Plan agent to design context engineering strategy for large monorepos
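The multi-step workflow above (digest, then compression, then indexing) can be sketched as a sequential runner that stops at the first failing step. The command lines in `STEPS` are assumptions written from memory of each tool's CLI and should be checked against its `--help` output before use. `run_pipeline` is a hypothetical helper, and it defaults to a dry run so nothing executes until the commands are confirmed.

```python
import shlex
import subprocess

# ASSUMED command lines: verify flags and subcommands with each tool's --help.
STEPS = [
    ("digest", "gitingest . -o digest.txt"),
    ("pack", "repomix --compress --output packed.xml"),
    ("index", "tldr warm ."),
]

def run_pipeline(steps, dry_run=True):
    """Run each step in order; in dry-run mode only report what would run."""
    planned = []
    for name, command in steps:
        argv = shlex.split(command)
        if not dry_run:
            # check=True raises CalledProcessError, halting later steps.
            subprocess.run(argv, check=True)
        planned.append((name, argv))
    return planned

for name, argv in run_pipeline(STEPS):
    print(f"{name}: {' '.join(argv)}")
```

Running the steps strictly in sequence means a failed digest or pack step stops the pipeline before indexing starts.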
References
- Context Engineering Patterns -- packing strategies, XML tagging, signal-to-noise optimization, warm-up prompts
- Repomix and Gitingest Mastery -- configuration, compression mode, digest generation, Tree-sitter extraction
- Semantic Graph Analysis -- llm-tldr CLI tools, impact analysis, semantic search, architectural audits