codebase-recon
Codebase Recon
Analyze git history to understand a codebase before reading any code. Reveals project health, risk areas, team structure, and development momentum.
Inspired by "The Git Commands I Run Before Reading Any Code" by Ally Piechowski (GitHub: grepsedawk): https://piechowski.io/post/git-commands-before-reading-code/
Phase 1: Probe
Before running analysis, determine repo scale to calibrate time windows and result counts.
Run this single shell command to collect repo vitals:
```sh
# Collect commit count, first and latest commit dates, and branch count in one pass
echo "COMMITS=$(git rev-list --count HEAD)" && \
echo "FIRST_COMMIT=$(git log --reverse --format='%ad' --date=short | head -1)" && \
echo "LATEST_COMMIT=$(git log --format='%ad' --date=short | head -1)" && \
echo "BRANCHES=$(git branch -a | wc -l | tr -d ' ')"
```
Use the commit count to set parameters for Phase 2:
| Repo Size | Commits | `WINDOW` | `N` |
|---|---|---|---|
| Small | <500 | (omit --since) | 10 |
| Medium | 500-10k | | 20 |
| Large | >10k | | 30 |
Print the Repo Vitals line immediately:
Repo Vitals: Age: [FIRST_COMMIT to LATEST_COMMIT] | Commits: [COMMITS] | Branches: [BRANCHES] | Analysis window: [WINDOW or "all time"]
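The size table can be applied mechanically once the commit count is known. The following is a minimal sketch, assuming a hypothetical helper named `recon_params`. The `WINDOW` values for medium and large repos are not recoverable from this document, so the `1 year ago` and `6 months ago` strings below are placeholder assumptions, not the author's values.

```sh
# recon_params COMMITS
# Prints "SIZE|WINDOW|N" for a commit count, following the size table.
# Hypothetical helper; the medium/large WINDOW strings are assumed placeholders.
recon_params() {
  commits=$1
  if [ "$commits" -lt 500 ]; then
    echo "small||10"                 # empty WINDOW: omit --since entirely
  elif [ "$commits" -le 10000 ]; then
    echo "medium|1 year ago|20"      # WINDOW value is an assumption
  else
    echo "large|6 months ago|30"     # WINDOW value is an assumption
  fi
}

# Usage (inside a git repository):
#   recon_params "$(git rev-list --count HEAD)"
```

Keeping the thresholds in one function makes it easy to swap in the real windows without touching the Phase 2 commands.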
Phase 2: Parallel Analysis
阶段2:并行分析
Run all 7 commands in parallel (they are independent). Substitute `WINDOW` and `N` from Phase 1. For small repos, omit the `--since` flags entirely.
2a. Code Hotspots
Most-changed files in the analysis window:
```sh
git log --format=format: --name-only --since="WINDOW" | sort | uniq -c | sort -nr | head -N
```
2b. Bus Factor
All-time contributor ranking by commit count:
```sh
# HEAD is explicit so shortlog does not read from stdin when run non-interactively
git shortlog -sn --no-merges HEAD
```
2c. Bug Magnets
Files most associated with bug-fix commits:
```sh
git log -i -E --grep="fix|bug|broken" --name-only --format='' --since="WINDOW" | sort | uniq -c | sort -nr | head -N
```
2d. Team Momentum
Commit frequency by month (all time):
```sh
git log --format='%ad' --date=format:'%Y-%m' | sort | uniq -c
```
2e. Firefighting Frequency
Emergency/revert commits in the analysis window:
```sh
git log --oneline --since="WINDOW" | grep -iE 'revert|hotfix|emergency|rollback'
```
2f. Recently Added Files
New files added in the analysis window:
```sh
git log --diff-filter=A --since="WINDOW" --name-only --format='' | sort | uniq -c | sort -nr | head -N
```
2g. Active vs Total Contributors
Count of contributors active in the last 3 months (fixed window — measures "who's here now"):
```sh
# HEAD is explicit so shortlog does not read from stdin when run non-interactively
git shortlog -sn --no-merges --since="3 months ago" HEAD | wc -l
```
Compare this count against the total from 2b.
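Because the seven commands are independent, a shell can start each one as a background job and wait for all of them to finish. The sketch below shows one way to do that; `run_parallel`, the output directory, and the job names are hypothetical and not part of the original workflow.

```sh
# run_parallel OUTDIR NAME CMD [NAME CMD ...]
# Runs each CMD in the background, writing its output to OUTDIR/NAME.txt,
# then blocks until every job has finished. Hypothetical helper.
run_parallel() {
  outdir=$1
  shift
  mkdir -p "$outdir" || return 1
  while [ "$#" -ge 2 ]; do
    name=$1
    cmd=$2
    shift 2
    sh -c "$cmd" > "$outdir/$name.txt" 2>&1 &
  done
  wait   # with no operands, waits for all background jobs of this shell
}

# Usage (inside a git repository, with WINDOW and N already substituted):
#   run_parallel /tmp/recon \
#     busfactor "git shortlog -sn --no-merges HEAD" \
#     momentum  "git log --format='%ad' --date=format:'%Y-%m' | sort | uniq -c" \
#     firefight "git log --oneline --since='1 year ago' | grep -iE 'revert|hotfix|emergency|rollback'"
```

Writing each result to its own file keeps the outputs from interleaving and makes them available for the cross-referencing step.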
Cross-Referencing
After collecting all Phase 2 results, perform these cross-references before presenting the report:
- High-Risk Files: Intersect code hotspots (2a) with bug magnets (2c). Files appearing in both lists are highest-risk.
- Risk Ownership: For each high-risk file, run `git shortlog -sn HEAD -- <file>` to identify the primary owner.
- Bus Factor Risk: If active contributors (2g) are fewer than 30% of total contributors (2b), flag this as a bus factor concern.
- Momentum Trend: Analyze the monthly commit counts (2d):
  - Compare the average of the last 3 months to the average of the 3 months before that.
  - Rising: last 3 months average > prior 3 months average by 20%+
  - Declining: last 3 months average < prior 3 months average by 20%+
  - Erratic: month-over-month variance exceeds 50%
  - Stable: otherwise
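The cross-references above can be sketched as three small shell helpers. All function names are hypothetical. Two interpretations are my assumptions rather than rules stated here: the trend checks run in the listed order (rising, declining, erratic, stable), and "erratic" means any of the last six months changed by more than 50% relative to the month before it. With fewer than six months of data the sketch prints `insufficient-data`, also an assumption.

```sh
# high_risk_files HOTSPOTS BUGS
# Both files are ranked `uniq -c` lists ("count path"), as produced by 2a and 2c.
# Prints "path<TAB>hotspot_rank<TAB>bug_rank" for paths present in both lists.
high_risk_files() {
  awk '
    FNR == 1 { file++; rank = 0 }
    { path = $0; sub(/^[ \t]*[0-9]+[ \t]+/, "", path) }   # drop the count column
    path == "" { next }                                   # skip blank-name rows
    { rank++ }
    file == 1 { hot[path] = rank; next }
    path in hot { printf "%s\t%d\t%d\n", path, hot[path], rank }
  ' "$1" "$2"
}

# bus_factor_flag ACTIVE TOTAL
# Prints "warn" when active contributors are below 30% of the total, else "ok".
bus_factor_flag() {
  if [ $(( $1 * 100 )) -lt $(( $2 * 30 )) ]; then echo "warn"; else echo "ok"; fi
}

# momentum_trend: reads the 2d output ("count YYYY-MM", oldest first) on stdin.
# Months with zero commits do not appear in that output, so "last 6 months"
# here really means the last 6 rows. Integer math avoids rounding at the 20% edge.
momentum_trend() {
  awk '
    { count[NR] = $1 }
    END {
      if (NR < 6) { print "insufficient-data"; exit }
      recent = count[NR] + count[NR-1] + count[NR-2]
      prior  = count[NR-3] + count[NR-4] + count[NR-5]
      if (recent * 10 >= prior * 12) { print "rising"; exit }
      if (recent * 10 <= prior * 8)  { print "declining"; exit }
      for (i = NR - 4; i <= NR; i++) {
        diff = count[i] - count[i-1]
        if (diff < 0) diff = -diff
        if (diff * 2 > count[i-1]) { print "erratic"; exit }
      }
      print "stable"
    }'
}
```

Comparing three-month sums is equivalent to comparing three-month averages, since both sides are divided by the same number.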
Report Template
Present the report in the terminal using this structure:
═══ Codebase Recon Report ═══
Repo Vitals
Age: [first commit] to [latest commit] | Commits: N | Branches: N | Analysis window: WINDOW
1. Code Hotspots (most-changed files)
[ranked list: count filepath]
2. Bug Magnets (files with fix/bug/broken commits)
[ranked list: count filepath]
3. High-Risk Files (appear in BOTH hotspots AND bug magnets)
[list with: filepath — hotspot rank #X, bug magnet rank #Y, primary owner: NAME]
If none overlap, state: "No files appear in both lists — good sign."
4. Bus Factor
[top 10 contributors: count name]
Active (last 3 months): X of Y total contributors
[If active < 30% of total: "Warning: low active contributor ratio — knowledge concentration risk"]
5. Team Momentum
[monthly commit counts, most recent 12 months or all if fewer]
Trend: [rising / stable / declining / erratic]
6. Firefighting Frequency
[list of revert/hotfix/emergency commits, or "None found"]
Rate: N emergency commits out of M total in window (X%)
7. Recently Added Files
[ranked list: count filepath]
8. Recommendations
- Start reading: [top 3 high-risk files, or top 3 hotspots if no high-risk files]
- Talk to: [primary owner of the #1 high-risk or hotspot file]
- Watch out: [any trend warnings: declining momentum, low bus factor, high firefighting rate]
After printing the report, ask:
"Want me to save this report to a markdown file? (e.g., `docs/codebase-recon-report.md`)"
If yes, write the same content as a markdown file. Do not commit — let the user decide.
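The "Rate" line in section 6 of the template needs both the emergency count and the total commit count for the window. One way to get both in a single pass is sketched below; `firefight_rate` is a hypothetical helper, and the `n/a` output for an empty window is my assumption.

```sh
# firefight_rate: reads `git log --oneline` output on stdin and prints
# "N of M (X%)", where N counts lines matching the 2e keywords (case-insensitive)
# and M is the total number of commits read.
firefight_rate() {
  awk '
    tolower($0) ~ /revert|hotfix|emergency|rollback/ { n++ }
    END {
      if (NR == 0) { print "0 of 0 (n/a)"; exit }
      printf "%d of %d (%.1f%%)\n", n, NR, 100 * n / NR
    }'
}

# Usage (inside a git repository, with WINDOW substituted):
#   git log --oneline --since="WINDOW" | firefight_rate
```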