codebase-onboarding

Codebase Onboarding

Systematically analyze an unfamiliar codebase and produce a structured onboarding guide. Designed for developers joining a new project or setting up Claude Code in an existing repo for the first time.

When to Use

  • First time opening a project with Claude Code
  • Joining a new team or repository
  • User asks "help me understand this codebase"
  • User asks to generate a CLAUDE.md for a project
  • User says "onboard me" or "walk me through this repo"

How It Works

Phase 1: Reconnaissance

Gather raw signals about the project without reading every file. Run these checks in parallel:
1. Package manifest detection
   → package.json, go.mod, Cargo.toml, pyproject.toml, pom.xml, build.gradle,
     Gemfile, composer.json, mix.exs, pubspec.yaml

2. Framework fingerprinting
   → next.config.*, nuxt.config.*, angular.json, vite.config.*,
     django settings, flask app factory, fastapi main, rails config

3. Entry point identification
   → main.*, index.*, app.*, server.*, cmd/, src/main/

4. Directory structure snapshot
   → Top 2 levels of the directory tree, ignoring node_modules, vendor,
     .git, dist, build, __pycache__, .next

5. Config and tooling detection
   → .eslintrc*, .prettierrc*, tsconfig.json, Makefile, Dockerfile,
     docker-compose*, .github/workflows/, .env.example, CI configs

6. Test structure detection
   → tests/, test/, __tests__/, *_test.go, *.spec.ts, *.test.js,
     pytest.ini, jest.config.*, vitest.config.*
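
As a rough illustration of checks 1 and 4, the sketch below looks for manifest files in the project root and takes a two-level directory snapshot while skipping the ignored directories. The function names `find_manifests` and `snapshot_tree` are hypothetical, and the sketch assumes the manifests sit directly in the root; it is not how the skill itself is implemented.

```python
from pathlib import Path

# Directories the snapshot skips, taken from check 4 above.
IGNORED_DIRS = {"node_modules", "vendor", ".git", "dist", "build", "__pycache__", ".next"}

# Manifest file names from check 1 above.
MANIFESTS = [
    "package.json", "go.mod", "Cargo.toml", "pyproject.toml", "pom.xml",
    "build.gradle", "Gemfile", "composer.json", "mix.exs", "pubspec.yaml",
]


def find_manifests(root: Path) -> list[str]:
    """Return the manifest file names present directly under root."""
    return [name for name in MANIFESTS if (root / name).is_file()]


def snapshot_tree(root: Path, max_depth: int = 2) -> list[str]:
    """List paths up to max_depth levels below root, skipping ignored directories."""
    entries: list[str] = []

    def walk(directory: Path, depth: int) -> None:
        if depth > max_depth:
            return
        for child in sorted(directory.iterdir()):
            if child.name in IGNORED_DIRS:
                continue
            rel = child.relative_to(root).as_posix()
            # Mark directories with a trailing slash so they stand out in the snapshot.
            entries.append(rel + "/" if child.is_dir() else rel)
            if child.is_dir():
                walk(child, depth + 1)

    walk(root, 1)
    return entries
```

Calling `snapshot_tree(Path("."))` from a project root returns a flat, sorted list that can be pasted into the Directory Map step without opening any file contents.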

Phase 2: Architecture Mapping

From the reconnaissance data, identify:
Tech Stack
  • Language(s) and version constraints
  • Framework(s) and major libraries
  • Database(s) and ORMs
  • Build tools and bundlers
  • CI/CD platform
Architecture Pattern
  • Monolith, monorepo, microservices, or serverless
  • Frontend/backend split or full-stack
  • API style: REST, GraphQL, gRPC, tRPC
Key Directories
Map the top-level directories to their purpose:
<!-- Example for a React project — replace with detected directories -->
src/components/  → React UI components
src/api/         → API route handlers
src/lib/         → Shared utilities
src/db/          → Database models and migrations
tests/           → Test suites
scripts/         → Build and deployment scripts
Data Flow
Trace one request from entry to response:
  • Where does a request enter? (router, handler, controller)
  • How is it validated? (middleware, schemas, guards)
  • Where is business logic? (services, models, use cases)
  • How does it reach the database? (ORM, raw queries, repositories)
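
For a Node.js project, part of the Tech Stack and Architecture Pattern steps could be approximated from `package.json` alone. The sketch below is only an illustration: `KNOWN_PACKAGES` is a small hypothetical lookup table, `summarize_stack` and `is_monorepo` are made-up names, and treating a `workspaces` field as a monorepo signal is a heuristic that still needs checking against the actual code.

```python
import json

# Illustrative, non-exhaustive mapping from npm package names to stack layers.
# These entries are assumptions for the sketch, not a complete detection table.
KNOWN_PACKAGES = {
    "next": ("Framework", "Next.js"),
    "react": ("Framework", "React"),
    "express": ("Framework", "Express"),
    "prisma": ("ORM", "Prisma"),
    "@prisma/client": ("ORM", "Prisma"),
    "jest": ("Testing", "Jest"),
    "vitest": ("Testing", "Vitest"),
    "typescript": ("Language", "TypeScript"),
}


def summarize_stack(package_json_text: str) -> dict[str, list[str]]:
    """Group recognized dependencies by layer, keeping the declared version range."""
    manifest = json.loads(package_json_text)
    declared = {**manifest.get("dependencies", {}), **manifest.get("devDependencies", {})}
    stack: dict[str, list[str]] = {}
    for package, version in sorted(declared.items()):
        if package not in KNOWN_PACKAGES:
            continue  # Unrecognized packages are left out rather than guessed at.
        layer, label = KNOWN_PACKAGES[package]
        entry = f"{label} {version}"
        if entry not in stack.setdefault(layer, []):
            stack[layer].append(entry)
    return stack


def is_monorepo(package_json_text: str) -> bool:
    """Treat a 'workspaces' field in package.json as a monorepo signal."""
    return "workspaces" in json.loads(package_json_text)
```

Dependencies outside the lookup table are dropped on purpose, which matches the advice to highlight only the libraries that shape how code is written.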

Phase 3: Convention Detection

Identify patterns the codebase already follows:
Naming Conventions
  • File naming: kebab-case, camelCase, PascalCase, snake_case
  • Component/class naming patterns
  • Test file naming: *.test.ts, *.spec.ts, *_test.go
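
One possible way to detect the dominant file naming style is to classify each file stem with a regular expression and count the matches. This is a minimal sketch under stated assumptions: the patterns, `classify`, and `dominant_file_naming` are hypothetical, and single-word lowercase names such as `index` are skipped because they fit several styles at once.

```python
import re
from collections import Counter
from typing import Optional

# One pattern per naming style listed above. Single-word lowercase stems are
# ambiguous and intentionally match none of these.
STYLE_PATTERNS = {
    "kebab-case": re.compile(r"^[a-z0-9]+(-[a-z0-9]+)+$"),
    "snake_case": re.compile(r"^[a-z0-9]+(_[a-z0-9]+)+$"),
    "camelCase": re.compile(r"^[a-z][a-z0-9]*([A-Z][a-z0-9]*)+$"),
    "PascalCase": re.compile(r"^[A-Z][a-z0-9]+([A-Z][a-z0-9]*)*$"),
}


def classify(stem: str) -> Optional[str]:
    """Return the naming style a file stem follows, or None if it is ambiguous."""
    for style, pattern in STYLE_PATTERNS.items():
        if pattern.match(stem):
            return style
    return None


def dominant_file_naming(filenames: list[str]) -> Optional[str]:
    """Return the most common naming style, or None if nothing could be classified."""
    counts: Counter = Counter()
    for name in filenames:
        stem = name.split(".", 1)[0]  # Drop extensions such as .test.ts
        style = classify(stem)
        if style is not None:
            counts[style] += 1
    if not counts:
        return None
    return counts.most_common(1)[0][0]
```

Returning `None` instead of a default keeps the result honest when the sample is too small or too mixed to call.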
Code Patterns
  • Error handling style: try/catch, Result types, error codes
  • Dependency injection or direct imports
  • State management approach
  • Async patterns: callbacks, promises, async/await, channels
Git Conventions
  • Branch naming from recent branches
  • Commit message style from recent commits
  • PR workflow (squash, merge, rebase)
  • If the repo has no commits yet or only a shallow history (e.g. git clone --depth 1), skip this section and note "Git history unavailable or too shallow to detect conventions"
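
The commit-style check and its shallow-history fallback might look roughly like the sketch below. The names `recent_subjects` and `describe_commit_style`, the type list in the regular expression, and the `min_commits` and `threshold` defaults are all assumptions chosen for illustration; only the fallback message comes from the text above.

```python
import re
import subprocess

# Common Conventional Commits types. The list is an assumption for this sketch;
# a team may use other types.
CONVENTIONAL = re.compile(
    r"^(feat|fix|docs|style|refactor|perf|test|build|ci|chore|revert)(\([^)]+\))?!?: .+"
)

UNAVAILABLE = "Git history unavailable or too shallow to detect conventions"


def recent_subjects(repo_dir: str, limit: int = 50) -> list[str]:
    """Return recent commit subjects, or [] if the repo is shallow or has no commits."""

    def git(*args: str) -> subprocess.CompletedProcess:
        return subprocess.run(["git", "-C", repo_dir, *args], capture_output=True, text=True)

    shallow = git("rev-parse", "--is-shallow-repository")
    if shallow.returncode != 0 or shallow.stdout.strip() == "true":
        return []
    log = git("log", f"--max-count={limit}", "--format=%s")
    return log.stdout.splitlines() if log.returncode == 0 else []


def describe_commit_style(subjects: list[str], min_commits: int = 5, threshold: float = 0.8) -> str:
    """Report a commit style only when enough history clearly supports it."""
    if len(subjects) < min_commits:
        return UNAVAILABLE
    share = sum(bool(CONVENTIONAL.match(s)) for s in subjects) / len(subjects)
    if share >= threshold:
        return "Conventional Commits"
    return "Could not determine commit style"
```

A typical call would be `describe_commit_style(recent_subjects("."))`; keeping the parsing separate from the git call makes the classification easy to check on a plain list of subjects.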

Phase 4: Generate Onboarding Artifacts

Produce two outputs:

Output 1: Onboarding Guide

Onboarding Guide: [Project Name]

Overview

[2-3 sentences: what this project does and who it serves]

Tech Stack

<!-- Example for a Next.js project — replace with detected stack -->
| Layer | Technology | Version |
|-------|------------|---------|
| Language | TypeScript | 5.x |
| Framework | Next.js | 14.x |
| Database | PostgreSQL | 16 |
| ORM | Prisma | 5.x |
| Testing | Jest + Playwright | - |

Architecture

[Diagram or description of how components connect]

Key Entry Points

<!-- Example for a Next.js project — replace with detected paths -->
  • API routes: src/app/api/ (Next.js route handlers)
  • UI pages: src/app/(dashboard)/ (authenticated pages)
  • Database: prisma/schema.prisma (data model source of truth)
  • Config: next.config.ts (build and runtime config)

Directory Map

[Top-level directory → purpose mapping]

Request Lifecycle

[Trace one API request from entry to response]

Conventions

  • [File naming pattern]
  • [Error handling approach]
  • [Testing patterns]
  • [Git workflow]

Common Tasks

<!-- Example for a Node.js project — replace with detected commands -->
  • Run dev server: npm run dev
  • Run tests: npm test
  • Run linter: npm run lint
  • Database migrations: npx prisma migrate dev
  • Build for production: npm run build

Where to Look

<!-- Example for a Next.js project — replace with detected paths -->
| I want to... | Look at... |
|--------------|------------|
| Add an API endpoint | src/app/api/ |
| Add a UI page | src/app/(dashboard)/ |
| Add a database table | prisma/schema.prisma |
| Add a test | tests/ matching the source path |
| Change build config | next.config.ts |

Output 2: Starter CLAUDE.md

Generate or update a project-specific CLAUDE.md based on detected conventions. If CLAUDE.md already exists, read it first and enhance it: preserve existing project-specific instructions and clearly call out what was added or changed.

Project Instructions

Tech Stack

[Detected stack summary]

Code Style

  • [Detected naming conventions]
  • [Detected patterns to follow]

Testing

  • Run tests: [detected test command]
  • Test pattern: [detected test file convention]
  • Coverage: [if configured, the coverage command]

Build & Run

  • Dev: [detected dev command]
  • Build: [detected build command]
  • Lint: [detected lint command]

Project Structure

[Key directory → purpose map]

Conventions

  • [Commit style if detectable]
  • [PR workflow if detectable]
  • [Error handling patterns]
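
When a CLAUDE.md already exists, the enhance-without-replacing rule could be sketched as a section-level merge. The example below assumes sections are introduced by `## ` headings, which is an assumption about the file layout, and the helper names `split_sections` and `merge_claude_md` are hypothetical. Existing sections are left untouched, and the titles of added sections are returned so they can be called out to the user.

```python
def split_sections(markdown: str) -> dict[str, str]:
    """Map each '## ' heading to its body; text before the first heading is keyed by ''."""
    sections: dict[str, list[str]] = {"": []}
    current = ""
    for line in markdown.splitlines():
        if line.startswith("## "):
            current = line[3:].strip()
            sections.setdefault(current, [])
        else:
            sections[current].append(line)
    return {title: "\n".join(body).strip() for title, body in sections.items()}


def merge_claude_md(existing: str, detected: dict[str, str]) -> tuple[str, list[str]]:
    """Append detected sections that the existing file lacks; never overwrite existing ones.

    Returns the merged text and the list of section titles that were added.
    """
    present = split_sections(existing)
    added = [title for title in detected if title not in present]
    parts = [existing.rstrip("\n")] if existing.strip() else []
    parts += [f"## {title}\n{detected[title]}" for title in added]
    return "\n\n".join(parts) + "\n", added
```

A section that exists but looks outdated is deliberately left alone in this sketch; surfacing such a conflict to the user is safer than silently rewriting project-specific instructions.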

Best Practices

  1. Don't read everything — reconnaissance should use Glob and Grep, not Read on every file. Read selectively only for ambiguous signals.
  2. Verify, don't guess — if a framework is detected from config but the actual code uses something different, trust the code.
  3. Respect existing CLAUDE.md — if one already exists, enhance it rather than replacing it. Call out what's new vs existing.
  4. Stay concise — the onboarding guide should be scannable in 2 minutes. Details belong in the code, not the guide.
  5. Flag unknowns — if a convention can't be confidently detected, say so rather than guessing. "Could not determine test runner" is better than a wrong answer.

Anti-Patterns to Avoid

  • Generating a CLAUDE.md that's longer than 100 lines: keep it focused
  • Listing every dependency: highlight only the ones that shape how you write code
  • Describing obvious directory names: src/ doesn't need an explanation
  • Copying the README: the onboarding guide adds structural insight the README lacks

Examples

Example 1: First time in a new repo

User: "Onboard me to this codebase"
Action: Run full 4-phase workflow → produce Onboarding Guide + Starter CLAUDE.md
Output: Onboarding Guide printed directly to the conversation, plus a CLAUDE.md written to the project root

Example 2: Generate CLAUDE.md for existing project

User: "Generate a CLAUDE.md for this project"
Action: Run Phases 1-3, skip Onboarding Guide, produce only CLAUDE.md
Output: Project-specific CLAUDE.md with detected conventions

Example 3: Enhance existing CLAUDE.md

User: "Update the CLAUDE.md with current project conventions"
Action: Read existing CLAUDE.md, run Phases 1-3, merge new findings
Output: Updated CLAUDE.md with additions clearly marked