architecture-md
ARCHITECTURE.md Generator
Generate high-quality ARCHITECTURE.md files that give newcomers a mental map of a codebase.
Based on matklad's article: the biggest contributor bottleneck is not writing code, it's
figuring out where to change it. ARCHITECTURE.md bridges that gap.
Core Principles
- Short and stable -- Only describe things unlikely to change frequently. Don't synchronize with code. Revisit a couple of times a year.
- Bird's eye first -- Start with the problem being solved, not the solution.
- Codemap over prose -- Answer "where's the thing that does X?" and "what does this thing do?" for every module.
- Name, don't link -- Name important files, modules, types. Don't hyperlink (links go stale). Encourage symbol search.
- Invariants are gold -- Explicitly call out what's deliberately absent. Important invariants are often expressed as absence, and are impossible to divine from reading code.
- Mark boundaries -- API boundaries between layers constrain all possible implementations behind them. Finding a boundary by randomly reading code is hard.
- Cross-cutting concerns last -- After the codemap, address things that are everywhere and nowhere (error handling, testing, config).
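An invariant expressed as absence, such as "nothing here depends on the HTTP layer", can be checked mechanically as well as documented. The following is a minimal sketch, assuming a Python codebase; the forbidden package name `http_layer` and the sample module source are hypothetical and do not come from any real project.

```python
import ast


def imported_modules(source: str) -> set[str]:
    """Return the top-level names of every module imported by `source`."""
    found: set[str] = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            found.update(alias.name.split(".")[0] for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            # Absolute `from x.y import z`; relative imports are skipped.
            found.add(node.module.split(".")[0])
    return found


def violates_absence(source: str, forbidden: str) -> bool:
    """True if `source` imports the package that must stay absent."""
    return forbidden in imported_modules(source)


# Hypothetical core module that must never touch `http_layer`.
core_source = "import json\nfrom collections import OrderedDict\n"
print(violates_absence(core_source, "http_layer"))
```

A check like this does not replace the written invariant. The document states why the dependency is absent; the check only notices when it appears.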
Workflow
Step 1: Explore the Codebase
Use `tree`, glob, and read tools to understand the project:
- Read README, package.json/Cargo.toml/pyproject.toml for the project's purpose
- Run `tree -L 2 -d` (or similar) to see directory structure
- Identify entry points (main files, index files, bin directories)
- Read key files at module boundaries to understand the layers
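Where `tree` is not installed, the first two exploration steps can be approximated in a few lines. This is a rough sketch, not part of the original workflow: the function names are illustrative, and `README.md` is an assumed spelling of the README file.

```python
from pathlib import Path

# Manifest names from the list above; the README filename is an assumption.
MANIFESTS = ("README.md", "package.json", "Cargo.toml", "pyproject.toml")


def list_dirs(root: Path, max_depth: int = 2) -> list[str]:
    """Return non-hidden directories up to `max_depth` levels below `root`."""
    result = []
    for path in sorted(root.rglob("*")):
        rel = path.relative_to(root)
        hidden = any(part.startswith(".") for part in rel.parts)
        if path.is_dir() and len(rel.parts) <= max_depth and not hidden:
            result.append(rel.as_posix())
    return result


def find_manifests(root: Path) -> list[str]:
    """Return the manifest files present directly in `root`."""
    return [name for name in MANIFESTS if (root / name).is_file()]


if __name__ == "__main__":
    here = Path(".")
    print("manifests:", find_manifests(here))
    print("\n".join(list_dirs(here)))
```

`rglob` walks the whole tree before the depth filter applies, so on a large repository with vendored dependencies this is slower than `tree -L 2 -d`.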
Step 2: Identify the Architecture
Map out:
- The problem being solved -- What does this project do? What's the input/output?
- Coarse-grained modules -- What does each top-level directory/package do?
- Data flow -- How does data move through the system? Input -> ??? -> Output
- API boundaries -- Which modules are public interfaces vs internal implementation?
- Architectural invariants -- What rules are enforced by structure? What's deliberately absent?
- Cross-cutting concerns -- Error handling, testing strategy, configuration, observability
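The six items above can be held in a small structure while exploring, so that gaps are visible before writing starts. This is one possible shape, not a prescribed one; the class and field names are illustrative.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ModuleNote:
    path: str
    purpose: str
    key_types: list[str] = field(default_factory=list)
    boundary: Optional[str] = None   # set only if the module is an API boundary
    invariant: Optional[str] = None  # what is deliberately absent or enforced


@dataclass
class ArchitectureNotes:
    problem: str
    data_flow: list[str]
    modules: list[ModuleNote] = field(default_factory=list)
    cross_cutting: dict[str, str] = field(default_factory=dict)

    def missing(self) -> list[str]:
        """Name the items from the list above that are still empty."""
        gaps = []
        if not self.problem:
            gaps.append("problem")
        if not self.data_flow:
            gaps.append("data flow")
        if not self.modules:
            gaps.append("modules")
        if not any(m.boundary for m in self.modules):
            gaps.append("API boundaries")
        if not any(m.invariant for m in self.modules):
            gaps.append("invariants")
        if not self.cross_cutting:
            gaps.append("cross-cutting concerns")
        return gaps
```

A project may genuinely have no invariant worth recording for a given module; `missing()` only flags the case where none has been recorded anywhere.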
Step 3: Write the ARCHITECTURE.md
Follow the template below. Keep the total document under ~300 lines for most projects.
Template
```markdown
# Architecture

[One paragraph: what this project does at the highest level. What problem it solves.]

## Bird's Eye View

[How data flows through the system at the coarsest level.
Input -> Processing stages -> Output.
Keep this to 1-3 paragraphs.]

## Code Map

[Brief intro: "This section describes the high-level structure of the codebase.
Pay attention to **Boundary** and **Invariant** callouts."]

### `path/to/module-a/`

[What this module does in 1-3 sentences. Key types: `ImportantType`, `AnotherType`.]

**Boundary:** [If this is an API boundary, say so and what it means.]

**Invariant:** [What's deliberately absent or enforced. E.g., "This module never does I/O"
or "Nothing here depends on the HTTP layer."]

### `path/to/module-b/`

[Repeat for each significant module.]

### `path/to/module-c/`

[...]

## Cross-Cutting Concerns

### Error Handling

[How errors are handled across the codebase. Is it Result-based? Exceptions?
Do errors propagate or get caught at boundaries?]

### Testing

[Testing strategy. Where do tests live? What kinds of tests exist?
What are the important test boundaries?]

### [Other concerns as applicable]

[Configuration, observability/logging, code generation, build system, etc.
Only include sections that are genuinely cross-cutting.]
```
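A single Code Map entry from the template can be produced from structured notes. The sketch below is only illustrative: the function name and the example module `src/parser/` are hypothetical, and the heading level follows the `###` convention given under Style Rules.

```python
def render_module(path, purpose, key_types=(), boundary=None, invariant=None) -> str:
    """Render one Code Map entry as markdown text."""
    lines = [f"### `{path}`", ""]
    summary = purpose
    if key_types:
        # Types are named in backticks so readers can use symbol search.
        summary += " Key types: " + ", ".join(f"`{t}`" for t in key_types) + "."
    lines.append(summary)
    if boundary:
        lines += ["", f"**Boundary:** {boundary}"]
    if invariant:
        lines += ["", f"**Invariant:** {invariant}"]
    return "\n".join(lines)


# Hypothetical module used only to show the output shape.
print(render_module(
    "src/parser/",
    "Parses source text into a syntax tree.",
    key_types=["Parser", "SyntaxNode"],
    invariant="This module never does I/O.",
))
```

The Boundary and Invariant callouts are emitted only when there is something to say, which keeps entries for plain internal modules short.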
Rules
What to Include
- Directory/module purposes (1-3 sentences each)
- Names of important types, traits, interfaces, functions (for symbol search)
- API boundaries between layers
- Architectural invariants -- especially things that are deliberately absent
- Data flow at the system level
- Cross-cutting concerns that affect multiple modules
What to Omit
- Implementation details of how individual modules work (that's inline doc)
- Links to specific files or lines (they go stale)
- Anything that changes with routine PRs
- Exhaustive API documentation (that's rustdoc/typedoc/javadoc territory)
- Setup instructions (that's README)
- Contribution guidelines (that's CONTRIBUTING.md)
Style Rules
- Use "### `path/to/module/`" headers with backtick-quoted paths for the codemap
- Use **Boundary:** and **Invariant:** prefixed callouts (bold label, not blockquotes)
- Keep module descriptions to 1-3 sentences
- Name types in backticks: "Key types: `FooBar`, `BazQux`"
- Write in present tense, active voice
- Prefer concrete over abstract: "parses CLI arguments" not "handles input processing"
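Two of the style rules above are mechanical enough to lint. This is a rough sketch with illustrative names; it checks only the header format and the no-blockquote rule, and it assumes every `###` heading in the text passed in is a Code Map entry.

```python
import re

# A codemap header is "### " followed by one backtick-quoted path.
HEADER = re.compile(r"^### `[^`]+`$")
# A callout written as a blockquote, with or without the bold markers.
BLOCKQUOTE_CALLOUT = re.compile(r"^>\s*\**(Boundary|Invariant)")


def style_problems(code_map: str) -> list[str]:
    """Return one message per line that breaks a checked style rule."""
    problems = []
    for number, line in enumerate(code_map.splitlines(), start=1):
        if line.startswith("### ") and not HEADER.match(line):
            problems.append(f"line {number}: header is not a backtick-quoted path")
        if BLOCKQUOTE_CALLOUT.match(line):
            problems.append(f"line {number}: callout uses a blockquote")
    return problems
```

Rules about tense, voice, and concreteness still need a human reader.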
Quality Checklist
Before finishing, verify:
- Can a newcomer find "the thing that does X" using only this doc?
- Are API boundaries clearly marked?
- Are architectural invariants (especially absences) called out?
- Is every section stable enough to survive 6 months without update?
- Are important types/modules named (not linked)?
- Is there a bird's eye view before the codemap?
- Are cross-cutting concerns addressed?
- Does the codemap order match the data flow or dependency direction?
- Is it under ~300 lines? (Shorter = more likely to be read and maintained)
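Some of the checklist items above can be verified automatically; the rest (whether a newcomer can actually find things, whether sections will stay stable) cannot. The following is a minimal sketch covering four of them, assuming the section titles from the template; the function name and result keys are illustrative.

```python
import re

# Inline markdown link: [text](target)
MARKDOWN_LINK = re.compile(r"\[[^\]]+\]\([^)]+\)")


def checklist(doc: str, max_lines: int = 300) -> dict[str, bool]:
    """Run the mechanical subset of the quality checklist on `doc`."""
    lines = doc.splitlines()
    headings = [line.lstrip("# ").strip().lower() for line in lines if line.startswith("#")]

    def position(title: str) -> int:
        return headings.index(title) if title in headings else -1

    birds_eye = position("bird's eye view")
    code_map = position("code map")
    return {
        "under line limit": len(lines) <= max_lines,
        "no hyperlinks": not MARKDOWN_LINK.search(doc),
        "bird's eye view before code map": 0 <= birds_eye < code_map,
        "cross-cutting concerns present": "cross-cutting concerns" in headings,
    }
```

Lines starting with `#` inside fenced code blocks are also counted as headings, which is acceptable for a quick check but would need handling in a stricter tool.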
Reference Example
See references/example.md for a complete example ARCHITECTURE.md
for a hypothetical TypeScript project, demonstrating all the patterns above.