literate-programming


Literate Programming Skill


CRITICAL: This skill MUST be activated BEFORE making any changes to .nw files!
You are an expert in literate programming using the noweb system.

Reference Files


This skill includes detailed references in references/:
  • noweb-commands.md: Tangling, weaving, flags, troubleshooting (search: notangle, noweave, -R, -L)
  • testing-patterns.md: Test organization, placement, dependency testing (search: test functions, pytest, after implementation)
  • git-workflow.md: Version control, .gitignore, pre-commit (search: git, commit, generated files)
  • multi-directory-projects.md: Large project organization, makefiles (search: src/, doc/, tests/, MODULES)
  • project-initialization.md: New project setup, templates, checklist (search: new project, initialize, pyproject.toml)
  • preamble.tex: Standard LaTeX preamble for documentation (search: \usepackage, memoir)

When to Use This Skill


Correct Workflow


  1. User asks to modify a .nw file
  2. YOU ACTIVATE THIS SKILL IMMEDIATELY
  3. You plan the changes with literate programming principles
  4. You make the changes following the principles
  5. You regenerate code with make/notangle

Anti-pattern (NEVER do this)


  1. User asks to modify a .nw file
  2. You directly edit the .nw file ← WRONG
  3. Later review finds literate quality problems
  4. You have to redo everything

Remember


  • .nw files are NOT regular source code files
  • They combine documentation and code for human readers
  • Literate quality is AS IMPORTANT as code correctness
  • Bad literate quality = failed task, even if code works

Planning Changes


When making changes to a .nw file:
  1. Read the existing file to understand structure and narrative
  2. Plan with literate programming in mind:
    • What is the "why" behind this change?
    • How does this fit into the existing narrative?
    • What new chunks are needed? What are their meaningful names?
    • Where in the pedagogical order should this be explained?
  3. Design documentation BEFORE writing code:
    • Write prose explaining the problem and solution
    • Use subsections to structure complex explanations
  4. Decompose code into well-named chunks:
    • Each chunk = one coherent concept
    • Names describe purpose, not syntax (like pseudocode)
  5. Write the code chunks
  6. Regenerate and test
Key principle: If you find yourself writing code comments to explain logic, that explanation belongs in the documentation chunks instead.
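To illustrate the key principle (a sketch; the function, chunk, and comment are hypothetical), compare a comment-laden chunk with its literate rewrite:

```noweb
BAD — the explanation is trapped in a code comment:

<<functions>>=
def cache_key(url):
    # strip the query string: mirrors differ only in tracking params
    return url.split("?")[0]
@

GOOD — the explanation moves into the documentation chunk:

Mirrors of the same resource differ only in their tracking
parameters, so we strip the query string before using the URL
as a cache key.

<<functions>>=
def cache_key(url):
    return url.split("?")[0]
@
```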

Reviewing Literate Programs


When reviewing, evaluate:
  1. Narrative flow: Coherent story? Pedagogical order?
  2. Variation theory: Contrasts used? "Whole, parts, whole" structure?
  3. Chunk quality: Meaningful names? Focused on single concepts?
  4. Explanation quality: Explains "why" not just "what"? Red flags: prose that begins "We [verb] the [noun]" matching a function name; prose that describes parameter types visible in the signature; prose that restates conditionals without explaining why they matter.
  5. Test organization: Tests after implementation, not before?
  6. Proper noweb syntax: [[code]] notation? Valid chunk references?

Core Philosophy


Literate programming (Knuth) has two goals:
  1. Explain to human beings what we want a computer to do
  2. Present concepts in order best for human understanding (psychological order, not compiler order)

Variation Theory


Apply the variation-theory skill when structuring explanations:
  • Contrast: Show what something IS vs what it is NOT
  • Separation: Start with whole (module outline), then parts (chunks)
  • Generalization: Show pattern across different contexts
  • Fusion: Integrate parts back into coherent whole
CRITICAL: Show concrete examples FIRST, then state general principles. Readers cannot discern a pattern without first experiencing variation.
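As a sketch of separation ("whole, then parts") with hypothetical chunk and file names: present the module outline first, then let later sections develop each part, and let noweb reassemble them:

```noweb
The whole module, at a glance:

<<[[stats.py]]>>=
<<imports>>
<<constants>>
<<functions>>
@

Later sections each develop one part, for example:

<<functions>>=
def mean(xs):
    return sum(xs) / len(xs)
@
```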

Noweb File Format


Documentation Chunks


  • Begin with @ followed by a space or newline
  • Contain explanatory text (LaTeX, Markdown, etc.)
  • Copied verbatim by noweave

Code Chunks


  • Begin with <<chunk name>>= on a line by itself (column 1)
  • End when another chunk begins or at end of file
  • Reference other chunks using <<chunk name>>
  • Multiple chunks with the same name are concatenated
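Putting the two chunk types together, a minimal .nw file (contents hypothetical) looks like this:

```noweb
@ This is a documentation chunk: noweave copies it verbatim.
The program below defines [[greet]] and calls it once.

<<[[hello.py]]>>=
<<define greet>>
greet("world")
@

@ The definition lives in its own named chunk, referenced above.

<<define greet>>=
def greet(name):
    print(f"Hello, {name}!")
@
```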

Syntax Rules


  • Quote code in documentation using [[code]] (escapes LaTeX special chars)
  • Escape: @<< for a literal <<; @@ in column 1 for a literal @
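For example (a hedged sketch), a documentation chunk that discusses the chunk syntax itself must use these escapes:

```noweb
@ To get a literal << in the woven output, write @<< in the
source.  Inline code such as [[module_name]] is quoted with
double brackets, which escapes LaTeX specials like underscores.
```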

Writing Guidelines


  1. Start with the human story - problem, approach, design decisions
  2. Introduce concepts in pedagogical order - not compiler order
  3. Use meaningful chunk names - 2-5 word summary of purpose (like pseudocode)
  4. Reference variables in chunk names - when a chunk operates on a specific variable, use [[variable]] notation in the chunk name to make the connection explicit (e.g., <<add graders to [[graders]] list>>)
  5. Decompose by concept, not syntax
  6. Explain the "why" - don't just describe what the code does. Prose that merely restates the code in English teaches nothing. Good prose explains why a design choice was made: what alternative was rejected, what would break without this approach, or what constraint drives the implementation.
    Self-test: If your prose could be mechanically generated from the function signature, it's "what" not "why." Ask yourself: What design decision does this paragraph justify? What alternative did we reject and why? If the paragraph doesn't answer either question, rewrite it.
    BAD — prose restates code in English:
    noweb
    \subsection{Counting $n$-grams}
    
    We count overlapping $n$-grams.
    If $n$ is larger than the input, the result is empty.
    
    <<functions>>=
    def ngram_counts(text, *, n):
        ...
    @
    GOOD — prose explains why this design choice:
    noweb
    \subsection{Counting $n$-grams}
    
    We use overlapping $n$-grams because they capture all positional
    contexts---in \enquote{THE}, overlapping bigrams yield TH and HE,
    whereas non-overlapping would only yield TH.  This matches the
    standard definition used in cryptanalysis.
    
    <<functions>>=
    def ngram_counts(text, *, n):
        ...
    @
    Red flags that prose is "what" not "why":
    • Begins "We [verb] the [noun]" where the verb matches a function name
    • Describes parameter types or return values already in the signature
    • Restates conditional logic ("If X, we do Y") without explaining why X matters
  7. Keep chunks focused — one function per <<functions>>= chunk with prose before it. Each function (or small group of tightly related functions) gets its own <<functions>>= chunk preceded by explanatory prose. Never put multiple unrelated functions in a single chunk.
    BAD — four functions crammed into one chunk with minimal prose:
    noweb
    \subsection{Helper Functions}
    
    We provide several utility functions.
    
    <<functions>>=
    def normalize_text(text): ...
    
    def letters_only(text): ...
    
    def key_shifts(key): ...
    
    def index_of_coincidence(text): ...
    @
    GOOD — each function with its own subsection and prose:
    noweb
    \subsection{Text Normalization}
    
    Before analysis, we strip non-alphabetic characters and
    convert to lowercase so that frequency counts are meaningful.
    
    <<functions>>=
    def normalize_text(text): ...
    @
    
    \subsection{Index of Coincidence}
    
    The index of coincidence measures how likely two randomly
    chosen letters from a text are identical ...
    
    <<functions>>=
    def index_of_coincidence(text): ...
    @
  8. Decompose long functions into named sub-chunks — If a function has more than ~25 lines and contains two or more distinct algorithmic phases, decompose it into named sub-chunks. Each sub-chunk name should read like a step in an algorithm description. The prose before each sub-chunk explains why that phase works the way it does. This is the classic Knuth technique.
    BAD — 80-line function with one line of prose:
    noweb
    We generate plaintext by concatenating sentences.
    
    <<functions>>=
    def generate_plaintext(size, *, sources, seed=None):
        """..."""
        if size <= 0:
            raise ValueError(...)
        paragraphs = extract_paragraphs(sources, ...)
        ...  # 75 more lines
        return normalize(prefix, options)
    @
    GOOD — function body decomposed into named sub-chunks with prose:
    noweb
    <<functions>>=
    def generate_plaintext(size, *, sources, seed=None):
        """..."""
        <<prepare filtered paragraphs>>
        <<pick random starting point>>
        <<collect sentences until target length>>
        <<select closest sentence boundary>>
    @
    
    We extract paragraphs from the corpus, removing headings and ToC
    entries.  Paragraphs lacking sentence-ending punctuation are
    discarded---they are typically list items or table rows.
    
    <<prepare filtered paragraphs>>=
    if size <= 0:
        raise ValueError("size must be positive")
    ...
    @
    
    To avoid always starting at the beginning of the corpus, we
    rotate to a random paragraph.
    
    <<pick random starting point>>=
    rng = random.Random(seed)
    ...
    @
  9. Use bucket chunks — distribute <<constants>>= near their relevant code. Define each constant in the section where it is conceptually relevant. Never group all constants into a single \subsection{Constants}.
    BAD — all constants dumped in one subsection:
    noweb
    \subsection{Constants}
    
    <<constants>>=
    DATA_DIR = ...        % used in loading section
    GUTENBERG_START = ... % used in extraction section
    SENTENCE_RE = ...     % used in sentence splitting section
    KEEP_PUNCT = ...      % used in normalization section
    @
    GOOD — each constant near the code that uses it:
    noweb
    \subsection{Loading Texts}
    
    <<constants>>=
    DATA_DIR = Path(__file__).parent / "data"
    @
    
    <<functions>>=
    def load_text(path): ...
    @
    
    \subsection{Extracting Body Text}
    
    <<constants>>=
    GUTENBERG_START = "*** START OF"
    GUTENBERG_END = "*** END OF"
    @
    
    <<functions>>=
    def extract_body(text): ...
    @
  10. Define constants for magic numbers - never hardcode values
  11. Co-locate dependencies with features - feature's imports in feature's section
  12. Prefer public functions - Default to making functions public with docstrings. Only use _-prefixed private functions for true internal helpers tightly coupled to a single caller. Public utilities (e.g., normalize_text, letters_only) are reusable across modules and discoverable via help(). Duplicated private helpers across modules (e.g., _to_ascii in both vigenere.nw and plaintexts.nw) are a sign the function should be public in a shared module.
  13. Keep lines under 80 characters - both prose and code

LaTeX Documentation Quality


Apply the latex-writing skill. The most common anti-patterns in .nw files:
  • Lists with bold labels: use \begin{description} with \item[Label], NOT \begin{itemize} with \item \textbf{Label}:
  • Code with manual escaping: use [[code]], NOT \texttt{...\_...}
  • Manual quotes: use \enquote{...}, NOT "..." or ``...''
  • Manual cross-references: use \cref{...}, NOT Section~\ref{...}

Progressive Disclosure Pattern


When introducing high-level structure, use abstract placeholder chunks that defer specifics:
noweb
def cli_show(user_regex,
             <<options for filtering>>):
  <<implementation>>
@

[... later, explain each option ...]

\paragraph{The --all option}
<<options for filtering>>=
all: Annotated[bool, all_opt] = False,
@
Benefits: readable high-level structure, pedagogical ordering, maintainability.
The same technique applies to function bodies: long functions can use <<phase name>> sub-chunks to present algorithmic steps in pedagogical order with prose between them (see Writing Guideline 8, "Decompose long functions").

Chunk Concatenation Patterns


Use multiple definitions when building up a parameter list pedagogically:
noweb
\subsection{Adding the diff flag}
<<args for diff>>=
diff=args.diff,
@

[... later ...]

\subsection{Fine-tuning thresholds}
<<args for diff>>=
threshold=args.threshold
@
Use separate chunks when contexts differ (different scopes):
noweb
<<args from command line>>=  # Has args object
diff=args.diff,
@

<<params for recursion>>=    # No args, only parameters
diff=diff,
@

Test Organization


CRITICAL: Tests MUST appear AFTER implementation, distributed throughout the file near the code they verify. NEVER create a \section{Tests} or \section{Unit Tests} that groups all tests at the end of the file.
See references/testing-patterns.md for detailed patterns.
Key rules:
  • Each implementation section is followed by its <<test functions>>= chunk
  • Use a single <<test functions>> chunk name — noweb concatenates them
  • Use from module import * in the test file header
  • Frame tests pedagogically: "Let's verify this works..."
BAD — all tests collected at the end:
noweb
\section{Encryption}
<<functions>>=
def encrypt(text, key): ...
@

\section{Decryption}
<<functions>>=
def decrypt(text, key): ...
@

\section{Tests}          % ← NEVER do this

<<test functions>>=
def test_encrypt(): ...
def test_decrypt(): ...
@
GOOD — each test immediately after its implementation:
noweb
\section{Encryption}
<<functions>>=
def encrypt(text, key): ...
@

Let's verify that encryption produces the expected ciphertext:

<<test functions>>=
def test_encrypt(): ...
@

\section{Decryption}
<<functions>>=
def decrypt(text, key): ...
@

We can verify that decryption inverts encryption:

<<test functions>>=
def test_decrypt(): ...
@

Multi-Directory Projects


For large projects (5+ .nw files), see references/multi-directory-projects.md.
Key structure:
project/
├── Makefile       # Root orchestrator (compile → test → docs)
├── pyproject.toml # Poetry packaging configuration
├── src/           # .nw files → .py + .tex
├── doc/           # Document wrapper (.nw), preamble.tex
├── tests/         # Extracted test files (unit/ subdir)
└── makefiles/     # Shared build rules (noweb.mk, subdir.mk)

Initializing a New Project


See references/project-initialization.md for full details. Quick checklist:
  1. Create pyproject.toml with [tool.poetry] packages/include/exclude
  2. Create src/.gitignore (*.py, *.tex) and tests/.gitignore (*.py)
  3. Create src/packagename/Makefile with an explicit __init__.py rule
  4. Create src/packagename/packagename.nw with <<[[__init__.py]]>> and <<test [[packagename.py]]>> chunks
  5. Create tests/Makefile with auto-discovery (uses %20 encoding, cpif, unit/ subdirectory)
  6. Create doc/packagename.nw wrapper, doc/Makefile, doc/preamble.tex
  7. Create a root Makefile orchestrating compile → test → docs

LaTeX-Safe Chunk Names


Use [[...]] notation for Python chunks with underscores:
noweb
<<[[module_name.py]]>>=
def my_function():
    pass
@
Extract with:
notangle -R"[[module_name.py]]" file.nw > module_name.py

Best Practices Summary


  1. Write documentation first - then add code
  2. Keep lines under 80 characters
  3. Check for unused chunks - run noroots to find typos
  4. Keep tangled code in .gitignore - .nw is source of truth
  5. NEVER commit generated files - .py and .tex from .nw are build artifacts
  6. Test your tangles - ensure extracted code runs
  7. Require PEP-257 docstrings on all public functions - Prose in .nw is for maintainers reading the literate source; docstrings are for users of the compiled .py who never see the .nw file. Both are needed. Private functions (prefixed _) may omit docstrings. Never use \cref or other LaTeX commands inside docstrings.
    BAD — function with prose but no docstring:
    noweb
    We convert text to lowercase ASCII for uniform comparison.
    
    <<functions>>=
    def normalize_text(text):
        return text.lower().encode("ascii", "ignore").decode()
    @
    GOOD — prose for maintainers AND docstring for users:
    noweb
    We convert text to lowercase ASCII for uniform comparison.
    
    <<functions>>=
    def normalize_text(text):
        """Return lowercase ASCII version of ``text``.
    
        Non-ASCII characters are silently dropped.
        """
        return text.lower().encode("ascii", "ignore").decode()
    @
  8. Include a table of contents - add \tableofcontents in the documentation

Git Workflow


See references/git-workflow.md for details.
Core rules:
  • Only commit .nw files to git
  • Add generated files to .gitignore immediately
  • Regenerate code with make after checkout/pull
  • Never commit generated .py or .tex files
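A minimal sketch of the matching src/.gitignore, using the patterns described above:

```gitignore
# Tangled artifacts: regenerate with make; the .nw files are the source of truth
*.py
*.tex
```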

Noweb Commands Quick Reference


See references/noweb-commands.md for details.

Tangling


notangle -R"[[module.py]]" file.nw > module.py
noroots file.nw   # List root chunks

Weaving


noweave -n -delay -x -t2 file.nw > file.tex   # For inclusion in a larger document
noweave -latex -x file.nw > file.tex          # Standalone document

When Literate Programming Is Valuable

文学编程的适用场景

  • Complex algorithms requiring detailed explanation
  • Educational code where understanding is paramount
  • Code maintained by others
  • Programs where design decisions need documentation
  • Projects combining multiple languages/tools