literate-programming


Literate Programming Skill


CRITICAL: This skill MUST be activated BEFORE making any changes to .nw files!
You are an expert in literate programming using the noweb system.

Reference Files


This skill includes detailed references in references/:
  • noweb-commands.md: Tangling, weaving, flags, troubleshooting (search: notangle, noweave, -R, -L)
  • testing-patterns.md: Test organization, placement, dependency testing (search: test functions, pytest, after implementation)
  • git-workflow.md: Version control, .gitignore, pre-commit (search: git, commit, generated files)
  • multi-directory-projects.md: Large project organization, makefiles (search: src/, doc/, tests/, MODULES)
  • project-initialization.md: New project setup, templates, checklist (search: new project, initialize, pyproject.toml)
  • preamble.tex: Standard LaTeX preamble for documentation (search: \usepackage, memoir)

When to Use This Skill


Correct Workflow


  1. User asks to modify a .nw file
  2. YOU ACTIVATE THIS SKILL IMMEDIATELY
  3. You plan the changes with literate programming principles
  4. You make the changes following the principles
  5. You regenerate code with make/notangle

Anti-pattern (NEVER do this)


  1. User asks to modify a .nw file
  2. You directly edit the .nw file ← WRONG
  3. Later review finds literate quality problems
  4. You have to redo everything

Remember


  • .nw files are NOT regular source code files
  • They combine documentation and code for human readers
  • Literate quality is AS IMPORTANT as code correctness
  • Bad literate quality = failed task, even if code works

Planning Changes


When making changes to a .nw file:
  1. Read the existing file to understand structure and narrative
  2. Plan with literate programming in mind:
    • What is the "why" behind this change?
    • How does this fit into the existing narrative?
    • What new chunks are needed? What are their meaningful names?
    • Where in the pedagogical order should this be explained?
  3. Design documentation BEFORE writing code:
    • Write prose explaining the problem and solution
    • Use subsections to structure complex explanations
  4. Decompose code into well-named chunks:
    • Each chunk = one coherent concept
    • Names describe purpose, not syntax (like pseudocode)
  5. Write the code chunks
  6. Regenerate and test
Key principle: If you find yourself writing code comments to explain logic, that explanation belongs in the documentation chunks instead.
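To illustrate the key principle (a sketch; the function, chunk, and comment are hypothetical), compare a comment-laden chunk with its literate rewrite:

```noweb
BAD — the explanation is trapped in a code comment:

<<functions>>=
def cache_key(url):
    # strip the query string: mirrors differ only in tracking params
    return url.split("?")[0]
@

GOOD — the explanation moves into the documentation chunk:

Mirrors of the same resource differ only in their tracking
parameters, so we strip the query string before using the URL
as a cache key.

<<functions>>=
def cache_key(url):
    return url.split("?")[0]
@
```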

Reviewing Literate Programs


When reviewing, evaluate:
  1. Narrative flow: Coherent story? Pedagogical order?
  2. Variation theory: Contrasts used? "Whole, parts, whole" structure?
  3. Chunk quality: Meaningful names? Focused on single concepts?
  4. Explanation quality: Explains "why" not just "what"? Red flags: prose that begins "We [verb] the [noun]" matching a function name; prose that describes parameter types visible in the signature; prose that restates conditionals without explaining why they matter.
  5. Test organization: Tests after implementation, not before?
  6. Proper noweb syntax: [[code]] notation? Valid chunk references?

Core Philosophy


Literate programming (Knuth) has two goals:
  1. Explain to human beings what we want a computer to do
  2. Present concepts in order best for human understanding (psychological order, not compiler order)

Variation Theory


Apply the variation-theory skill when structuring explanations:
  • Contrast: Show what something IS vs what it is NOT
  • Separation: Start with whole (module outline), then parts (chunks)
  • Generalization: Show pattern across different contexts
  • Fusion: Integrate parts back into coherent whole
CRITICAL: Show concrete examples FIRST, then state general principles. Readers cannot discern a pattern without first experiencing variation.
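As a sketch of separation ("whole, then parts") with hypothetical chunk and file names: present the module outline first, then let later sections develop each part, and let noweb reassemble them:

```noweb
The whole module, at a glance:

<<[[stats.py]]>>=
<<imports>>
<<constants>>
<<functions>>
@

Later sections each develop one part, for example:

<<functions>>=
def mean(xs):
    return sum(xs) / len(xs)
@
```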

Noweb File Format


Documentation Chunks


  • Begin with @ followed by a space or newline
  • Contain explanatory text (LaTeX, Markdown, etc.)
  • Copied verbatim by noweave

Code Chunks


  • Begin with <<chunk name>>= on a line by itself (column 1)
  • End when another chunk begins or at end of file
  • Reference other chunks using <<chunk name>>
  • Multiple chunks with the same name are concatenated
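Putting the two chunk types together, a minimal .nw file (contents hypothetical) looks like this:

```noweb
@ This is a documentation chunk: noweave copies it verbatim.
The program below defines [[greet]] and calls it once.

<<[[hello.py]]>>=
<<define greet>>
greet("world")
@

@ The definition lives in its own named chunk, referenced above.

<<define greet>>=
def greet(name):
    print(f"Hello, {name}!")
@
```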

Syntax Rules


  • Quote code in documentation using [[code]] (escapes LaTeX special chars)
  • Escape: @<< for a literal <<; @@ in column 1 for a literal @
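For example (a hedged sketch), a documentation chunk that discusses the chunk syntax itself must use these escapes:

```noweb
@ To get a literal << in the woven output, write @<< in the
source.  Inline code such as [[module_name]] is quoted with
double brackets, which escapes LaTeX specials like underscores.
```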

Writing Guidelines


  1. Start with the human story - problem, approach, design decisions
  2. Introduce concepts in pedagogical order - not compiler order
  3. Use meaningful chunk names - 2-5 word summary of purpose (like pseudocode)
  4. Reference variables in chunk names - when a chunk operates on a specific variable, use [[variable]] notation in the chunk name to make the connection explicit (e.g., <<add graders to [[graders]] list>>)
  5. Decompose by concept, not syntax
  6. Explain the "why" - don't just describe what the code does. Prose that merely restates the code in English teaches nothing. Good prose explains why a design choice was made: what alternative was rejected, what would break without this approach, or what constraint drives the implementation.
    Self-test: If your prose could be mechanically generated from the function signature, it's "what" not "why." Ask yourself: What design decision does this paragraph justify? What alternative did we reject and why? If the paragraph doesn't answer either question, rewrite it.
    BAD — prose restates code in English:
    noweb
    \subsection{Counting $n$-grams}
    
    We count overlapping $n$-grams.
    If $n$ is larger than the input, the result is empty.
    
    <<functions>>=
    def ngram_counts(text, *, n):
        ...
    @
    GOOD — prose explains why this design choice:
    noweb
    \subsection{Counting $n$-grams}
    
    We use overlapping $n$-grams because they capture all positional
    contexts---in \enquote{THE}, overlapping bigrams yield TH and HE,
    whereas non-overlapping would only yield TH.  This matches the
    standard definition used in cryptanalysis.
    
    <<functions>>=
    def ngram_counts(text, *, n):
        ...
    @
    Red flags that prose is "what" not "why":
    • Begins "We [verb] the [noun]" where the verb matches a function name
    • Describes parameter types or return values already in the signature
    • Restates conditional logic ("If X, we do Y") without explaining why X matters
  7. Keep chunks focused — one function per <<functions>>= chunk with prose before it. Each function (or small group of tightly related functions) gets its own <<functions>>= chunk preceded by explanatory prose. Never put multiple unrelated functions in a single chunk.
    BAD — four functions crammed into one chunk with minimal prose:
    noweb
    \subsection{Helper Functions}
    
    We provide several utility functions.
    
    <<functions>>=
    def normalize_text(text): ...
    
    def letters_only(text): ...
    
    def key_shifts(key): ...
    
    def index_of_coincidence(text): ...
    @
    GOOD — each function with its own subsection and prose:
    noweb
    \subsection{Text Normalization}
    
    Before analysis, we strip non-alphabetic characters and
    convert to lowercase so that frequency counts are meaningful.
    
    <<functions>>=
    def normalize_text(text): ...
    @
    
    \subsection{Index of Coincidence}
    
    The index of coincidence measures how likely two randomly
    chosen letters from a text are identical ...
    
    <<functions>>=
    def index_of_coincidence(text): ...
    @
  8. Decompose long functions into named sub-chunks — If a function has more than ~25 lines and contains two or more distinct algorithmic phases, decompose it into named sub-chunks. Each sub-chunk name should read like a step in an algorithm description. The prose before each sub-chunk explains why that phase works the way it does. This is the classic Knuth technique.
    BAD — 80-line function with one line of prose:
    noweb
    We generate plaintext by concatenating sentences.
    
    <<functions>>=
    def generate_plaintext(size, *, sources, seed=None):
        """..."""
        if size <= 0:
            raise ValueError(...)
        paragraphs = extract_paragraphs(sources, ...)
        ...  # 75 more lines
        return normalize(prefix, options)
    @
    GOOD — function body decomposed into named sub-chunks with prose:
    noweb
    <<functions>>=
    def generate_plaintext(size, *, sources, seed=None):
        """..."""
        <<prepare filtered paragraphs>>
        <<pick random starting point>>
        <<collect sentences until target length>>
        <<select closest sentence boundary>>
    @
    
    We extract paragraphs from the corpus, removing headings and ToC
    entries.  Paragraphs lacking sentence-ending punctuation are
    discarded---they are typically list items or table rows.
    
    <<prepare filtered paragraphs>>=
    if size <= 0:
        raise ValueError("size must be positive")
    ...
    @
    
    To avoid always starting at the beginning of the corpus, we
    rotate to a random paragraph.
    
    <<pick random starting point>>=
    rng = random.Random(seed)
    ...
    @
  9. Use bucket chunks — distribute <<constants>>= near their relevant code. Define each constant in the section where it is conceptually relevant. Never group all constants into a single \subsection{Constants}.
    BAD — all constants dumped in one subsection:
    noweb
    \subsection{Constants}
    
    <<constants>>=
    DATA_DIR = ...        % used in loading section
    GUTENBERG_START = ... % used in extraction section
    SENTENCE_RE = ...     % used in sentence splitting section
    KEEP_PUNCT = ...      % used in normalization section
    @
    GOOD — each constant near the code that uses it:
    noweb
    \subsection{Loading Texts}
    
    <<constants>>=
    DATA_DIR = Path(__file__).parent / "data"
    @
    
    <<functions>>=
    def load_text(path): ...
    @
    
    \subsection{Extracting Body Text}
    
    <<constants>>=
    GUTENBERG_START = "*** START OF"
    GUTENBERG_END = "*** END OF"
    @
    
    <<functions>>=
    def extract_body(text): ...
    @
  10. Define constants for magic numbers - never hardcode values
  11. Co-locate dependencies with features - feature's imports in feature's section
  12. Prefer public functions - Default to making functions public with docstrings. Only use _-prefixed private functions for true internal helpers tightly coupled to a single caller. Public utilities (e.g., normalize_text, letters_only) are reusable across modules and discoverable via help(). Duplicated private helpers across modules (e.g., _to_ascii in both vigenere.nw and plaintexts.nw) are a sign the function should be public in a shared module.
  13. Keep lines under 80 characters - both prose and code

LaTeX Documentation Quality


Apply the latex-writing skill. The most common anti-patterns in .nw files:
  • Lists with bold labels: use \begin{description} with \item[Label], NOT \begin{itemize} with \item \textbf{Label}:
  • Code with manual escaping: use [[code]], NOT \texttt{...\_...}
  • Manual quotes: use \enquote{...}, NOT "..." or ``...''
  • Manual cross-references: use \cref{...}, NOT Section~\ref{...}

Progressive Disclosure Pattern


When introducing high-level structure, use abstract placeholder chunks that defer specifics:
noweb
def cli_show(user_regex,
             <<options for filtering>>):
  <<implementation>>
@

[... later, explain each option ...]

\paragraph{The --all option}
<<options for filtering>>=
all: Annotated[bool, all_opt] = False,
@
Benefits: readable high-level structure, pedagogical ordering, maintainability.
The same technique applies to function bodies: long functions can use <<phase name>> sub-chunks to present algorithmic steps in pedagogical order with prose between them (see Writing Guideline 8, "Decompose long functions").

Chunk Concatenation Patterns


Use multiple definitions when building up a parameter list pedagogically:
noweb
\subsection{Adding the diff flag}
<<args for diff>>=
diff=args.diff,
@

[... later ...]

\subsection{Fine-tuning thresholds}
<<args for diff>>=
threshold=args.threshold
@
Use separate chunks when contexts differ (different scopes):
noweb
<<args from command line>>=  # Has args object
diff=args.diff,
@

<<params for recursion>>=    # No args, only parameters
diff=diff,
@

Test Organization


CRITICAL: Tests MUST appear AFTER implementation, distributed throughout the file near the code they verify. NEVER create a \section{Tests} or \section{Unit Tests} that groups all tests at the end of the file.
See references/testing-patterns.md for detailed patterns.
Key rules:
  • Each implementation section is followed by its <<test functions>>= chunk
  • Use a single <<test functions>> chunk name — noweb concatenates them
  • Use from module import * in the test file header
  • Frame tests pedagogically: "Let's verify this works..."
BAD — all tests collected at the end:
noweb
\section{Encryption}
<<functions>>=
def encrypt(text, key): ...
@

\section{Decryption}
<<functions>>=
def decrypt(text, key): ...
@

\section{Tests}          % ← NEVER do this

<<test functions>>=
def test_encrypt(): ...
def test_decrypt(): ...
@
GOOD — each test immediately after its implementation:
noweb
\section{Encryption}
<<functions>>=
def encrypt(text, key): ...
@

Let's verify that encryption produces the expected ciphertext:

<<test functions>>=
def test_encrypt(): ...
@

\section{Decryption}
<<functions>>=
def decrypt(text, key): ...
@

We can verify that decryption inverts encryption:

<<test functions>>=
def test_decrypt(): ...
@

Multi-Directory Projects


For large projects (5+ .nw files), see references/multi-directory-projects.md.
Key structure:
project/
├── Makefile       # Root orchestrator (compile → test → docs)
├── pyproject.toml # Poetry packaging configuration
├── src/           # .nw files → .py + .tex
├── doc/           # Document wrapper (.nw), preamble.tex
├── tests/         # Extracted test files (unit/ subdir)
└── makefiles/     # Shared build rules (noweb.mk, subdir.mk)

Initializing a New Project


See references/project-initialization.md for full details. Quick checklist:
  1. Create pyproject.toml with [tool.poetry] packages/include/exclude
  2. Create src/.gitignore (*.py, *.tex) and tests/.gitignore (*.py)
  3. Create src/packagename/Makefile with an explicit __init__.py rule
  4. Create src/packagename/packagename.nw with <<[[__init__.py]]>> and <<test [[packagename.py]]>> chunks
  5. Create tests/Makefile with auto-discovery (uses %20 encoding, cpif, unit/ subdirectory)
  6. Create doc/packagename.nw wrapper, doc/Makefile, doc/preamble.tex
  7. Create a root Makefile orchestrating compile → test → docs

LaTeX-Safe Chunk Names


Use [[...]] notation for Python chunks with underscores:
noweb
<<[[module_name.py]]>>=
def my_function():
    pass
@
Extract with:
notangle -R"[[module_name.py]]" file.nw > module_name.py

Best Practices Summary


  1. Write documentation first - then add code
  2. Keep lines under 80 characters
  3. Check for unused chunks - run noroots to find typos
  4. Keep tangled code in .gitignore - .nw is source of truth
  5. NEVER commit generated files - .py and .tex from .nw are build artifacts
  6. Test your tangles - ensure extracted code runs
  7. Require PEP-257 docstrings on all public functions - Prose in .nw is for maintainers reading the literate source; docstrings are for users of the compiled .py who never see the .nw file. Both are needed. Private functions (prefixed _) may omit docstrings. Never use \cref or other LaTeX commands inside docstrings.
    BAD — function with prose but no docstring:
    noweb
    We convert text to lowercase ASCII for uniform comparison.
    
    <<functions>>=
    def normalize_text(text):
        return text.lower().encode("ascii", "ignore").decode()
    @
    GOOD — prose for maintainers AND docstring for users:
    noweb
    We convert text to lowercase ASCII for uniform comparison.
    
    <<functions>>=
    def normalize_text(text):
        """Return lowercase ASCII version of ``text``.
    
        Non-ASCII characters are silently dropped.
        """
        return text.lower().encode("ascii", "ignore").decode()
    @
  8. Include a table of contents - add \tableofcontents in the documentation

Git Workflow


See references/git-workflow.md for details.
Core rules:
  • Only commit .nw files to git
  • Add generated files to .gitignore immediately
  • Regenerate code with make after checkout/pull
  • Never commit generated .py or .tex files
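A minimal sketch of the matching src/.gitignore, using the patterns described above:

```gitignore
# Tangled artifacts: regenerate with make; the .nw files are the source of truth
*.py
*.tex
```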

Noweb Commands Quick Reference


See references/noweb-commands.md for details.

Tangling


notangle -R"[[module.py]]" file.nw > module.py
noroots file.nw   # List root chunks

Weaving


noweave -n -delay -x -t2 file.nw > file.tex   # For inclusion in a larger document
noweave -latex -x file.nw > file.tex          # Standalone document

When Literate Programming Is Valuable

文学编程的适用场景

  • Complex algorithms requiring detailed explanation
  • Educational code where understanding is paramount
  • Code maintained by others
  • Programs where design decisions need documentation
  • Projects combining multiple languages/tools