
Tech Article Reproducibility


Measure the quality of a technical article from the angle of "can a reader reproduce the same thing on their machine?" This is an independent axis from prose-style evaluation (mizchi-blog-style) or logical evaluation. The premise: the most important thing about a technical article is whether a reader can reproduce it on their own machine.

When to use


  • Final pre-publication check on a technical article draft
  • Hands-on articles / tutorial articles
  • Tool introduction articles / setup articles
  • Verifying an article that claims "it worked"
When not to use:
  • Conceptual explainer articles (nothing to reproduce)
  • Poems / opinion pieces
  • Self-contained small tidbits

Reproducibility check axes (10 axes)


Score each axis on a 0–2 scale, 20 points total → converted to a 10-point scale.
Scale per axis: 0 = NG, 1 = partial, 2 = OK.
  1. Environment prerequisites stated: 0 = no OS / version / required tools listed; 1 = partially listed; 2 = everything listed (OS, lang version, CLI tools)
  2. Code completeness: 0 = fragments only, imports/setup omitted; 1 = only the main part; 2 = full, copy-pasteable form that runs
  3. Command accuracy: 0 = placeholders left as-is (<your-token> etc. without explanation); 1 = some placeholders; 2 = runnable as-is
  4. Version dependency stated: 0 = no mention; 1 = partial; 2 = explicit, e.g. "works on v3.x", "v2 or earlier behaves as X"
  5. Full config files included: 0 = excerpts only; 1 = main keys only; 2 = full minimal working config
  6. Expected output shown: 0 = none; 1 = explained in prose; 2 = actual output / screenshot
  7. Handling of errors: 0 = not mentioned; 1 = one case touched on; 2 = several major errors plus how to handle them
  8. Project prerequisites stated: 0 = author-environment assumptions are implicit; 1 = partially stated; 2 = paths / repo structure / existing config all stated
  9. Link health: 0 = links broken or require auth; 1 = some require auth; 2 = all accessible publicly
  10. Author-specific knowledge stated: 0 = helpers / dotfiles assumed implicitly; 1 = partially stated; 2 = fully stated or not required
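The scoring arithmetic above can be sketched in a few lines. This is illustrative, not part of the skill itself; the axis names are shorthand for the rubric rows:

```python
# Ten axes, each scored 0-2, summed to a 20-point total, then halved
# to get the 10-point scale.
AXES = [
    "environment prerequisites", "code completeness", "command accuracy",
    "version dependency", "full config files", "expected output",
    "error handling", "project prerequisites", "link health",
    "author-specific knowledge",
]

def to_ten_point(scores):
    """Sum per-axis scores (each 0, 1, or 2) and halve for the 10-point scale."""
    assert set(scores) == set(AXES), "score every axis exactly once"
    assert all(s in (0, 1, 2) for s in scores.values())
    return sum(scores.values()) / 2
```

For example, a perfect article (all 2s) converts to 10.0; dropping link health to 1 gives 9.5.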

Evaluation workflow


For evaluating technical articles, use the same subagent dispatch as empirical-prompt-tuning. The difference is that the subagent plays the role of "a first-time reader trying to reproduce the work" rather than "an executor."
  1. Pin down the target article
  2. subagent dispatch (template below)
  3. Extract "reproduction sticking points" from the returned evaluation
  4. Add / fix text in the article to address those sticking points
  5. If needed, re-evaluate with a fresh subagent

subagent dispatch template


You are a reader interested in <the article's subject area> but new to <the tech stack>.
You are going to read this article and try to reproduce the same thing in your local environment.

Target article

<path to the article file>

Evaluation axes (10 reproducibility axes)

Score each axis 0–2. Refer to the rubric in the tech-article-reproducibility skill: /Users/mz/.claude/skills/tech-article-reproducibility/SKILL.md
  1. Environment prerequisites stated
  2. Code completeness
  3. Command accuracy
  4. Version dependency stated
  5. Full config files included
  6. Expected output shown
  7. Handling of errors
  8. Project prerequisites stated
  9. Link health (actually verify with WebFetch)
  10. Author-specific knowledge stated

Tasks

  1. While reading the article, imagine "where would I get stuck if I reproduced this on my own machine?"
  2. Score each axis 0–2 with quoted evidence
  3. List the top 5 sticking points with line numbers

Report structure

  • Reproducibility score: X/20 (breakdown table)
  • Top 5 sticking points: <line number> <quote> <why it sticks>
  • Missing information: list of things that should be added to the article
  • Overall verdict: your subjective probability (as a percentage) of reproducing this after reading the article

How to read the score


  • 18-20: Publishable as a hands-on piece; almost no additional information needed
  • 14-17: Some googling required, but reproducible; okay to publish
  • 10-13: Information outside the article is required to reproduce; revisions recommended
  • 9 or below: Hard to reproduce; rethink the article's premise or position it as something other than a hands-on piece
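When scripting the workflow, the bands above can be expressed as a small lookup function. The return strings here are paraphrases of the bands, not fixed values from the skill:

```python
def verdict(total):
    """Map a 20-point reproducibility total onto the interpretation bands."""
    if not 0 <= total <= 20:
        raise ValueError("total must be in 0..20")
    if total >= 18:
        return "publishable as a hands-on piece; almost no additions needed"
    if total >= 14:
        return "some googling required, but reproducible; okay to publish"
    if total >= 10:
        return "outside information required to reproduce; revisions recommended"
    return "hard to reproduce; reposition the article"
```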

Pitfalls


  • The evaluator's background knowledge is too high: if you don't explicitly tell the subagent to play a "beginner role," it will judge "enough information" from an expert's viewpoint. Emphasize "first-time reader" in the prompt
  • Ignoring link health: links that are alive at publication time can break a year later. Separately check whether reproduction is possible using only live links
  • Inlining all sample code: reproducibility goes up, but the article bloats. A hybrid approach that combines inline code with a link to the repository is realistic
  • Reproducibility ≠ prose quality: an article can be highly reproducible yet hard to read. Combine with mizchi-blog-style and similar skills to measure both axes
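The link-health check, at least, can be partly automated. A minimal sketch, assuming the links have already been fetched and only their HTTP status codes are in hand; the category names mirror axis 9 of the rubric:

```python
def classify_status(status):
    """Map an HTTP status code onto the link-health categories.
    A rough sketch: redirects, timeouts, etc. would need extra handling."""
    if 200 <= status < 300:
        return "publicly accessible"
    if status in (401, 403):  # authentication / authorization required
        return "auth required"
    return "broken"  # 404s, 5xx, and everything else

def link_health_score(statuses):
    """0-2 score for axis 9 from the statuses of all links in the article."""
    cats = {classify_status(s) for s in statuses}
    if cats == {"publicly accessible"}:
        return 2
    if "broken" not in cats:
        return 1  # some links need auth, none are dead
    return 0
```

In the dispatch template this corresponds to the instruction to "actually verify with WebFetch" rather than trusting that links worked at publication time.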

Related


  • empirical-prompt-tuning — meta-skill for subagent dispatch + iterative improvement
  • mizchi-blog-style — evaluation on the prose-style axis (independent from this skill)