paper-notes

Paper Notes

Produce consistent, searchable paper notes that later steps (claims, visuals, writing) can reliably synthesize.

This is still NO PROSE: keep notes as bullets / short fields, not narrative paragraphs.

Role cards (prompt-level guidance)

Close Reader
- Mission: extract what is specific and checkable (setup, method, metrics, limits).
- Do: name concrete tasks/benchmarks and what the paper actually measures.
- Avoid: generic summary boilerplate that could fit any paper.
Results Recorder
- Mission: capture evaluation anchors that later writing needs.
- Do: record task + metric + constraints (budget/tool access) whenever available.
- Avoid: copying numbers without the evaluation setting that makes them meaningful.
Limitation Logger
- Mission: capture the caveats that change interpretation.
- Do: write paper-specific limitations (protocol mismatch, missing ablations, threat model gaps).
- Avoid: repeated generic limitations like “may not generalize” without specifics.

When to use

After you have a core set (and ideally a mapping) and need evidence-ready notes.
Before writing a survey draft.

Inputs

- `papers/core_set.csv`
- Optional: `outline/mapping.tsv` (to prioritize)
- Optional: `papers/fulltext_index.jsonl` and `papers/fulltext/*.txt` (if running in fulltext mode)

Outputs

- `papers/paper_notes.jsonl` (JSONL; one record per paper)
- `papers/evidence_bank.jsonl` (JSONL; addressable evidence snippets derived from notes; A150++ target: >=7 items/paper on average)
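
As a rough illustration of what one line of `papers/paper_notes.jsonl` might look like, the sketch below serializes a single note. The field names are the ones this document mentions (`paper_id`, `bibkey`, `priority`, `evidence_level`, `summary_bullets`, `method`, `key_results`, `limitations`). The value types, the placeholder values, and the idea that `priority` is stored as a plain string field are assumptions, not the helper's confirmed schema.

```python
import json

# Hypothetical note record. Field names come from this document; the value
# types and all placeholder values below are assumptions for illustration.
note = {
    "paper_id": "P0001",                 # placeholder id
    "bibkey": "placeholder2024key",      # placeholder citation key
    "priority": "high",
    "evidence_level": "abstract",        # or "fulltext"
    "summary_bullets": ["<bullet 1>", "<bullet 2>", "<bullet 3>"],
    "method": "<mechanism and what differs from baselines>",
    "key_results": ["<task + metric + constraints>"],
    "limitations": ["<paper-specific caveat>"],
}

# JSONL means one JSON object per line, so the serialized record must not
# contain raw newlines; json.dumps escapes them by default.
line = json.dumps(note, ensure_ascii=False)
```

Keeping every value a short string or a list of short strings is one way to honor the "no prose" rule at the data level.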

Decision: evidence depth

- If you have extracted text (`papers/fulltext/*.txt`) → enrich key papers using fulltext snippets and set `evidence_level: "fulltext"`.
- If you only have abstracts (default) → keep long-tail notes abstract-level, but still fully enrich high-priority papers (see below).

Workflow (heuristic)

Uses: `outline/mapping.tsv`, `papers/fulltext_index.jsonl`

Ensure coverage: every `paper_id` in `papers/core_set.csv` must have one JSONL record.
Use mapping to choose high-priority papers:
- heavily reused across subsections
- pinned classics (ReAct/Toolformer/Reflexion… if in scope)
For high-priority papers, capture:
- 3–6 summary bullets (what’s new, what problem setting, what’s the loop)
- `method` (mechanism and architecture; what differs from baselines)
- `key_results` (benchmarks/metrics; include numbers if available)
- `limitations` (specific assumptions/failure modes; avoid generic boilerplate)
For long-tail papers:
- keep summary bullets short (abstract-derived is OK)
- still include at least one limitation, but make it specific when possible
Assign a stable `bibkey` for each paper for citation generation.
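
The coverage rule in this workflow can be checked mechanically. The following is a minimal sketch, assuming `papers/core_set.csv` has a header row with a `paper_id` column and that each JSONL record carries a `paper_id` field. `missing_paper_ids` is a hypothetical helper name and is not part of the skill's script.

```python
import csv
import io
import json


def missing_paper_ids(core_set_csv, notes_jsonl):
    """Return paper_ids listed in the core set that have no note record.

    Assumes the CSV has a header row with a `paper_id` column and that each
    JSONL record carries a `paper_id` field (both are assumptions here).
    """
    core_ids = [row["paper_id"] for row in csv.DictReader(io.StringIO(core_set_csv))]
    noted_ids = {
        json.loads(line)["paper_id"]
        for line in notes_jsonl.splitlines()
        if line.strip()  # tolerate blank lines
    }
    return [pid for pid in core_ids if pid not in noted_ids]


# Tiny in-memory example with placeholder ids.
core = "paper_id,title\nP0001,A\nP0002,B\n"
notes = '{"paper_id": "P0001"}\n'
print(missing_paper_ids(core, notes))  # prints ['P0002']
```

In practice the two strings would be read from the files named above; they are passed in as text here so the sketch runs without a workspace.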

Quality checklist

- Coverage: every `paper_id` in `papers/core_set.csv` appears in `papers/paper_notes.jsonl`.
- High-priority papers have non-`TODO` method/results/limitations.
- Limitations are not copy-pasted across many papers.
- `evidence_level` is set correctly (`abstract` vs `fulltext`).
- Evidence bank: `papers/evidence_bank.jsonl` exists and is dense enough for A150++ (>=7 items/paper on average).
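
The evidence-bank density target is a simple average, sketched below. This assumes every evidence item is a dict with a `paper_id` field, which is a guess about the evidence bank layout. `evidence_density` is a hypothetical helper name, and the ids are placeholders.

```python
from collections import Counter


def evidence_density(evidence_items, paper_ids):
    """Average number of evidence items per paper, plus papers with none.

    Assumes every evidence item is a dict with a `paper_id` field; that field
    name is an assumption about the evidence bank, not a confirmed schema.
    """
    if not paper_ids:
        raise ValueError("paper_ids must not be empty")
    counts = Counter(item["paper_id"] for item in evidence_items)
    average = sum(counts[pid] for pid in paper_ids) / len(paper_ids)
    uncovered = [pid for pid in paper_ids if counts[pid] == 0]
    return average, uncovered


# Placeholder data: 9 items for P0001, 5 for P0002, none for P0003.
items = [{"paper_id": "P0001"}] * 9 + [{"paper_id": "P0002"}] * 5
average, uncovered = evidence_density(items, ["P0001", "P0002", "P0003"])
meets_target = average >= 7  # target stated in this document
```

Because the target is an average, a few richly annotated papers can mask papers with no evidence at all, which is why the sketch also returns the uncovered ids.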

Helper script (optional)

Quick Start

python .codex/skills/paper-notes/scripts/run.py --help

python .codex/skills/paper-notes/scripts/run.py --workspace <workspace_dir>

All Options

See `--help` (this helper is intentionally minimal).

Examples

Generate notes, then optionally enrich `priority=high` papers:
- Run the helper once, then refine `papers/paper_notes.jsonl` (e.g., add full-text details for key papers and diversify limitations).

Notes

The helper writes deterministic metadata/abstract-level notes and marks key papers with `priority=high`.
Under `pipeline.py --strict`, the run is blocked if high-priority notes are incomplete (missing method/key_results/limitations) or contain placeholders.

Troubleshooting

Common Issues

Issue: High-priority notes still look like scaffolds

Symptom:

Quality gate reports missing `method/key_results` or `TODO` placeholders.

Causes:

Notes were generated from abstracts only; key papers weren’t enriched.

Solutions:

- Fully enrich `priority=high` papers: `method`, ≥1 `key_results`, ≥3 `summary_bullets`, ≥1 concrete `limitations`.
- If you need full text evidence, run `pdf-text-extractor` in `fulltext` mode for key papers.
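
Before re-running the quality gate, the enrichment thresholds above can be checked per note. The sketch below is an approximation of such a check, not the gate's actual logic. It assumes `method` is a string and the other three fields are lists, and `note_problems` is a hypothetical helper name.

```python
import json


def note_problems(note):
    """List reasons a high-priority note would still count as a scaffold.

    Thresholds follow the text above; the field types (string for `method`,
    lists for the others) are assumptions about the record layout.
    """
    problems = []
    if not str(note.get("method", "")).strip():
        problems.append("method is empty")
    minimums = {"key_results": 1, "summary_bullets": 3, "limitations": 1}
    for field, minimum in minimums.items():
        if len(note.get(field) or []) < minimum:
            problems.append(f"{field} has fewer than {minimum} item(s)")
    # Serialize the whole record so a TODO in any field is caught.
    if "TODO" in json.dumps(note, ensure_ascii=False):
        problems.append("contains a TODO placeholder")
    return problems


# Placeholder scaffold note, as abstract-only generation might leave it.
scaffold = {
    "method": "TODO",
    "summary_bullets": ["<b1>"],
    "key_results": [],
    "limitations": ["<l1>"],
}
print(note_problems(scaffold))
```

An empty list means the note meets the stated minimums. It says nothing about whether the content is specific enough, which still needs a human or model read.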

Issue: Repeated limitations across many papers

Symptom:

Quality gate reports repeated limitation boilerplate.

Causes:

Copy-pasted limitations instead of paper-specific failure modes/assumptions.

Solutions:

Replace boilerplate with paper-specific limitations (setup, data, evaluation gaps, failure cases).
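
To find which limitations need replacing, one option is to group identical limitation strings across notes. This is a minimal sketch with a hypothetical helper, `repeated_limitations`. The `min_papers=3` threshold is arbitrary, and the quality gate's actual detection rule is not documented here.

```python
from collections import defaultdict


def repeated_limitations(notes, min_papers=3):
    """Map each limitation text shared by at least `min_papers` papers to their ids.

    Matching is on whitespace- and case-normalized text only, so paraphrased
    boilerplate is not detected. `min_papers=3` is an arbitrary illustrative
    threshold, not the quality gate's actual rule.
    """
    seen = defaultdict(set)
    for note in notes:
        for text in note.get("limitations", []):
            key = " ".join(text.lower().split())
            seen[key].add(note["paper_id"])
    return {key: sorted(ids) for key, ids in seen.items() if len(ids) >= min_papers}


# Placeholder notes sharing one generic limitation.
notes = [
    {"paper_id": "P0001", "limitations": ["May not generalize."]},
    {"paper_id": "P0002", "limitations": ["may not  generalize."]},
    {"paper_id": "P0003", "limitations": ["May not generalize.", "<specific caveat>"]},
]
print(repeated_limitations(notes))
# prints {'may not generalize.': ['P0001', 'P0002', 'P0003']}
```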

Recovery Checklist

- `papers/paper_notes.jsonl` covers all `papers/core_set.csv` paper_ids.
- ≥80% of `priority=high` notes satisfy method/results/limitations completeness.
- No `TODO` remains in high-priority notes.
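
The 80% threshold in this checklist is a ratio over high-priority notes only. The sketch below computes it under the assumption that `priority` is a string field on each record and that "complete" means `method`, `key_results`, and `limitations` are all non-empty. `high_priority_completeness` is a hypothetical name, and `"normal"` is a made-up value for non-priority notes.

```python
def high_priority_completeness(notes):
    """Fraction of `priority=high` notes with method, key_results and limitations filled.

    Treats `priority` as a string field on each record, which is an assumption.
    Returns None when there are no high-priority notes.
    """
    required = ("method", "key_results", "limitations")
    high = [n for n in notes if n.get("priority") == "high"]
    if not high:
        return None
    complete = [n for n in high if all(n.get(field) for field in required)]
    return len(complete) / len(high)


# Placeholder notes: one complete and one incomplete high-priority note.
notes = [
    {"priority": "high", "method": "<m>", "key_results": ["<r>"], "limitations": ["<l>"]},
    {"priority": "high", "method": "<m>", "key_results": [], "limitations": ["<l>"]},
    {"priority": "normal", "method": "", "key_results": [], "limitations": []},
]
ratio = high_priority_completeness(notes)
passes = ratio is not None and ratio >= 0.8  # threshold from the checklist
```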
