manuscript-review


Manuscript Review Skill

Purpose

Execute a comprehensive, multi-pass diagnostic audit of an academic or technical manuscript, producing a structured improvement report that identifies issues across 24 audit dimensions — from macro-coherence and argumentative architecture through claims-evidence calibration, narrative flow, prose microstructure, rendered visual inspection, and cross-element coherence, down to citation hygiene and reproducibility.
The output is a prioritized, actionable improvement plan — not a line edit. The goal is to surface structural, logical, and clarity issues that authors systematically miss because they're too close to the text.
Optimized for arXiv/preprint submissions with flexible compliance standards.
Companion skill: manuscript-provenance audits whether manuscript content (numbers, tables, figures, ordering, terminology) is computationally derived from code and scripts. This skill audits the document as prose; that skill audits computational grounding. Run both for complete pre-publication coverage.

Boundary Agreement with manuscript-provenance

| Concern | This skill (manuscript-review) | manuscript-provenance |
|---|---|---|
| Reproducibility | Does the paper describe enough to reproduce? (§6) | Does the code actually produce what the paper claims? (§1, §7) |
| Figures/Tables | Legible, accessible, well-formatted? (§12) | Generated by scripts, not manual entry? (§2, §3) |
| Rendered visuals | Readable at print scale? Floats near references? (§23) | Figure generation script produces correct format? (§3) |
| Hyperparameters | Listed in the paper with rationale? (§6) | Values trace to config files, not hardcoded? (§1, §8) |
| Code availability | Statement exists in the paper? (§17) | Repo URL valid, README accurate, pipeline works? (§11) |
| Terminology | Abbreviations consistent within document? (§14) | Terms match code identifiers? (§5) |
| Significant figures | Consistent precision within document? (§12) | Precision matches script output? (§2) |
| Figure format | Appropriate format for document quality? (§12) | Format generated by script, not manually exported? (§3) |
| Computational cost | Reported in the paper? (§7) | Values trace to benchmarking scripts? (§1) |
| Macro-prose coherence | Prose framing appropriate for injected value? (§24) | Value traced to code, macro manifest produced? (§4) |
| Cross-element consistency | Prose, captions, figures, tables mutually consistent? (§24) | All elements from same run/pipeline output? (§9) |
Rule: This skill never opens the codebase. manuscript-provenance never judges prose quality. Each reads the other's report when available.
Integration point — Macro Manifest: manuscript-provenance produces a macro manifest as part of its §4 audit: a structured list of every macro-injected value, its resolved numeric value, its source (script + output file), and its location(s) in the manuscript text. This skill's Pass 13 (Cross-Element Coherence) consumes that manifest to check whether the prose surrounding each injected value is appropriate for the actual value. If no provenance report exists, this skill extracts macro values directly from .tex source (less precise — no source tracing, but the coherence check still runs).
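The fallback extraction can be sketched directly over \newcommand definitions. This is a minimal illustration only: the macro names are invented, and real manuscripts may use \def, nested braces, or multi-file inputs, which this deliberately ignores.

```python
import re

# Matches \newcommand{\name}{value} with a brace-free value body.
NEWCOMMAND = re.compile(r"\\newcommand\{\\(\w+)\}\{([^{}]*)\}")

def extract_macros(tex_source: str) -> dict:
    """Map macro name -> literal replacement text from \\newcommand definitions."""
    return dict(NEWCOMMAND.findall(tex_source))

src = "\\newcommand{\\bestacc}{94.1}\n\\newcommand{\\speedup}{3.2x}"
macros = extract_macros(src)  # {'bestacc': '94.1', 'speedup': '3.2x'}
```

A full implementation would also resolve \input files and handle nested braces; the regex above is only the "less precise" path the paragraph describes.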

Workflow

1. Ingest

Read the uploaded manuscript. Accept PDF, DOCX, LaTeX source, or Markdown. If multiple files are uploaded (e.g., main text + supplementary), process all of them.
Identify:
  • Target venue (defaults to arXiv/preprint; adjust if conference/journal submission)
  • Submission type (full paper, technical report, thesis chapter, etc.)
  • Any specific concerns the user raised — these get priority in the report
For arXiv submissions, compliance checks are advisory. Focus on technical quality, reproducibility, and clarity rather than strict formatting rules.

2. Load the Checklist

Read references/checklist.md — the comprehensive 24-section, ~175-checkpoint refactoring checklist. Every audit pass is structured against this checklist.

```text
Read references/checklist.md
```

3. Multi-Pass Audit

Execute the following passes sequentially. Each pass maps to one or more checklist sections. Work systematically — for each checkpoint:
  • PASS: Note briefly, move on
  • FAIL: Document with exact location (section, paragraph, line), specific defect, concrete fix required
  • N/A: Mark if not applicable to this manuscript type
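The PASS/FAIL/N/A bookkeeping above can be kept in a small structured record. The schema below is an illustrative sketch, not part of the skill specification; field names are assumptions.

```python
from dataclasses import dataclass

@dataclass
class Finding:
    """One checkpoint result from an audit pass (illustrative schema)."""
    checkpoint_id: str   # e.g. "6.3" from references/checklist.md
    status: str          # "PASS", "FAIL", or "N/A"
    location: str = ""   # section/paragraph/line; required for FAIL
    defect: str = ""     # the specific defect observed
    fix: str = ""        # the concrete fix required

    def is_actionable(self) -> bool:
        return self.status == "FAIL"

f = Finding("6.3", "FAIL", "Sec. 4.2, para. 3",
            "learning rate reported without selection rationale",
            "state the sweep range and selection criterion")
```

Only FAIL records need the full location/defect/fix triple; PASS and N/A entries stay terse, matching the bullet rules above.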
Pass 1 — Structural Integrity (Checklist §1, §4, §5, §10)
  • Trace the thesis-thread from abstract through conclusion
  • Verify section-level necessity and logical dependency ordering
  • Check introduction funnel structure and contribution enumeration
  • Verify conclusion contains no new information and maps 1:1 to stated contributions
  • Assess related work organization (taxonomic vs. annotated) and differentiation
Pass 2 — Abstract & Title Calibration (Checklist §2, §3)
  • Abstract functional completeness (context → gap → approach → results → implication)
  • Quantitative specificity in abstract
  • Title precision-scope alignment
  • Keyword-abstract coherence
Pass 3 — Technical Rigor (Checklist §6, §7)
  • Reproducibility sufficiency of methodology (document-level: does the paper describe enough? Code-level verification deferred to manuscript-provenance)
  • Assumption explicitness and notation consistency
  • Baseline adequacy, dataset characterization, statistical rigor
  • Effect size reporting, evaluation metric justification
  • Computational cost reporting (checks paper reports it; value tracing to benchmarking scripts deferred to manuscript-provenance)
Pass 4 — Argumentation Quality (Checklist §8, §9)
  • Discussion introduces no new results
  • Alternative explanations considered
  • Generalizability boundaries stated
  • Limitations genuine (not performative), preemptively addressing reviewer objections
  • Threat-to-validity taxonomy coverage
Pass 5 — Citation & Reference Hygiene (Checklist §11)
  • Citation-reference bijection (no orphans in either direction)
  • Style conformance to target venue
  • Primary source preference over secondary citations
  • Preprint-to-publication status check
  • Citation placement (claim-level, not paragraph-level)
  • Retraction check advisory
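The bijection check reduces to set comparison between cited keys and bibliography entries. A rough sketch over LaTeX source, assuming plain \cite-family commands and an already-parsed .bib key set; it ignores \nocite and multi-file inputs:

```python
import re

# \cite, \citep, \citet, starred forms, optional [..] argument.
CITE = re.compile(r"\\cite[tp]?\*?(?:\[[^\]]*\])?\{([^}]+)\}")

def citation_bijection(tex: str, bib_keys: set) -> tuple:
    """Return (cited-but-missing, defined-but-never-cited) key sets."""
    cited = set()
    for group in CITE.findall(tex):
        cited.update(k.strip() for k in group.split(","))
    return cited - bib_keys, bib_keys - cited

tex = r"As shown in \citep{smith2020, lee2021} and \cite{wu2019}."
missing, orphaned = citation_bijection(tex, {"smith2020", "lee2021", "doe2018"})
```

Here `missing` would contain wu2019 (cited, no bib entry) and `orphaned` would contain doe2018 (bib entry, never cited) — the two orphan directions the checkpoint names.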
Pass 6 — Visual & Tabular Quality (Checklist §12)
  • Sequential callout ordering
  • Resolution and legibility assessment
  • Colorblind accessibility
  • Axis labels with units, consistent visual language
  • Table alignment and significant figure consistency
Pass 7 — Prose Mechanics (Checklist §13, §14, §15)
  • Tense consistency (recommendations, not strict requirements)
  • Hedging calibration (neither overclaiming nor vacuous)
  • Passive voice patterns (advisory)
  • Nominalization reduction opportunities
  • Clarity and precision (marketing language advisory for arXiv)
  • Abbreviation hygiene (first-use expansion, consistency)
  • Mathematical typesetting consistency
Pass 7b — AI-Pattern Detection (advisory)
Scan prose sections for residual AI-writing patterns using detection rules from references/detection-patterns.md. Academic manuscripts drafted or polished with AI assistants often retain detectable tells.
Focus on patterns relevant to academic writing:
  • Significance inflation (#1) — "pivotal", "groundbreaking", "paradigm shift"
  • AI-frequency vocabulary (#7) — "delve", "landscape", "tapestry", "underscore"
  • Copula avoidance (#8) — "serves as" instead of "is"
  • Vague attributions (#5) — "experts argue", "studies have shown" without citations
  • Filler phrases (#22) — "it is important to note that"
  • Excessive hedging (#23) — beyond what epistemically appropriate hedging requires
Skip patterns that are acceptable in academic prose:
  • Passive voice — standard in methods sections
  • Formal transitions — "Furthermore", "Moreover" are conventional in academic writing
  • Title case headings — journal style may require it
This pass is MEDIUM priority. Flag findings but do not over-correct — academic conventions overlap with some AI patterns. Severity: report individual instances as LOW, flag clusters of 3+ patterns in a single paragraph as MEDIUM.
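The cluster rule (three or more patterns in one paragraph escalates LOW to MEDIUM) can be sketched with a simple vocabulary scan. The word list here is a tiny invented stand-in for the full rules in references/detection-patterns.md:

```python
import re

# Illustrative subset of AI-frequency vocabulary (pattern #7) and
# significance inflation (pattern #1); not the real detection rules.
AI_TELLS = re.compile(
    r"\b(delve|landscape|tapestry|underscores?|pivotal|groundbreaking)\b",
    re.IGNORECASE,
)

def grade_paragraph(paragraph: str) -> str:
    """OK for no tells, LOW for isolated hits, MEDIUM for a cluster of 3+."""
    hits = AI_TELLS.findall(paragraph)
    if len(hits) >= 3:
        return "MEDIUM"
    return "LOW" if hits else "OK"
```

The real pass would also exempt quoted text and apply the academic-convention skip list above before grading.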
Pass 8 — Best Practices & Reproducibility (Checklist §16, §17, §18, §19)
  • Supplementary material cross-reference integrity
  • Code/data availability statements exist in the paper (verification that claimed repos are valid and pipelines work deferred to manuscript-provenance)
  • License compatibility for third-party assets
  • Hyperlink verification and reference integrity
  • Overall clarity and accessibility assessment
Pass 9 — Claims-Evidence Calibration (Checklist §20)
This is a dedicated pass through every assertion in the manuscript.
For each claim:
  1. Grade claim strength: strong/definitive ("X causes Y"), moderate/qualified ("X improves Y under conditions Z"), or hedged/tentative ("X may contribute to Y")
  2. Grade evidence strength: direct experimental, indirect/correlational, citation-only, analogical, or no evidence
  3. Flag mismatches:
    • Overclaim: Strong claim + weak evidence → soften the claim or add evidence
    • Underclaim: Hedged language + strong evidence → sharpen the language
    • Orphaned claim: Any strength + no evidence → add evidence or remove claim
  4. Audit causal vs. correlational language against study design
  5. Check generalization scope against actual experimental conditions
  6. Verify comparative claims ("outperforms", "better than") against head-to-head evaluations actually present in the paper
  7. Flag implicit claims (e.g., "Unlike prior work, our approach handles X" implies prior work cannot — verify this)
  8. Check negation claims for evidence of absence vs. absence of evidence
This pass is HIGH priority. Claims-evidence mismatch is the single most common reason reviewers reject papers. An overclaim in the abstract poisons the entire reading.
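The strength-grading steps amount to a lookup over (claim strength, evidence strength) pairs. A schematic sketch using the grade labels above; the exact escalation policy is a judgment call, not a fixed rule:

```python
def flag_mismatch(claim: str, evidence: str) -> str:
    """claim in {strong, moderate, hedged};
    evidence in {direct, indirect, citation, analogical, none}."""
    if evidence == "none":
        return "ORPHANED: add evidence or remove claim"
    if claim == "strong" and evidence in {"indirect", "citation", "analogical"}:
        return "OVERCLAIM: soften the claim or add evidence"
    if claim == "hedged" and evidence == "direct":
        return "UNDERCLAIM: sharpen the language"
    return "OK"
```

Running every graded assertion through this table surfaces the three mismatch classes; the causal-language, scope, and implicit-claim checks (steps 4-8) still need a human read.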
Pass 10 — Narrative Flow & Coherence (Checklist §21)
Read the manuscript linearly, tracking the reader's cognitive state. At each sentence and paragraph boundary, check:
  • Does this sentence follow from the previous one, or does the reader need to make an inferential leap?
  • Does this paragraph's opening sentence state its point, or is the point buried?
  • Does each sentence start with known information and end with new information (given-new contract)?
  • Are cross-references between sentences ordered so the reader moves forward through the text, not zigzagging back?
  • Does the last sentence of each paragraph connect to the first sentence of the next paragraph?
  • Are there logic gaps where a premise is skipped because the author knows it implicitly?
  • Does every setup/promise within a section get its payoff within that section?
  • Does each section have a discernible arc (setup → content → landing)?
Flag any location where a domain-expert reader would need to re-read, scroll back, or pause to reconstruct the logical connection. These are flow breaks.
This pass is HIGH priority. Papers with strong results but poor narrative flow exhaust reviewers. A reader who has to fight the text stops trusting the author.
Pass 11 — Prose Microstructure (Checklist §22)
Sentence-level and paragraph-level patterns that compound into readability problems:
  • Ambiguous referents: "this", "it", "they" without clear antecedents
  • Information density spikes: paragraphs introducing too many new concepts at once
  • Sentences requiring multiple re-reads: excessive clause nesting, misplaced modifiers, garden-path constructions
  • Broken parallel structure in lists, comparisons, sequences
  • Semantic redundancy: same point restated in nearby paragraphs without purpose
  • Long-distance references: concepts introduced and referenced many paragraphs later without re-anchoring
  • Dangling modifiers: "Using gradient descent, the loss function converged"
This pass is MEDIUM priority on individual items but compounds — a manuscript with 20 ambiguous pronouns, 10 density spikes, and 5 dangling modifiers is materially harder to read even though no single instance is fatal.
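A crude first filter for ambiguous referents flags sentences that open with a bare demonstrative followed by a verb ("This shows..." rather than "This result shows..."). It cannot judge whether the antecedent is actually clear, so every hit still needs human review; the verb list is an invented shortlist:

```python
import re

# "This shows..." is a bare demonstrative; "This result shows..." anchors
# the referent with a noun and would not match.
COMMON_VERBS = r"(?:is|was|are|were|shows|means|suggests|implies|indicates|demonstrates|follows)"
BARE_OPENER = re.compile(rf"(?:This|It|They)\s+{COMMON_VERBS}\b")

def flag_bare_referents(sentences: list) -> list:
    """Return sentences opening with a bare demonstrative + verb."""
    return [s for s in sentences if BARE_OPENER.match(s)]
```

A proper implementation would use part-of-speech tagging; this regex only approximates the noun-vs-verb distinction via a fixed verb list.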
Pass 12 — Rendered Document Inspection (Checklist §23)
This pass requires the compiled PDF. If only LaTeX source is provided, ask the user for the compiled PDF or compile it.
Open the PDF and inspect every page at actual print scale:
  1. Figures: For each figure, zoom to the size it will appear at in the final document. Check:
    • All text (axis labels, tick labels, legend, annotations) readable
    • No label overlap, collision, or truncation
    • Legend placement not covering data
    • Annotations pointing to correct elements
  2. Tables: Check column alignment, text wrapping, no content overflow
  3. Floats: For each figure/table, locate its first text reference. Measure the page distance. Flag anything >1 page away.
  4. Page breaks: Check no table splits across pages (unless intentionally long), no equation orphaned from its introduction, no header stranded at page bottom
  5. Margins: Check no content bleeds outside margins (equations, URLs, wide tables, wide figures)
  6. Visual consistency: Font sizes across figures comparable, color usage consistent
This pass is HIGH priority. A paper with illegible axis labels or a table split across pages signals carelessness to reviewers regardless of technical quality. These defects are invisible from source and the author often doesn't notice because they read the paper in their editor, not in the compiled output.
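The float-distance rule in step 3 is mechanical once the page of each float and of its first reference is known. A sketch over precomputed page maps; extracting those maps from the PDF (e.g. with a PDF text-extraction library) is the hard part and is not shown:

```python
def far_floats(float_pages: dict, first_ref_pages: dict,
               max_distance: int = 1) -> list:
    """Flag floats placed more than max_distance pages from their first reference."""
    flagged = []
    for label, page in float_pages.items():
        ref_page = first_ref_pages.get(label)
        if ref_page is not None and abs(page - ref_page) > max_distance:
            flagged.append((label, abs(page - ref_page)))
    return flagged

# A table on page 9 first referenced on page 6 exceeds the 1-page threshold.
flags = far_floats({"fig:arch": 3, "tab:results": 9},
                   {"fig:arch": 3, "tab:results": 6})
```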
Pass 13 — Cross-Element Coherence (Checklist §24)
Read the manuscript as an integrated system. For each figure, table, and macro-injected value:
  1. Collect the element cluster: The visual/data itself, its caption, every prose passage that references it, and any macro values appearing in or near those passages
  2. Check four-way consistency: Does the prose claim match the visual? Does the caption describe the current content? Do the numbers agree across text, table, and figure? Does the qualitative language match the quantitative values?
  3. Check cross-reference accuracy: Every \ref points to the element the surrounding prose describes. After figure reordering, references often point to the wrong visual.
  4. Check macro-prose coherence: When a macro injects a number, read the sentence it sits in. Does the qualitative framing ("modest", "dramatic", "marginal", "substantial") match the actual numeric value? This is the handoff from manuscript-provenance: provenance traces the value to code, this pass verifies the prose wrapping that value is appropriate.
  5. Check temporal consistency: Do all elements appear to come from the same experimental run? A figure from one run and a table from another is a coherence failure even if both are individually correct.
If a manuscript-provenance report exists, load its macro manifest (list of all traced macro values with locations and source values) and use it as input for step 4. If no provenance report exists, extract macro values directly from .tex source.
This pass is HIGH priority. Cross-element incoherence is the most insidious class of manuscript defect — each piece looks fine in isolation, yet the system as a whole is broken. Reviewers notice because they read the document linearly and hit contradictions that the author, who edits each piece independently, cannot see.
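The macro-prose framing check in step 4 needs some mapping from qualitative words to plausible value ranges. The thresholds below are invented purely for illustration and would need calibration per metric and per field:

```python
# Hypothetical ranges for a relative-improvement metric, in percent.
# Overlapping bounds are deliberate: framing words are fuzzy.
FRAMING_RANGES = {
    "marginal": (0.0, 1.0),
    "modest": (0.5, 5.0),
    "substantial": (5.0, 30.0),
    "dramatic": (15.0, float("inf")),
}

def framing_matches(word: str, value_pct: float) -> bool:
    """True if the qualitative word is plausible for the injected value."""
    lo, hi = FRAMING_RANGES[word]
    return lo <= value_pct <= hi

# False: calling a 0.8% gain "dramatic" is a framing mismatch to flag.
framing_matches("dramatic", 0.8)
```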
Note for arXiv: Ethics statements, anonymization, page limits, and strict formatting requirements are marked N/A by default. Focus on technical quality, reproducibility, and clarity.

4. Generate Refactoring Report

Produce the report as a structured document. Use references/report-template.md as the output format.

```text
Read references/report-template.md
```
Report structure:
  1. Executive Summary — Overall quality assessment (Publication-ready / Recommend revisions / Needs work). Top 5 high-priority improvements.
  2. Per-Section Diagnostics — For each manuscript section, the specific issues found, mapped to checklist checkpoint IDs. Severity tagged as HIGH (impacts clarity/credibility), MEDIUM (noticeable quality gap), or LOW (polish/optional improvement).
  3. Cross-Cutting Issues — Problems that span multiple sections (e.g., inconsistent notation, citation patterns, clarity patterns).
  4. Priority Queue — All issues ranked by impact × effort. HIGH-impact items first, then MEDIUM items ordered by estimated fix effort (lowest effort first = quick wins).
  5. Checklist Status — The full 24-section checklist with pass/needs-work/not-applicable status per checkpoint, referencing specific locations in the manuscript.
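The impact × effort ordering of the Priority Queue can be expressed as a single sort key: severity rank first, then estimated effort ascending so that quick wins surface early. Field names here are illustrative, not mandated by the report template:

```python
SEVERITY_RANK = {"HIGH": 0, "MEDIUM": 1, "LOW": 2}

def priority_queue(issues: list) -> list:
    """Sort issues by severity, breaking ties with lower effort first."""
    return sorted(issues, key=lambda i: (SEVERITY_RANK[i["severity"]], i["effort"]))

issues = [
    {"id": "A", "severity": "MEDIUM", "effort": 1},
    {"id": "B", "severity": "HIGH", "effort": 5},
    {"id": "C", "severity": "MEDIUM", "effort": 3},
]
order = [i["id"] for i in priority_queue(issues)]  # ["B", "A", "C"]
```

The HIGH item leads regardless of effort; among equal severities, the cheaper fix comes first, matching the "quick wins" rule above.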

5. Triage and Priority Report

After completing the full scan, categorize issues:
  • HIGH — Impacts technical credibility or reproducibility (missing baselines, orphaned claims, insufficient methodology details, broken references)
  • MEDIUM — Reduces clarity or professional quality (inconsistent notation, vague claims, poor figure quality)
  • LOW — Polish issues (citation formatting variations, minor typesetting, style preferences)
For arXiv submissions, focus HIGH priority on technical quality and reproducibility. Compliance items (ethics statements, formatting) are typically LOW priority or N/A.
Present the priority queue first, then the detailed findings.

6. Output

Save the report as a Markdown file in the same directory as the manuscript, named [manuscript-name]-review-report.md.
Present the file to the user with a concise summary:
  • Quality assessment verdict
  • Count of HIGH/MEDIUM/LOW priority items
  • Top 3 recommended improvements

Core Principles

  • Focus on structure and clarity. This is a structural and technical audit. Sentence-level grammar is out of scope unless it forms a systematic pattern affecting readability.
  • Evidence-based findings. Every issue cites the specific manuscript location (section, paragraph, figure/table number). No vague "could be better."
  • Balanced severity. HIGH priority for technical credibility and reproducibility issues. MEDIUM for clarity and professional quality. LOW for style preferences. ArXiv allows more flexibility than peer-reviewed venues.
  • Context-aware recommendations. Formatting and compliance requirements vary by venue. For arXiv, prioritize technical quality over strict formatting. For journal submissions, adjust accordingly.
  • Constructive framing. Frame findings as improvements to clarity, credibility, and reproducibility rather than as rejection risks. ArXiv is more forgiving; focus on making the work accessible and trustworthy.
  • Direct communication. Report issues as issues with specific fixes, not as vague suggestions. But recognize that many "rules" are guidelines for arXiv.
  • Systematic coverage. Work through the checklist methodically. Mark items as pass/needs-work/N/A based on actual content. ArXiv-specific items (anonymization, page limits, strict templates) default to N/A.

Example Invocation Patterns

User says any of:
  • "Review my manuscript"
  • "Check this paper before I submit"
  • "Is this ready for submission"
  • "Run pre-publication review"
  • "Check my references"
  • "Does the abstract work"
  • "Review the methodology section"
  • "Pre-submission checklist"
  • "/manuscript-review"
All trigger this skill. Partial reviews (e.g., "just check citations") still run the full audit — the user benefits from comprehensive diagnostics even when they only asked about one aspect.