Whether the user provides text copy, an image, or a paper PDF, follow the same process:
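User input (copy / image / paper / requirement description)
↓
① Analysis + drawing instructions (identify the domain, extract modules, plan the layout, choose the format)
↓
② Load specialized rules (load the rules for the relevant diagram type from references/ on demand)
↓
③ Generate code (TikZ .tex or draw.io .xml + .html)
↓
④ Compile / preview verification
↓
⑤ Evaluate and score (deliver only at full score)
↓
⑥ Iterative repair (if below full score, return to ④ until 30/30)
↓
⑦ Record experience (if anything new was learned, append it to experience-log.md)
↓
Deliver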
Step ① (mandatory explicit output, cannot be skipped): Read the input and extract every module, concept, and data-flow relationship. The complete drawing instructions must be written out as text; "thinking it through and then writing code directly" is prohibited. The drawing instructions are the blueprint for all later work: module positions, connection routing, rail allocation, and label conflict avoidance are all planned in this step. Skipping it and going straight to code is like building a house without a blueprint.
The drawing instructions must include all of the following elements (none may be omitted):
- Domain identification: which field the paper belongs to and which terminology style to use
- Format selection: TikZ or draw.io, and why
- Layout strategy: estimated overall diagram size, direction of information flow, number of rows and columns, partition scheme
- Module list: name, color, shape, and approximate position (row and column) of each module
- Connection logic: which modules are connected, the connection type (data flow / control flow / feedback), and the routing (from which anchor to which anchor)
- Space planning: which side rail each cross-layer connection uses, how x coordinates are assigned when there are multiple rails, and which side of a connection its label sits on to avoid collisions
- Visual emphasis: which modules are core (bold / red) and which are auxiliary (gray)
- Visualization embedding decision (the key to hybrid diagrams): scan the modules one by one and ask whether each has information that suits a mini visualization. Decision criteria:
| Information in the paper | Visualization to embed | Module box size |
|---|---|---|
| Concrete numeric comparisons (accuracy, F1, loss) | Bar chart or horizontal bar chart | Widen to ≥5cm |
| Attention mechanism / correlation matrix | Heatmap (N×N color cells) | Increase height to ≥4cm |
| Time-series signal / waveform / spectrum | Waveform curve or spectrum bar chart | Increase height to ≥4cm |
| Classification / clustering results | Scatter plot (clusters distinguished by color) | Widen and heighten |
| Training process / convergence curve | Two-line chart (train/val) | Widen to ≥5cm |
| Spatial distribution / geographic data | Grid heatmap (colored squares) | ≥4cm×4cm |
| Model comparison (multiple baselines) | Grouped bar chart or radar chart | ≥5cm wide |
| Probability distribution / histogram | Histogram-style bar chart | ≥4cm wide |
| No concrete values, text-only description | No embedding: use a plain box with text | Standard size |
Principle: not every module needs an embedded visualization. Embed one only when the module has concrete numeric or otherwise quantifiable information; pure process/logic modules stay as plain boxes. A diagram works best when 30-50% of its modules carry a visualization: embedding everywhere is too dense, and embedding nowhere falls back to a plain block diagram. A module box with an embedded visualization should be 1.5-2 times the size of a plain box so the visualization has enough room.
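The embedding decision and the 30-50% guideline can be sketched as a small lookup plus a ratio check. This is only an illustration: the `info_kind` keys, the `Module` class, and the function names are hypothetical and do not come from the skill's rule files.

```python
from dataclasses import dataclass
from typing import Optional

# Condensed from the decision table; the keys are hypothetical labels.
EMBED_RULES = {
    "value_comparison": "bar chart",
    "attention_matrix": "heatmap",
    "time_series": "waveform curve",
    "clustering": "scatter plot",
    "training_curve": "two-line chart",
    "spatial_data": "grid heatmap",
    "model_comparison": "grouped bar chart",
    "distribution": "histogram",
}


@dataclass
class Module:
    name: str
    info_kind: Optional[str] = None  # None means no quantifiable information


def plan_embedding(modules):
    """Map each module name to a visualization, or None for a plain box."""
    return {m.name: EMBED_RULES.get(m.info_kind) for m in modules}


def embed_ratio(plan):
    """Fraction of modules that carry an embedded visualization."""
    if not plan:
        return 0.0
    return sum(v is not None for v in plan.values()) / len(plan)


def ratio_in_range(plan, low=0.3, high=0.5):
    """Check the 30-50% guideline stated in the principle above."""
    return low <= embed_ratio(plan) <= high
```

A plan that fails `ratio_in_range` is a prompt to revisit the module list, not an automatic rejection; the guideline is stated as "best", not as a hard limit.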
Node shape quick reference: processing module → rounded rectangle, input/output → blue rectangle, decision/constraint → diamond, storage → cylinder, summation/aggregation → circle, code snippet → monospaced rectangle, formula → formula box.
Connection types: core data flow → thick orange solid line, ordinary control flow → black solid line, optional/feedback → dashed line, cross-region reference → blue dashed line. Language consistency: do not mix Chinese and English anywhere in the diagram.
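The language-consistency rule lends itself to a mechanical pre-check over the planned labels. The sketch below is an assumption about how such a check might look, not part of the skill: it treats any CJK ideograph as Chinese and any run of two or more ASCII letters as English, so short tokens such as "F1" or "3×3" count as neutral.

```python
import re

CJK = re.compile(r"[\u4e00-\u9fff]")      # common CJK ideographs
LATIN_WORD = re.compile(r"[A-Za-z]{2,}")  # two or more ASCII letters


def classify(label):
    """Return 'zh', 'en', 'mixed', or 'neutral' for one label."""
    has_zh = bool(CJK.search(label))
    has_en = bool(LATIN_WORD.search(label))
    if has_zh and has_en:
        return "mixed"
    if has_zh:
        return "zh"
    if has_en:
        return "en"
    return "neutral"


def labels_are_consistent(labels):
    """True when no label is mixed and the diagram uses one language."""
    kinds = {classify(label) for label in labels}
    if "mixed" in kinds:
        return False
    return not ({"zh", "en"} <= kinds)
```

Acronyms such as "CNN" are classified as English here, so a Chinese diagram that keeps them would be flagged; whether to allow such terms is a judgment the source does not settle.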
Step ②: Based on the diagram type determined in step ①, load the corresponding specialized rule file from references/. Also read references/experience-log.md for existing experience with that diagram type.
Step ③: Generate the code according to the rules. Self-check before generating: first line of
, color definitions before
, no unclosed brackets, no
Chinese,
and the number of connections matches the drawing instructions (check them one by one, no more and no fewer).
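Two of these self-checks, unclosed brackets and the connection count, can be approximated mechanically for TikZ output. The sketch below rests on stated assumptions: every connection is its own `\draw` statement, `\draw` is used for nothing else, and escaped characters such as `\{` are not real brackets. It ignores TeX comments and half-open interval notation, so treat a failure as a hint to inspect the source, not as proof of an error.

```python
import re

CLOSERS = {")": "(", "]": "[", "}": "{"}


def has_unclosed_brackets(source):
    """Return True if (), [] or {} are unbalanced, skipping escaped chars."""
    stack = []
    skip_next = False
    for ch in source:
        if skip_next:
            skip_next = False
            continue
        if ch == "\\":
            skip_next = True  # the next char is escaped or starts a macro name
        elif ch in "([{":
            stack.append(ch)
        elif ch in CLOSERS:
            if not stack or stack.pop() != CLOSERS[ch]:
                return True
    return bool(stack)


def count_connections(source):
    """Count \\draw statements (assumed: one per connection)."""
    return len(re.findall(r"\\draw\b", source))


def connections_match(source, planned):
    """Compare the drawn connections with the number in the drawing instructions."""
    return count_connections(source) == planned
```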
Step ④: TikZ compilation and verification:
- Font availability check (required before compiling): run
fc-list | grep "font name"
to confirm that the CJK font exists. Choose by platform priority: macOS → PingFang SC / Heiti SC; Linux → Noto Sans CJK SC; Windows → SimHei / Microsoft YaHei. If the font named in the template is not available, replace it with a locally available font before compiling rather than discovering the problem afterwards.
- Compile:
xelatex -interaction=nonstopmode
- Compile log check (critical): after compiling, always run
grep "Missing character" *.log
xelatex reports missing glyphs as a warning, not an error. The PDF is still produced but all Chinese text is missing. This is a silent failure: without checking the log, the compilation looks successful.
- Convert to a preview image: convert the PDF to PNG and check text readability.
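The font check and the log check can be sketched as two pure helpers that operate on text, so they can be exercised without a TeX installation. The font priorities are copied from the list above; the function names are hypothetical, and the platform keys follow the values returned by Python's `platform.system()`.

```python
import re
from pathlib import Path

# Platform priorities from the font check above.
FONT_PRIORITY = {
    "Darwin": ["PingFang SC", "Heiti SC"],
    "Linux": ["Noto Sans CJK SC"],
    "Windows": ["SimHei", "Microsoft YaHei"],
}

MISSING = re.compile(r"^Missing character:.*$", re.MULTILINE)


def pick_font(fc_list_output, system):
    """Return the first preferred font found in `fc-list` output, else None."""
    for name in FONT_PRIORITY.get(system, []):
        if name in fc_list_output:
            return name
    return None


def missing_characters(log_text):
    """Return every 'Missing character' line found in a xelatex log."""
    return MISSING.findall(log_text)


def log_is_clean(log_path):
    """True when the log at log_path has no missing-character warnings."""
    text = Path(log_path).read_text(encoding="utf-8", errors="replace")
    return not missing_characters(text)
```

In practice `pick_font` would be fed the captured output of `fc-list` and the result of `platform.system()`; a `None` result means no preferred font is installed and the template font needs attention before compiling.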
draw.io verification:
- XML validity:
xmllint --noout file.drawio
- Export a preview image:
drawio -x -f pdf -o output.pdf input.drawio && pdftoppm -png -r 300 output.pdf output-preview
(direct PNG export from the draw.io CLI has compatibility problems in some environments; going through PDF to PNG is more stable)
- If the drawio command is not available, prompt the user to install it with
brew install --cask drawio
- Check text readability and layout in the preview image
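Where `xmllint` is not installed, a well-formedness check can be approximated with Python's standard library. The sketch below is a fallback under that assumption, not a replacement for the commands above; it also assembles the two export commands exactly as listed and reports whether `drawio` is on the PATH. The function names are hypothetical.

```python
import shutil
import xml.etree.ElementTree as ET


def is_well_formed(xml_text):
    """True if the text parses as XML (well-formedness only, no schema)."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError:
        return False
    return True


def export_commands(src, pdf="output.pdf", prefix="output-preview"):
    """Build the PDF export and PNG conversion commands from the steps above."""
    return [
        ["drawio", "-x", "-f", "pdf", "-o", pdf, src],
        ["pdftoppm", "-png", "-r", "300", pdf, prefix],
    ]


def drawio_available():
    """True if a `drawio` executable is found on the PATH."""
    return shutil.which("drawio") is not None
```

Each command list can be passed to `subprocess.run(cmd, check=True)` once `drawio_available()` returns True; otherwise the install prompt applies.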
Step ⑤: Look at the rendered PNG before scoring; scoring from code logic alone is prohibited. Load references/review-checklist.md and work through it item by item: the visual review checklist, the designer-perspective review, the specific inspection points (40 items), and the six-dimension scoring. Deliver only when the total score is 30 and every review item passes.
Step ⑥: Iterative repair: if the score is below full marks, fix the problems and return to ④, repeating until the score is 30/30.
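The delivery gate and the repair loop can be sketched as follows. The `review` and `repair` callables stand in for steps ④-⑤ and the fix itself, and `max_rounds` is a safety cap added for this sketch; the source only says to repeat until 30/30. The dimension names in any score dictionary are placeholders, since the six dimensions are defined in references/review-checklist.md.

```python
def ready_to_deliver(scores, checks, full_score=30):
    """Deliver only at a full total score with every review item passing."""
    return sum(scores.values()) == full_score and all(checks.values())


def iterate_until_full_score(review, repair, max_rounds=10):
    """Run review/repair rounds; return the round number that passed.

    review() -> (scores, checks), produced after rendering and inspecting
    the preview image. repair(scores, checks) applies fixes.
    max_rounds is an assumed guard against endless loops.
    """
    for round_no in range(1, max_rounds + 1):
        scores, checks = review()
        if ready_to_deliver(scores, checks):
            return round_no
        repair(scores, checks)
    raise RuntimeError("full score not reached within max_rounds")
```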
Step ⑦: After the diagram is finished, if any problem took more than 2 attempts to solve, or if an effective technique was discovered along the way, append it to references/experience-log.md.
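The logging rule can be sketched as a small helper that decides whether an entry is warranted and appends it. The entry layout (date, diagram type, problem, fix) is a hypothetical format chosen for this sketch; the actual structure of experience-log.md is not specified here.

```python
from datetime import date
from pathlib import Path


def should_log(attempts, found_technique=False):
    """Log when a problem took more than 2 attempts or a technique was found."""
    return attempts > 2 or found_technique


def append_experience(log_path, diagram_type, problem, fix, today=None):
    """Append one entry to the log file and return the text written."""
    stamp = today or date.today().isoformat()
    entry = f"\n- {stamp} [{diagram_type}] {problem} -> {fix}\n"
    with Path(log_path).open("a", encoding="utf-8") as handle:
        handle.write(entry)
    return entry
```

Appending rather than rewriting keeps earlier entries intact, which matches the instruction to add new findings to the existing log.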