academic-plotting
Academic Plotting for ML Papers
Generate publication-quality figures for ML/AI conference papers. Two distinct workflows:
- Diagram figures (architecture, system design, workflows, pipelines) — AI image generation via Gemini
- Data figures (line charts, bar charts, scatter plots, heatmaps, ablations) — matplotlib/seaborn
When to Use Which Workflow
| Figure Type | Tool | Why |
|---|---|---|
| Architecture / system diagram | Gemini (Workflow 1) | Complex spatial layouts with boxes, arrows, labels |
| Workflow / pipeline / lifecycle | Gemini (Workflow 1) | Multi-step processes with connections |
| Bar chart, line plot, scatter | matplotlib (Workflow 2) | Precise numerical data, reproducible |
| Heatmap, confusion matrix | matplotlib/seaborn (Workflow 2) | Structured grid data |
| Ablation table as chart | matplotlib (Workflow 2) | Grouped bars or line comparisons |
| Pie / donut chart | matplotlib (Workflow 2) | Proportional data (use sparingly in ML papers) |
| Training curves | matplotlib (Workflow 2) | Loss/accuracy over steps/epochs |
Rule of thumb: If the figure has numerical axes, use matplotlib. If the figure has boxes and arrows, use Gemini.
Step 0: Context Analysis & Extraction
The user will typically provide one of these inputs — not a ready-made specification:
| Input Type | Example | What to Extract |
|---|---|---|
| Full paper / section draft | "Here's our method section..." | System components, their relationships, data flow |
| Description paragraph | "Our system has three layers that..." | Key entities, hierarchy, connections |
| Raw results / data table | "MMLU: 85.2, HumanEval: 72.1..." | Metrics, methods, comparison structure |
| CSV / JSON data | Experiment log files | Variables, trends, grouping dimensions |
| Vague request | "Make a figure for the overview" | Read surrounding paper context to infer content |
Extraction Workflow
For diagrams (research context → architecture figure):
- Read the provided context: paper section, abstract, or description paragraph
- Identify visual entities: what are the main components, modules, or stages?
  - Look for: nouns that represent system parts, named modules, layers, stages
  - Count them: if there are more than 8 top-level entities, consider grouping them into sections
- Identify relationships: how do components connect?
  - Look for: verbs describing data flow ("sends to", "queries", "feeds into")
  - Classify: data flow (solid arrow), control flow (gray), error path (dashed red)
- Determine layout pattern:
  - Sequential pipeline → left-to-right flow
  - Layered architecture → horizontal bands stacked vertically
  - Hub-and-spoke → central node with radiating connections
  - Hierarchical → top-down tree
- Assign colors: one accent color per logical group/layer
- Write every label exactly: extract exact terminology from the paper text
For data charts (results → figure):
- Read the provided data: table, paragraph with numbers, CSV, or JSON
- Identify dimensions:
  - What is being compared? (methods, models, configurations) → categorical axis
  - What is the metric? (accuracy, loss, latency, F1) → value axis
  - Is there a time/step dimension? → line plot
  - Are there multiple metrics? → multi-panel or grouped bars
- Choose chart type automatically using this priority:
  - Has a step/time axis → line plot
  - Comparing N methods on M benchmarks → grouped bar chart
  - Single ranking → horizontal bar (leaderboard)
  - Correlation between two continuous variables → scatter plot
  - Square matrix of values → heatmap
  - Proportional breakdown → stacked bar (avoid pie charts)
- Determine figure sizing: single column vs full width, based on data density
- Highlight "our method": identify which entry is the paper's contribution and give it a distinct color
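The chart-type priority above can be sketched as a small function that returns the first match. The `DataShape` fields and the function name are hypothetical, introduced only for illustration; how a real table is summarized into these flags is left open.

```python
from dataclasses import dataclass


@dataclass
class DataShape:
    """Hypothetical summary of a results table (all field names are assumptions)."""
    has_step_axis: bool = False        # values indexed by training step or time
    n_methods: int = 1                 # entries being compared
    n_benchmarks: int = 1              # metrics or datasets per entry
    two_continuous_vars: bool = False  # x/y pairs without a categorical axis
    square_matrix: bool = False        # e.g. a confusion matrix
    proportional: bool = False         # parts of a whole


def choose_chart_type(shape: DataShape) -> str:
    """Return the first chart type whose condition matches, in priority order."""
    if shape.has_step_axis:
        return "line plot"
    if shape.n_methods > 1 and shape.n_benchmarks > 1:
        return "grouped bar chart"
    if shape.n_methods > 1:
        return "horizontal bar"
    if shape.two_continuous_vars:
        return "scatter plot"
    if shape.square_matrix:
        return "heatmap"
    if shape.proportional:
        return "stacked bar"
    raise ValueError("no chart type matched; inspect the data manually")
```

Because the checks run in order, a table that has both a step axis and several methods still resolves to a line plot, which matches the stated priority.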
Auto-Detection Examples
Context → Diagram: "Our system has a Planner, Executor, and Verifier. Planner sends plans to Executor, Executor returns results to Verifier, Verifier feeds back to Planner on failure."
→ 3 entities, cycle layout, dashed feedback arrow → Workflow 1 (Gemini)
Data → Chart: "GPT-4: MMLU 86.4, HumanEval 67.0. Ours: 88.1, 71.2. Llama-3: 79.3, 62.1."
→ 3 methods × 2 benchmarks → Workflow 2 (grouped bar), highlight "Ours" in coral
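The diagram example can be made concrete with a short sketch that records the arrows and checks whether they form a cycle, which is what suggests a cycle layout. The tuple layout, the labels, and the function name are assumptions chosen for illustration.

```python
# Arrows from the Planner/Executor/Verifier example: (source, target, label, style).
EDGES = [
    ("Planner", "Executor", "plans", "solid"),
    ("Executor", "Verifier", "results", "solid"),
    ("Verifier", "Planner", "feedback on failure", "dashed"),
]


def has_cycle(edges):
    """Depth-first search for a directed cycle among the listed components."""
    graph = {}
    for src, dst, _label, _style in edges:
        graph.setdefault(src, []).append(dst)
        graph.setdefault(dst, [])
    state = {node: 0 for node in graph}  # 0 = unvisited, 1 = on stack, 2 = done

    def visit(node):
        state[node] = 1
        for nxt in graph[node]:
            # Reaching a node that is still on the stack closes a cycle.
            if state[nxt] == 1 or (state[nxt] == 0 and visit(nxt)):
                return True
        state[node] = 2
        return False

    return any(state[n] == 0 and visit(n) for n in graph)


entities = sorted({name for edge in EDGES for name in edge[:2]})
layout = "cycle" if has_cycle(EDGES) else "left-to-right pipeline"
print(len(entities), "entities,", layout, "layout")
```

Dropping the feedback arrow leaves a plain chain, for which the same check would suggest a left-to-right pipeline instead.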
Workflow 1: Architecture & System Diagrams (AI Image Generation)
Use Gemini 3 Pro Image Preview to generate diagrams. Choose a visual style first — this is the single biggest factor in whether the figure looks professional or generic.
Visual Styles
Pick one style per paper (all figures should be consistent):
Style A: "Sketch / 简笔画" (Hand-Drawn)
Warm, approachable, memorable. Ideal for overview figures and system introductions. Looks like a whiteboard sketch refined by a designer.
VISUAL STYLE — HAND-DRAWN SKETCH:
- Slightly irregular, hand-drawn line quality — lines wobble gently, not perfectly straight
- Rounded, soft shapes with visible pen strokes (like drawn with a thick felt-tip marker)
- Warm off-white background (#FAFAF7), NOT pure white
- Fill colors are soft watercolor-like washes: muted blue (#D6E4F0), soft peach (#F5DEB3),
light sage (#D4E6D4), pale lavender (#E6DFF0)
- Borders are dark charcoal (#2C2C2C) with 2-3px line weight, slightly uneven
- Arrows are hand-drawn with slight curves, ending in simple open arrowheads (not filled triangles)
- Text uses a rounded sans-serif font (like Comic Neue or Architects Daughter feel)
- Small doodle-style icons inside boxes: a tiny gear ⚙ for processing, a lightbulb 💡 for ideas,
a magnifying glass 🔍 for search — rendered as simple line drawings, NOT emoji
- Overall feel: a carefully drawn whiteboard diagram, clean but with personality
- NO clip art, NO stock icons, NO photorealistic elements
Style B: "Modern Minimal" (Clean & Bold)
Confident, authoritative. Best for method figures where precision matters.
VISUAL STYLE — MODERN MINIMAL:
- Ultra-clean geometric shapes with crisp edges
- Bold color blocks as backgrounds for sections — NOT just accent bars, but full section fills
using desaturated tones: slate blue (#E8EDF2), warm sand (#F5F0E8), cool mint (#E8F2EE)
- Component boxes have ROUNDED CORNERS (12px radius), NO visible border — they float on
the section background using subtle shadow (1px, 4px blur, rgba(0,0,0,0.06))
- ONE accent color per section used sparingly on key elements: Deep blue (#2563EB),
Emerald (#059669), Amber (#D97706), Rose (#E11D48)
- Arrows are thin (1.5px), dark gray (#6B7280), with small filled circle at source
and clean arrowhead at target — NOT thick colored arrows
- Typography: Inter or system sans-serif, title 600 weight, body 400 weight
- Labels INSIDE boxes, not beside them
- Generous whitespace — at least 24px between elements
- NO decorative elements, NO icons; let the structure speak
Style C: "Illustrated Technical" (Icon-Rich)
Engaging, explanatory. Good for tutorial-style papers and figures that need to be self-explanatory.
VISUAL STYLE — ILLUSTRATED TECHNICAL:
- Each major component has a small MEANINGFUL ICON drawn in a consistent line-art style
(single color, 2px stroke, ~24x24px): brain icon for reasoning, database cylinder for storage,
arrow-loop for iteration, network nodes for communication
- Components sit inside soft rounded rectangles with a LEFT COLOR STRIP (4px wide)
- Background is pure white, but each logical group has a very faint colored region behind it
(#F8FAFC for blue group, #FFF8F0 for orange group)
- Connections use CURVED bezier paths (not straight lines), colored by SOURCE component
- Key data flows are THICKER (3px) than secondary flows (1px, dashed)
- Small annotation badges on arrows: "×N" for repeated operations, "optional" in italics
- Title labels are ABOVE each section in small caps, letter-spaced
- Overall: like a well-designed API documentation diagram
Style D: "Accent Bar" (Classic Academic)
The default academic style. Safe for any venue, works well in grayscale.
VISUAL STYLE — CLASSIC ACCENT BAR:
- Horizontal section bands stacked vertically, pale gray (#F7F7F5) fill
- Thick colored LEFT ACCENT BAR (8px) distinguishes each section
- Content boxes: white fill, thin #DDD border, 4px rounded corners
- Section palette: Blue #4A90D9, Teal #5BA58B, Amber #D4A252, Slate #7B8794
- Sans-serif typography (Helvetica/Arial), bold titles, regular body
- Colored arrows match their SOURCE section
- Clean, flat, zero decoration
Curated Color Palettes
"Ocean Dusk" (professional, calming; default recommendation):
deep teal, teal, gold, sandy orange, burnt coral
#264653 #2A9D8F #E9C46A #F4A261 #E76F51
"Ink & Wash" (for the Sketch / 简笔画 style):
charcoal ink, washed blue, washed wheat, washed sage, washed lavender
#2C2C2C #D6E4F0 #F5DEB3 #D4E6D4 #E6DFF0
"Nord" (for Modern Minimal):
polar night, frost blue, aurora green, aurora yellow, aurora red
#2E3440 #5E81AC #A3BE8C #EBCB8B #BF616A
"Okabe-Ito" (universal colorblind-safe, required for data charts):
orange, sky blue, green, yellow, blue, vermillion, pink
#E69F00 #56B4E9 #009E73 #F0E442 #0072B2 #D55E00 #CC79A7
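For reuse in scripts, the palettes can be kept in one dictionary. The sketch below also adds a rough grayscale check, since figures are often printed in black and white. The dictionary and helper names are illustrative, and the Rec. 601 luma weights are only an approximation of perceived brightness, not a full colorblindness simulation.

```python
# Hex codes copied from the palette list; names of the dict and helpers are assumptions.
PALETTES = {
    "Ocean Dusk": ["#264653", "#2A9D8F", "#E9C46A", "#F4A261", "#E76F51"],
    "Ink & Wash": ["#2C2C2C", "#D6E4F0", "#F5DEB3", "#D4E6D4", "#E6DFF0"],
    "Nord": ["#2E3440", "#5E81AC", "#A3BE8C", "#EBCB8B", "#BF616A"],
    "Okabe-Ito": ["#E69F00", "#56B4E9", "#009E73", "#F0E442",
                  "#0072B2", "#D55E00", "#CC79A7"],
}


def hex_to_rgb(code):
    """Convert '#RRGGBB' to an (r, g, b) tuple of ints in 0..255."""
    code = code.lstrip("#")
    if len(code) != 6:
        raise ValueError(f"expected 6 hex digits, got {code!r}")
    return tuple(int(code[i:i + 2], 16) for i in (0, 2, 4))


def gray_level(code):
    """Rough brightness in 0..255 using Rec. 601 luma weights."""
    r, g, b = hex_to_rgb(code)
    return 0.299 * r + 0.587 * g + 0.114 * b


def closest_gray_pair(palette):
    """Return (gap, darker, lighter) for the two colors nearest in gray level."""
    ranked = sorted(palette, key=gray_level)
    gaps = [(gray_level(b) - gray_level(a), a, b)
            for a, b in zip(ranked, ranked[1:])]
    return min(gaps)
```

A small gap from `closest_gray_pair` is a hint that two series may need different markers or line styles to stay distinguishable in print.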
Checklist
- Extract from context: Read paper/description, identify entities and relationships
- Choose visual style (A/B/C/D) to match the paper's tone and venue
- Choose color palette, or use one consistent with existing paper figures
- Obtain Gemini API key (env var `GEMINI_API_KEY`)
- Write a detailed prompt: style block + layout + connections + constraints
- Generate script at `figures/gen_fig_<name>.py`, run for 3 attempts
- Review, select best, save as `figures/fig_<name>.png`
Prompt Structure (6 Sections)
Every Gemini prompt must include these sections in order:
1. FRAMING (5 lines): "Create a [STYLE_NAME]-style technical diagram for a
[VENUE] paper. The diagram should feel [ADJECTIVES]..."
2. VISUAL STYLE (20-30 lines): Copy the full style block from above (A/B/C/D).
This is the most important section — it determines the entire visual character.
3. COLOR PALETTE (10 lines): Exact hex codes for every color used.
4. LAYOUT (50-150 lines): Every component, box, section — exact text, spatial
arrangement, and grouping. Be exhaustively specific.
5. CONNECTIONS (30-80 lines): Every arrow individually — source, target, style,
label, routing direction.
6. CONSTRAINTS (10 lines): What NOT to include. Adapt per style — e.g., sketch
style allows slight irregularity but still no clip art.
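One way to enforce the six-section order is to assemble the prompt from a dictionary and fail early when a section is missing. This is a minimal sketch; `SECTION_ORDER` and `build_prompt` are hypothetical names, and the numbered-header format is an assumption rather than something Gemini requires.

```python
SECTION_ORDER = [
    "FRAMING",
    "VISUAL STYLE",
    "COLOR PALETTE",
    "LAYOUT",
    "CONNECTIONS",
    "CONSTRAINTS",
]


def build_prompt(sections):
    """Join the six sections in the required order under numbered headers."""
    missing = [name for name in SECTION_ORDER
               if not sections.get(name, "").strip()]
    if missing:
        raise ValueError(f"missing prompt sections: {', '.join(missing)}")
    blocks = [
        f"{i}. {name}\n{sections[name].strip()}"
        for i, name in enumerate(SECTION_ORDER, start=1)
    ]
    return "\n\n".join(blocks)
```

The returned string can be assigned to the `PROMPT` variable of a generation script, so the section text lives in one place and the ordering cannot drift.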
Generation Script Template
```python
#!/usr/bin/env python3
"""Generate [FIGURE_NAME] diagram using Gemini image generation."""
import os
import sys
import time

from google import genai
from google.genai import types

# Read the key from the environment; never hardcode it.
API_KEY = os.environ.get("GEMINI_API_KEY")
if not API_KEY:
    print("ERROR: Set GEMINI_API_KEY environment variable.")
    print("  Get a key at: https://aistudio.google.com/apikey")
    sys.exit(1)

MODEL = "gemini-3-pro-image-preview"
OUTPUT_DIR = os.path.dirname(os.path.abspath(__file__))
client = genai.Client(api_key=API_KEY)

PROMPT = """
[PASTE YOUR 6-SECTION PROMPT HERE]
"""


def generate_image(prompt_text, attempt_num):
    """Request one image; return the saved path, or None if no image came back."""
    print(f"\n{'='*60}\nAttempt {attempt_num}\n{'='*60}")
    try:
        response = client.models.generate_content(
            model=MODEL,
            contents=prompt_text,
            config=types.GenerateContentConfig(
                response_modalities=["IMAGE", "TEXT"],
            ),
        )
        # Replace NAME with the figure's <name> before running.
        output_path = os.path.join(OUTPUT_DIR, f"fig_NAME_attempt{attempt_num}.png")
        for part in response.candidates[0].content.parts:
            if part.inline_data:
                # Image bytes arrive inline; write them straight to disk.
                with open(output_path, "wb") as f:
                    f.write(part.inline_data.data)
                print(f"Saved: {output_path} ({os.path.getsize(output_path):,} bytes)")
                return output_path
            elif part.text:
                print(f"Text: {part.text[:300]}")
        print("WARNING: No image in response")
        return None
    except Exception as e:
        print(f"ERROR: {e}")
        return None


def main():
    results = []
    for i in range(1, 4):  # always make 3 attempts
        if i > 1:
            time.sleep(2)  # short pause between requests
        path = generate_image(PROMPT, i)
        if path:
            results.append(path)
    if not results:
        print("All attempts failed!")
        sys.exit(1)
    print(f"\nGenerated {len(results)} attempts. Review and pick the best.")


if __name__ == "__main__":
    main()
```
Key Rules
- Always 3 attempts: quality varies significantly between runs
- Style block is mandatory: without it, Gemini defaults to a generic corporate look
- Never hardcode API keys: use `os.environ.get("GEMINI_API_KEY")`
- Save generation scripts: reproducibility is critical
- Specify every label exactly: Gemini may misspell or rearrange text
Full prompt examples per style: See references/diagram-generation.md
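After reviewing the three attempts, the chosen one still has to be saved under the final name. A minimal sketch of that step follows; `promote_attempt` is a hypothetical helper, and it assumes the attempts follow the `fig_<name>_attempt<N>.png` pattern.

```python
import shutil
from pathlib import Path


def promote_attempt(figures_dir, name, attempt_num):
    """Copy fig_<name>_attempt<N>.png to fig_<name>.png, keeping all attempts."""
    figures_dir = Path(figures_dir)
    src = figures_dir / f"fig_{name}_attempt{attempt_num}.png"
    if not src.is_file():
        raise FileNotFoundError(src)
    dst = figures_dir / f"fig_{name}.png"
    shutil.copyfile(src, dst)  # copy rather than move, so attempts stay for comparison
    return dst
```

Copying instead of renaming keeps every attempt on disk, so a later reviewer can still compare the alternatives.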
Workflow 2: Data-Driven Charts (matplotlib/seaborn)
For any figure with numerical data, axes, or quantitative comparisons.
Checklist
- Extract from context: Parse results/data, identify methods, metrics, and comparison structure
- Auto-select chart type based on data dimensions (see decision guide below)
- Prepare data (CSV, dict, or inline arrays)
- Apply publication styling (fonts, colors, sizes)
- Highlight "our method" with a distinct color
- Export as both PDF (vector) and PNG (300 DPI)
- Verify LaTeX font compatibility
- Save script at `figures/gen_fig_<name>.py`
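The export step of the checklist (PDF plus 300 DPI PNG) can be wrapped in one helper so that no script forgets either format. This is a sketch under assumptions: `save_figure` and its parameters are hypothetical names, and the `Agg` backend is selected only so the code runs without a display.

```python
from pathlib import Path

import matplotlib

matplotlib.use("Agg")  # headless backend; an assumption for script or CI use
import matplotlib.pyplot as plt


def save_figure(fig, name, out_dir="figures", dpi=300):
    """Save fig as fig_<name>.pdf (vector) and fig_<name>.png (raster)."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    pdf_path = out / f"fig_{name}.pdf"
    png_path = out / f"fig_{name}.png"
    fig.savefig(pdf_path, bbox_inches="tight")
    fig.savefig(png_path, dpi=dpi, bbox_inches="tight")
    return pdf_path, png_path
```

A call such as `save_figure(fig, "training")` would then write both files next to the generation script's `figures/` directory.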
Chart Type Decision Guide
| Data Pattern | Best Chart | Notes |
|---|---|---|
| Trend over time/steps | Line plot | Training curves, scaling laws |
| Comparing categories | Grouped bar chart | Model comparisons, ablations |
| Distribution | Violin / box plot | Score distributions across methods |
| Correlation | Scatter plot | Embedding analysis, metric correlation |
| Grid of values | Heatmap | Attention maps, confusion matrices |
| Part of whole | Stacked bar (not pie) | Prefer stacked bar over pie in ML papers |
| Many methods, one metric | Horizontal bar | Leaderboard-style comparisons |
Publication Styling Template
```python
import matplotlib.pyplot as plt
import numpy as np

# --- Publication defaults (polished, not generic) ---
plt.rcParams.update({
    "font.family": "serif", "font.serif": ["Times New Roman", "DejaVu Serif"],
    "font.size": 10, "axes.titlesize": 11, "axes.titleweight": "bold",
    "axes.labelsize": 10, "legend.fontsize": 8.5, "legend.frameon": False,
    "figure.dpi": 300, "savefig.dpi": 300, "savefig.bbox": "tight",
    "axes.spines.top": False, "axes.spines.right": False,
    "axes.grid": True, "grid.alpha": 0.15, "grid.linestyle": "-",
    "lines.linewidth": 1.8, "lines.markersize": 5,
})

# --- "Ocean Dusk" palette (professional, distinctive, colorblind-safe) ---
COLORS = ["#264653", "#2A9D8F", "#E9C46A", "#F4A261", "#E76F51",
          "#0072B2", "#56B4E9", "#8C8C8C"]
OUR_COLOR = "#E76F51"       # coral: warm, stands out
BASELINE_COLOR = "#B0BEC5"  # cool gray: recedes
FIG_SINGLE, FIG_FULL = (3.25, 2.5), (6.75, 2.8)  # (width, height) in inches
```
Common Chart Patterns
Line plot (training curves), with markers and confidence bands:
```python
# Assumes `steps` (1-D array) and `results` ({method: (mean, std)} as NumPy arrays),
# plus COLORS, OUR_COLOR, and FIG_SINGLE from the styling template.
fig, ax = plt.subplots(figsize=FIG_SINGLE)
markers = ["o", "s", "^", "D", "v"]
for i, (method, (mean, std)) in enumerate(results.items()):
    color = OUR_COLOR if method == "Ours" else COLORS[i % len(COLORS)]
    ax.plot(steps, mean, label=method, color=color,
            marker=markers[i % len(markers)], markevery=max(1, len(steps) // 8),
            markersize=4, zorder=3)
    # Shaded band of one standard deviation around the mean
    ax.fill_between(steps, mean - std, mean + std, color=color, alpha=0.12)
ax.set_xlabel("Training Steps")
ax.set_ylabel("Accuracy (%)")
ax.legend(loc="lower right")
fig.savefig("figures/fig_training.pdf")
fig.savefig("figures/fig_training.png", dpi=300)
```
Grouped bar chart (ablation), with value labels:
```python
# Assumes `categories` (list of labels) and `methods` ({method: list of scores}),
# plus COLORS, OUR_COLOR, and FIG_FULL from the styling template.
fig, ax = plt.subplots(figsize=FIG_FULL)
x = np.arange(len(categories))
n = len(methods)
width = 0.7 / n  # each group spans 70% of its slot, split evenly across methods
for i, (method, scores) in enumerate(methods.items()):
    color = OUR_COLOR if method == "Ours" else COLORS[i % len(COLORS)]
    offset = (i - n / 2 + 0.5) * width  # center the group on its tick
    bars = ax.bar(x + offset, scores, width * 0.9, label=method, color=color,
                  edgecolor="white", linewidth=0.5)
    for bar, s in zip(bars, scores):
        ax.text(bar.get_x() + bar.get_width() / 2, bar.get_height() + 0.3,
                f"{s:.1f}", ha="center", va="bottom", fontsize=7, color="#444")
ax.set_xticks(x)
ax.set_xticklabels(categories)
ax.set_ylabel("Score")
ax.legend(ncol=min(n, 4))
fig.savefig("figures/fig_ablation.pdf")
```
Heatmap, with a sequential colormap and clean cell borders:
```python
import seaborn as sns

# Assumes `matrix` is a 2-D array or DataFrame, such as a confusion matrix.
fig, ax = plt.subplots(figsize=(4, 3.5))
sns.heatmap(matrix, annot=True, fmt=".2f", cmap="YlOrRd", ax=ax,
            cbar_kws={"shrink": 0.75, "aspect": 20},
            linewidths=1.5, linecolor="white",  # white gaps between cells
            annot_kws={"size": 8, "weight": "medium"})
ax.set_xlabel("Predicted")
ax.set_ylabel("Actual")
fig.savefig("figures/fig_confusion.pdf")
```
Horizontal bar (leaderboard), with "our method" highlight:
```python
# Assumes `models` (list of names), `scores` (list of floats), and `our_idx`
# (index of the paper's method), plus the styling-template constants.
fig, ax = plt.subplots(figsize=FIG_SINGLE)
y_pos = np.arange(len(models))
colors = [BASELINE_COLOR] * len(models)
colors[our_idx] = OUR_COLOR  # only the paper's method gets the accent color
bars = ax.barh(y_pos, scores, color=colors, height=0.55,
               edgecolor="white", linewidth=0.5)
ax.set_yticks(y_pos)
ax.set_yticklabels(models)
ax.set_xlabel("Accuracy (%)")
ax.invert_yaxis()  # first entry at the top
for bar, s in zip(bars, scores):
    ax.text(bar.get_width() + 0.3, bar.get_y() + bar.get_height() / 2,
            f"{s:.1f}", va="center", fontsize=8, color="#444")
fig.savefig("figures/fig_leaderboard.pdf")
```
Full pattern library (scaling laws, violin plots, multi-panel, radar): See references/data-visualization.md
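As a self-contained illustration, the sketch below applies the grouped-bar pattern to the numbers from the auto-detection example (GPT-4, Ours, and Llama-3 on MMLU and HumanEval). The function name, the baseline color choice, and the `Agg` backend are assumptions made so the snippet runs on its own.

```python
import matplotlib

matplotlib.use("Agg")  # headless backend; an assumption for running without a display
import matplotlib.pyplot as plt
import numpy as np

benchmarks = ["MMLU", "HumanEval"]
scores = {
    "GPT-4": [86.4, 67.0],
    "Llama-3": [79.3, 62.1],
    "Ours": [88.1, 71.2],
}
OUR_COLOR = "#E76F51"                      # coral highlight
BASELINE_COLORS = ["#264653", "#2A9D8F"]   # two Ocean Dusk tones for baselines


def plot_grouped_bars(scores, benchmarks, ours="Ours"):
    """Draw one bar group per benchmark and highlight the `ours` entry."""
    fig, ax = plt.subplots(figsize=(3.25, 2.5))
    x = np.arange(len(benchmarks))
    n = len(scores)
    width = 0.7 / n
    for i, (method, values) in enumerate(scores.items()):
        if method == ours:
            color = OUR_COLOR
        else:
            color = BASELINE_COLORS[i % len(BASELINE_COLORS)]
        offset = (i - n / 2 + 0.5) * width  # center each group on its tick
        ax.bar(x + offset, values, width * 0.9, label=method, color=color)
    ax.set_xticks(x)
    ax.set_xticklabels(benchmarks)
    ax.set_ylabel("Score")
    ax.legend(frameon=False)
    return fig, ax


fig, ax = plot_grouped_bars(scores, benchmarks)
```

Placing "Ours" last in the dictionary puts the highlighted bar at the right edge of each group, where the eye tends to land when reading left to right.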
Publication Style Quick Reference
| Venue | Single Col | Full Width | Font |
|---|---|---|---|
| NeurIPS | 5.5 in | 5.5 in | Times |
| ICML | 3.25 in | 6.75 in | Times |
| ICLR | 5.5 in | 5.5 in | Times |
| ACL | 3.3 in | 6.8 in | Times |
| AAAI | 3.3 in | 7.0 in | Times |
Always export data charts as PDF for vector quality. Use PNG as the final format only for AI-generated diagrams, which are raster images.
Venue-specific details, LaTeX integration, font matching, accessibility checklist: See references/style-guide.md
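The widths in the table can be kept in one lookup so that figure sizes are derived rather than typed by hand. In this sketch the `aspect` parameter and the function name are assumptions; the table itself only specifies widths.

```python
# (single column, full width) in inches, copied from the table above.
VENUE_WIDTHS_IN = {
    "NeurIPS": (5.5, 5.5),
    "ICML": (3.25, 6.75),
    "ICLR": (5.5, 5.5),
    "ACL": (3.3, 6.8),
    "AAAI": (3.3, 7.0),
}


def figsize_for(venue, full_width=False, aspect=0.75):
    """Return (width, height) in inches; height = width * aspect (assumed ratio)."""
    single, full = VENUE_WIDTHS_IN[venue]
    width = full if full_width else single
    return (width, round(width * aspect, 2))
```

The result can be passed directly as `figsize=` to `plt.subplots`; a wide, shallow figure just needs a smaller `aspect`.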
Common Issues
| Issue | Solution |
|---|---|
| Fonts look wrong in LaTeX | Export PDF, set |
| Figure too large for column | Check venue width limits, use |
| Colors indistinguishable in print | Use colorblind-safe palette + different line styles/markers |
| Gemini misspells labels | Spell out every label exactly in prompt, add "SPELL EXACTLY" constraint |
| Gemini ignores style | Add more negative constraints, be more specific about hex colors |
| Blurry figures in PDF | Export as PDF (vector), not PNG; or use 300+ DPI for PNG |
| Legend overlaps data | Use |
| Too many tick labels | Use |
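Several Solution cells in the table lost their inline code during extraction. The sketch below shows matplotlib settings that are commonly used for those four problems (fonts, size, legend, ticks). It is an assumption about what would address each issue, not a reconstruction of the original cells, and the plotted data is synthetic.

```python
import matplotlib

matplotlib.use("Agg")  # headless backend; an assumption for running without a display
import matplotlib.pyplot as plt
from matplotlib.ticker import MaxNLocator

# Fonts: embed TrueType (Type 42) fonts in PDF/PS output instead of Type 3.
plt.rcParams["pdf.fonttype"] = 42
plt.rcParams["ps.fonttype"] = 42

# Size: figsize is given in inches, so it can match a column width directly.
fig, ax = plt.subplots(figsize=(3.25, 2.5))

steps = list(range(0, 1000, 50))
values = [s / 1000 for s in steps]  # synthetic curve, for illustration only
ax.plot(steps, values, label="Synthetic curve",
        linestyle="--", marker="o", markevery=4)

# Ticks: cap the number of major ticks on the x axis.
ax.xaxis.set_major_locator(MaxNLocator(nbins=5))

# Legend: anchor it outside the axes so it cannot cover the data.
ax.legend(loc="upper left", bbox_to_anchor=(1.02, 1.0), frameon=False)
```

Combining a dashed line style with markers, as done here, also helps with the print-legibility row of the table, since the series stays identifiable without color.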
When to Use vs Alternatives
| Need | This Skill | Alternative |
|---|---|---|
| Architecture diagrams | Gemini generation | TikZ (manual), draw.io (interactive), Mermaid (simple) |
| Data charts | matplotlib/seaborn | Plotly (interactive), R/ggplot2 (statistics-heavy) |
| Full paper writing | Use with | — |
| Poster figures | Larger fonts, wider | |
| Presentation figures | Larger text, fewer details | PowerPoint/Keynote export |
Quick Reference: File Naming Convention
figures/
├── gen_fig_<name>.py # Generation script (always save for reproducibility)
├── fig_<name>.pdf # Final vector output (for LaTeX)
├── fig_<name>.png # Raster output (300 DPI, for AI-generated or fallback)
└── fig_<name>_attempt*.png # Gemini attempts (keep for comparison)
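The naming convention can be checked mechanically before submission. The sketch below scans a `figures/` directory and reports figures that lack either a generation script or a final output; `audit_figures` is a hypothetical helper, and it assumes no figure name itself contains `_attempt`.

```python
from pathlib import Path


def audit_figures(figures_dir="figures"):
    """Return {name: [missing items]} for figures that break the convention."""
    root = Path(figures_dir)
    names = set()
    # Names known from generation scripts: gen_fig_<name>.py
    for script in root.glob("gen_fig_*.py"):
        names.add(script.stem[len("gen_fig_"):])
    # Names known from final outputs: fig_<name>.pdf / fig_<name>.png
    for output in list(root.glob("fig_*.pdf")) + list(root.glob("fig_*.png")):
        stem = output.stem[len("fig_"):]
        if "_attempt" in stem:
            continue  # Gemini attempts are not final outputs
        names.add(stem)

    problems = {}
    for name in sorted(names):
        missing = []
        if not (root / f"gen_fig_{name}.py").is_file():
            missing.append("script")
        has_pdf = (root / f"fig_{name}.pdf").is_file()
        has_png = (root / f"fig_{name}.png").is_file()
        if not (has_pdf or has_png):
            missing.append("output")
        if missing:
            problems[name] = missing
    return problems
```

An empty dictionary means every figure has both its script and at least one final output, which is the reproducibility property the convention is meant to guarantee.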