academic-plotting

Academic Plotting for ML Papers

Generate publication-quality figures for ML/AI conference papers. Two distinct workflows:
  1. Diagram figures (architecture, system design, workflows, pipelines) — AI image generation via Gemini
  2. Data figures (line charts, bar charts, scatter plots, heatmaps, ablations) — matplotlib/seaborn

When to Use Which Workflow

| Figure Type | Tool | Why |
| --- | --- | --- |
| Architecture / system diagram | Gemini (Workflow 1) | Complex spatial layouts with boxes, arrows, labels |
| Workflow / pipeline / lifecycle | Gemini (Workflow 1) | Multi-step processes with connections |
| Bar chart, line plot, scatter | matplotlib (Workflow 2) | Precise numerical data, reproducible |
| Heatmap, confusion matrix | matplotlib/seaborn (Workflow 2) | Structured grid data |
| Ablation table as chart | matplotlib (Workflow 2) | Grouped bars or line comparisons |
| Pie / donut chart | matplotlib (Workflow 2) | Proportional data (use sparingly in ML papers) |
| Training curves | matplotlib (Workflow 2) | Loss/accuracy over steps/epochs |
Rule of thumb: If the figure has numerical axes, use matplotlib. If the figure has boxes and arrows, use Gemini.

Step 0: Context Analysis & Extraction

The user will typically provide one of these inputs — not a ready-made specification:
| Input Type | Example | What to Extract |
| --- | --- | --- |
| Full paper / section draft | "Here's our method section..." | System components, their relationships, data flow |
| Description paragraph | "Our system has three layers that..." | Key entities, hierarchy, connections |
| Raw results / data table | "MMLU: 85.2, HumanEval: 72.1..." | Metrics, methods, comparison structure |
| CSV / JSON data | Experiment log files | Variables, trends, grouping dimensions |
| Vague request | "Make a figure for the overview" | Read surrounding paper context to infer content |

Extraction Workflow

For diagrams (research context → architecture figure):
  1. Read the provided context — paper section, abstract, or description paragraph
  2. Identify visual entities — What are the main components/modules/stages?
    • Look for: nouns that represent system parts, named modules, layers, stages
    • Count them: if >8 top-level entities, consider grouping into sections
  3. Identify relationships — How do components connect?
    • Look for: verbs describing data flow ("sends to", "queries", "feeds into")
    • Classify: data flow (solid arrow), control flow (gray), error path (dashed red)
  4. Determine layout pattern:
    • Sequential pipeline → left-to-right flow
    • Layered architecture → horizontal bands stacked vertically
    • Hub-and-spoke → central node with radiating connections
    • Hierarchical → top-down tree
  5. Assign colors — One accent color per logical group/layer
  6. Write every label exactly — Extract exact terminology from the paper text
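The six extraction steps above can be captured in a small data structure before any prompt is written. The following is a minimal sketch, not part of the skill itself: `Entity`, `Connection`, `DiagramSpec`, and `ARROW_STYLES` are hypothetical names chosen for illustration, and the example values come from a Planner/Executor/Verifier description.

```python
from dataclasses import dataclass, field

# Hypothetical containers for illustration; none of these names come from a library.
ARROW_STYLES = {
    "data": "solid arrow",
    "control": "gray arrow",
    "error": "dashed red arrow",
}


@dataclass
class Entity:
    label: str           # exact terminology copied from the paper text
    group: str = "main"  # logical group/layer, one accent color per group


@dataclass
class Connection:
    source: str
    target: str
    kind: str = "data"   # one of the ARROW_STYLES keys
    label: str = ""


@dataclass
class DiagramSpec:
    layout: str          # e.g. "left-to-right", "layered", "hub-and-spoke", "top-down"
    entities: list = field(default_factory=list)
    connections: list = field(default_factory=list)

    def needs_grouping(self) -> bool:
        # More than 8 top-level entities: consider grouping into sections.
        return len(self.entities) > 8

    def describe_connections(self) -> list:
        # One human-readable line per arrow, usable in a prompt.
        return [
            f"{c.source} -> {c.target}: {ARROW_STYLES[c.kind]}"
            + (f', label "{c.label}"' if c.label else "")
            for c in self.connections
        ]


spec = DiagramSpec(
    layout="cycle",
    entities=[Entity("Planner"), Entity("Executor"), Entity("Verifier")],
    connections=[
        Connection("Planner", "Executor", "data", "plans"),
        Connection("Executor", "Verifier", "data", "results"),
        Connection("Verifier", "Planner", "error", "on failure"),
    ],
)
for line in spec.describe_connections():
    print(line)
```

Writing the spec down first makes it easier to check that every label and arrow appears in the final prompt.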
For data charts (results → figure):
  1. Read the provided data — table, paragraph with numbers, CSV, or JSON
  2. Identify dimensions:
    • What is being compared? (methods, models, configurations) → categorical axis
    • What is the metric? (accuracy, loss, latency, F1) → value axis
    • Is there a time/step dimension? → line plot
    • Are there multiple metrics? → multi-panel or grouped bars
  3. Choose chart type automatically using this priority:
    • Has a step/time axis → line plot
    • Comparing N methods on M benchmarks → grouped bar chart
    • Single ranking → horizontal bar (leaderboard)
    • Correlation between two continuous variables → scatter plot
    • Square matrix of values → heatmap
    • Proportional breakdown → stacked bar (avoid pie charts)
  4. Determine figure sizing — Single column vs full width based on data density
  5. Highlight "our method" — Identify which entry is the paper's contribution and give it a distinct color
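The chart-type priority in step 3 can be read as an ordered chain of checks. Below is a minimal sketch under that reading; the function name `choose_chart_type` and its boolean parameters are assumptions made for illustration, and raising an error when no rule matches is a design choice of this sketch rather than something the list specifies.

```python
def choose_chart_type(has_step_axis=False, n_methods=1, n_benchmarks=1,
                      is_ranking=False, two_continuous=False,
                      is_square_matrix=False, is_proportional=False):
    """Return a chart type by testing the rules in priority order (hypothetical helper)."""
    if has_step_axis:
        return "line plot"
    if n_methods > 1 and n_benchmarks > 1:
        return "grouped bar chart"
    if is_ranking:
        return "horizontal bar"
    if two_continuous:
        return "scatter plot"
    if is_square_matrix:
        return "heatmap"
    if is_proportional:
        return "stacked bar"
    # No rule matched: this sketch refuses to guess.
    raise ValueError("no chart rule matched; inspect the data manually")


# A step axis wins even when several methods and benchmarks are present.
print(choose_chart_type(has_step_axis=True, n_methods=3, n_benchmarks=2))
print(choose_chart_type(n_methods=3, n_benchmarks=2))
```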

Auto-Detection Examples

Context → Diagram: "Our system has a Planner, Executor, and Verifier. Planner sends plans to Executor, Executor returns results to Verifier, Verifier feeds back to Planner on failure." → 3 entities, cycle layout, dashed feedback arrow → Workflow 1 (Gemini)
Data → Chart: "GPT-4: MMLU 86.4, HumanEval 67.0. Ours: 88.1, 71.2. Llama-3: 79.3, 62.1." → 3 methods × 2 benchmarks → Workflow 2 (grouped bar), highlight "Ours" in coral
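As a rough illustration of the Data → Chart example, the quoted numbers can be arranged as a nested dict, from which the "3 methods × 2 benchmarks" structure and the highlight color follow directly. The variable names are hypothetical; the hex codes are the burnt coral and cool gray used elsewhere in this guide.

```python
# Values transcribed from the quoted example sentence.
results = {
    "GPT-4":   {"MMLU": 86.4, "HumanEval": 67.0},
    "Ours":    {"MMLU": 88.1, "HumanEval": 71.2},
    "Llama-3": {"MMLU": 79.3, "HumanEval": 62.1},
}

methods = list(results)
benchmarks = list(next(iter(results.values())))
shape = (len(methods), len(benchmarks))

# N methods on M benchmarks (both > 1) suggests a grouped bar chart.
chart = "grouped bar" if min(shape) > 1 else "horizontal bar"

# Highlight the paper's own method; every other bar gets a receding gray.
colors = {m: ("#E76F51" if m == "Ours" else "#B0BEC5") for m in methods}

print(shape, chart, colors["Ours"])
```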

Workflow 1: Architecture & System Diagrams (AI Image Generation)

Use Gemini 3 Pro Image Preview to generate diagrams. Choose a visual style first — this is the single biggest factor in whether the figure looks professional or generic.

Visual Styles

Pick one style per paper (all figures should be consistent):

Style A: "Sketch / 简笔画" (Hand-Drawn)

Warm, approachable, memorable. Ideal for overview figures and system introductions. Looks like a whiteboard sketch refined by a designer.
VISUAL STYLE — HAND-DRAWN SKETCH:
- Slightly irregular, hand-drawn line quality — lines wobble gently, not perfectly straight
- Rounded, soft shapes with visible pen strokes (like drawn with a thick felt-tip marker)
- Warm off-white background (#FAFAF7), NOT pure white
- Fill colors are soft watercolor-like washes: muted blue (#D6E4F0), soft peach (#F5DEB3),
  light sage (#D4E6D4), pale lavender (#E6DFF0)
- Borders are dark charcoal (#2C2C2C) with 2-3px line weight, slightly uneven
- Arrows are hand-drawn with slight curves, ending in simple open arrowheads (not filled triangles)
- Text uses a rounded sans-serif font (like Comic Neue or Architects Daughter feel)
- Small doodle-style icons inside boxes: a tiny gear ⚙ for processing, a lightbulb 💡 for ideas,
  a magnifying glass 🔍 for search — rendered as simple line drawings, NOT emoji
- Overall feel: a carefully drawn whiteboard diagram, clean but with personality
- NO clip art, NO stock icons, NO photorealistic elements

Style B: "Modern Minimal" (Clean & Bold)

Confident, authoritative. Best for method figures where precision matters.
VISUAL STYLE — MODERN MINIMAL:
- Ultra-clean geometric shapes with crisp edges
- Bold color blocks as backgrounds for sections — NOT just accent bars, but full section fills
  using desaturated tones: slate blue (#E8EDF2), warm sand (#F5F0E8), cool mint (#E8F2EE)
- Component boxes have ROUNDED CORNERS (12px radius), NO visible border — they float on
  the section background using subtle shadow (1px, 4px blur, rgba(0,0,0,0.06))
- ONE accent color per section used sparingly on key elements: Deep blue (#2563EB),
  Emerald (#059669), Amber (#D97706), Rose (#E11D48)
- Arrows are thin (1.5px), dark gray (#6B7280), with small filled circle at source
  and clean arrowhead at target — NOT thick colored arrows
- Typography: Inter or system sans-serif, title 600 weight, body 400 weight
- Labels INSIDE boxes, not beside them
- Generous whitespace — at least 24px between elements
- NO decorative elements, NO icons — let the structure speak

Style C: "Illustrated Technical" (Icon-Rich)

Engaging, explanatory. Good for tutorial-style papers and figures that need to be self-explanatory.
VISUAL STYLE — ILLUSTRATED TECHNICAL:
- Each major component has a small MEANINGFUL ICON drawn in a consistent line-art style
  (single color, 2px stroke, ~24x24px): brain icon for reasoning, database cylinder for storage,
  arrow-loop for iteration, network nodes for communication
- Components sit inside soft rounded rectangles with a LEFT COLOR STRIP (4px wide)
- Background is pure white, but each logical group has a very faint colored region behind it
  (#F8FAFC for blue group, #FFF8F0 for orange group)
- Connections use CURVED bezier paths (not straight lines), colored by SOURCE component
- Key data flows are THICKER (3px) than secondary flows (1px, dashed)
- Small annotation badges on arrows: "×N" for repeated operations, "optional" in italics
- Title labels are ABOVE each section in small caps, letter-spaced
- Overall: like a well-designed API documentation diagram

Style D: "Accent Bar" (Classic Academic)

The default academic style. Safe for any venue, works well in grayscale.
VISUAL STYLE — CLASSIC ACCENT BAR:
- Horizontal section bands stacked vertically, pale gray (#F7F7F5) fill
- Thick colored LEFT ACCENT BAR (8px) distinguishes each section
- Content boxes: white fill, thin #DDD border, 4px rounded corners
- Section palette: Blue #4A90D9, Teal #5BA58B, Amber #D4A252, Slate #7B8794
- Sans-serif typography (Helvetica/Arial), bold titles, regular body
- Colored arrows match their SOURCE section
- Clean, flat, zero decoration

Curated Color Palettes

"Ocean Dusk" (professional, calming; default recommendation): #264653 deep teal, #2A9D8F teal, #E9C46A gold, #F4A261 sandy orange, #E76F51 burnt coral
"Ink & Wash" (for 简笔画 style): #2C2C2C charcoal ink, #D6E4F0 washed blue, #F5DEB3 washed wheat, #D4E6D4 washed sage, #E6DFF0 washed lavender
"Nord" (for modern minimal): #2E3440 polar night, #5E81AC frost blue, #A3BE8C aurora green, #EBCB8B aurora yellow, #BF616A aurora red
"Okabe-Ito" (universal colorblind-safe, required for data charts): #E69F00 orange, #56B4E9 sky blue, #009E73 green, #F0E442 yellow, #0072B2 blue, #D55E00 vermillion, #CC79A7 pink
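For scripting, the palettes can be kept in one dictionary. The sketch below also adds a rough grayscale check, since figures are often printed in black and white; `PALETTES`, `hex_to_rgb`, and `gray_level` are hypothetical names, and the luma weights (0.299, 0.587, 0.114) are the common Rec. 601 approximation, used here as an assumption rather than a rigorous print simulation.

```python
PALETTES = {
    "Ocean Dusk": ["#264653", "#2A9D8F", "#E9C46A", "#F4A261", "#E76F51"],
    "Ink & Wash": ["#2C2C2C", "#D6E4F0", "#F5DEB3", "#D4E6D4", "#E6DFF0"],
    "Nord": ["#2E3440", "#5E81AC", "#A3BE8C", "#EBCB8B", "#BF616A"],
    "Okabe-Ito": ["#E69F00", "#56B4E9", "#009E73", "#F0E442",
                  "#0072B2", "#D55E00", "#CC79A7"],
}


def hex_to_rgb(code):
    """Convert '#RRGGBB' to an (r, g, b) tuple of ints in 0..255."""
    code = code.lstrip("#")
    return tuple(int(code[i:i + 2], 16) for i in (0, 2, 4))


def gray_level(code):
    """Approximate perceived brightness (0..255) using Rec. 601 luma weights."""
    r, g, b = hex_to_rgb(code)
    return 0.299 * r + 0.587 * g + 0.114 * b


# Colors whose gray levels sit close together may be hard to tell apart in print.
for name, colors in PALETTES.items():
    levels = sorted(round(gray_level(c)) for c in colors)
    print(name, levels)
```

If two entries of a palette land on nearly the same gray level, varying line styles or markers is a reasonable fallback.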

Checklist

  • Extract from context: Read paper/description, identify entities and relationships
  • Choose visual style (A/B/C/D) — match the paper's tone and venue
  • Choose color palette — or use one consistent with existing paper figures
  • Obtain Gemini API key (GEMINI_API_KEY env var)
  • Write a detailed prompt: style block + layout + connections + constraints
  • Generate script at figures/gen_fig_<name>.py, run for 3 attempts
  • Review, select best, save as figures/fig_<name>.png

Prompt Structure (6 Sections)

Every Gemini prompt must include these sections in order:
1. FRAMING (5 lines): "Create a [STYLE_NAME]-style technical diagram for a
   [VENUE] paper. The diagram should feel [ADJECTIVES]..."

2. VISUAL STYLE (20-30 lines): Copy the full style block from above (A/B/C/D).
   This is the most important section — it determines the entire visual character.

3. COLOR PALETTE (10 lines): Exact hex codes for every color used.

4. LAYOUT (50-150 lines): Every component, box, section — exact text, spatial
   arrangement, and grouping. Be exhaustively specific.

5. CONNECTIONS (30-80 lines): Every arrow individually — source, target, style,
   label, routing direction.

6. CONSTRAINTS (10 lines): What NOT to include. Adapt per style — e.g., sketch
   style allows slight irregularity but still no clip art.
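One way to enforce the six-section order is to assemble the prompt from a dictionary and fail early when a section is missing. This is a minimal sketch; `SECTION_ORDER` and `build_prompt` are hypothetical names, the `"NAME:"` header format is an assumption, and the section bodies below are placeholders rather than a usable prompt.

```python
SECTION_ORDER = [
    "FRAMING",
    "VISUAL STYLE",
    "COLOR PALETTE",
    "LAYOUT",
    "CONNECTIONS",
    "CONSTRAINTS",
]


def build_prompt(sections):
    """Join the six sections in the required order; raise if any is empty or absent."""
    missing = [name for name in SECTION_ORDER
               if not sections.get(name, "").strip()]
    if missing:
        raise ValueError(f"Missing prompt sections: {', '.join(missing)}")
    blocks = [f"{name}:\n{sections[name].strip()}" for name in SECTION_ORDER]
    return "\n\n".join(blocks)


# Placeholder content only; a real prompt would be far longer per section.
example_sections = {
    "FRAMING": "Create a Modern Minimal-style technical diagram for a [VENUE] paper.",
    "VISUAL STYLE": "[paste the full style block here]",
    "COLOR PALETTE": "#2E3440 polar night, #5E81AC frost blue",
    "LAYOUT": "Three boxes left to right: Planner, Executor, Verifier.",
    "CONNECTIONS": "Planner -> Executor, solid arrow, label 'plans'.",
    "CONSTRAINTS": "No clip art. Spell every label exactly as written.",
}
PROMPT = build_prompt(example_sections)
print(PROMPT[:80])
```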

Generation Script Template

python
#!/usr/bin/env python3
"""Generate [FIGURE_NAME] diagram using Gemini image generation."""
import os, sys, time
from google import genai

API_KEY = os.environ.get("GEMINI_API_KEY")
if not API_KEY:
    print("ERROR: Set GEMINI_API_KEY environment variable.")
    print("  Get a key at: https://aistudio.google.com/apikey")
    sys.exit(1)

MODEL = "gemini-3-pro-image-preview"
OUTPUT_DIR = os.path.dirname(os.path.abspath(__file__))
client = genai.Client(api_key=API_KEY)

PROMPT = """
[PASTE YOUR 6-SECTION PROMPT HERE]
"""

def generate_image(prompt_text, attempt_num):
    print(f"\n{'='*60}\nAttempt {attempt_num}\n{'='*60}")
    try:
        response = client.models.generate_content(
            model=MODEL,
            contents=prompt_text,
            config=genai.types.GenerateContentConfig(
                response_modalities=["IMAGE", "TEXT"],
            ),
        )
        output_path = os.path.join(OUTPUT_DIR, f"fig_NAME_attempt{attempt_num}.png")
        for part in response.candidates[0].content.parts:
            if part.inline_data:
                with open(output_path, "wb") as f:
                    f.write(part.inline_data.data)
                print(f"Saved: {output_path} ({os.path.getsize(output_path):,} bytes)")
                return output_path
            elif part.text:
                print(f"Text: {part.text[:300]}")
        print("WARNING: No image in response")
        return None
    except Exception as e:
        print(f"ERROR: {e}")
        return None

def main():
    results = []
    for i in range(1, 4):
        if i > 1:
            time.sleep(2)
        path = generate_image(PROMPT, i)
        if path:
            results.append(path)
    if not results:
        print("All attempts failed!")
        sys.exit(1)
    print(f"\nGenerated {len(results)} attempts. Review and pick the best.")

if __name__ == "__main__":
    main()

Key Rules

  • Always 3 attempts — quality varies significantly between runs
  • Style block is mandatory — without it, Gemini defaults to generic corporate look
  • Never hardcode API keys: use os.environ.get("GEMINI_API_KEY")
  • Save generation scripts — reproducibility is critical
  • Specify every label exactly — Gemini may misspell or rearrange text
Full prompt examples per style: See references/diagram-generation.md

Workflow 2: Data-Driven Charts (matplotlib/seaborn)

For any figure with numerical data, axes, or quantitative comparisons.

Checklist

  • Extract from context: Parse results/data, identify methods, metrics, and comparison structure
  • Auto-select chart type based on data dimensions (see decision guide below)
  • Prepare data (CSV, dict, or inline arrays)
  • Apply publication styling (fonts, colors, sizes)
  • Highlight "our method" with a distinct color
  • Export as both PDF (vector) and PNG (300 DPI)
  • Verify LaTeX font compatibility
  • Save script at figures/gen_fig_<name>.py
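The "export as both PDF and PNG" item can be wrapped in a small helper so every script saves figures the same way. The sketch below is one possible shape, assuming a headless `Agg` backend; `save_figure` is a hypothetical helper, the plotted numbers are placeholders, and the demo writes to a temporary directory instead of `figures/`.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # headless backend; an assumption so the sketch runs without a display
import matplotlib.pyplot as plt


def save_figure(fig, name, out_dir="figures", dpi=300):
    """Hypothetical helper: write fig_<name>.pdf and fig_<name>.png into out_dir."""
    os.makedirs(out_dir, exist_ok=True)
    pdf_path = os.path.join(out_dir, f"fig_{name}.pdf")
    png_path = os.path.join(out_dir, f"fig_{name}.png")
    fig.savefig(pdf_path, bbox_inches="tight")           # vector output for LaTeX
    fig.savefig(png_path, dpi=dpi, bbox_inches="tight")  # raster fallback
    return pdf_path, png_path


fig, ax = plt.subplots(figsize=(3.25, 2.5))
ax.plot([0, 1, 2], [0.2, 0.5, 0.9])  # placeholder data, not real results
ax.set_xlabel("Step")
ax.set_ylabel("Score")

demo_dir = tempfile.mkdtemp()  # throwaway directory for the demo
pdf_path, png_path = save_figure(fig, "demo", out_dir=demo_dir)
plt.close(fig)
print(pdf_path, png_path)
```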

Chart Type Decision Guide

| Data Pattern | Best Chart | Notes |
| --- | --- | --- |
| Trend over time/steps | Line plot | Training curves, scaling laws |
| Comparing categories | Grouped bar chart | Model comparisons, ablations |
| Distribution | Violin / box plot | Score distributions across methods |
| Correlation | Scatter plot | Embedding analysis, metric correlation |
| Grid of values | Heatmap | Attention maps, confusion matrices |
| Part of whole | Stacked bar (not pie) | Prefer stacked bar over pie in ML papers |
| Many methods, one metric | Horizontal bar | Leaderboard-style comparisons |

Publication Styling Template

python
import matplotlib.pyplot as plt
import numpy as np

# --- Publication defaults (polished, not generic) ---
plt.rcParams.update({
    "font.family": "serif",
    "font.serif": ["Times New Roman", "DejaVu Serif"],
    "font.size": 10,
    "axes.titlesize": 11,
    "axes.titleweight": "bold",
    "axes.labelsize": 10,
    "legend.fontsize": 8.5,
    "legend.frameon": False,
    "figure.dpi": 300,
    "savefig.dpi": 300,
    "savefig.bbox": "tight",
    "axes.spines.top": False,
    "axes.spines.right": False,
    "axes.grid": True,
    "grid.alpha": 0.15,
    "grid.linestyle": "-",
    "lines.linewidth": 1.8,
    "lines.markersize": 5,
})

# --- "Ocean Dusk" palette (professional, distinctive, colorblind-safe) ---
COLORS = ["#264653", "#2A9D8F", "#E9C46A", "#F4A261",
          "#E76F51", "#0072B2", "#56B4E9", "#8C8C8C"]
OUR_COLOR = "#E76F51"       # coral: warm, stands out
BASELINE_COLOR = "#B0BEC5"  # cool gray: recedes
FIG_SINGLE, FIG_FULL = (3.25, 2.5), (6.75, 2.8)  # (width, height) in inches

Common Chart Patterns

Line plot (training curves), with markers and shaded mean ± std bands:
python
fig, ax = plt.subplots(figsize=FIG_SINGLE)
markers = ["o", "s", "^", "D", "v"]
for i, (method, (mean, std)) in enumerate(results.items()):
    color = OUR_COLOR if method == "Ours" else COLORS[i % len(COLORS)]  # wrap around if there are more methods than colors
    ax.plot(steps, mean, label=method, color=color,
            marker=markers[i % 5], markevery=max(1, len(steps)//8),
            markersize=4, zorder=3)
    ax.fill_between(steps, mean - std, mean + std, color=color, alpha=0.12)
ax.set_xlabel("Training Steps")
ax.set_ylabel("Accuracy (%)")
ax.legend(loc="lower right")
fig.savefig("figures/fig_training.pdf")
fig.savefig("figures/fig_training.png", dpi=300)
Grouped bar chart (ablation) — with value labels:
python
fig, ax = plt.subplots(figsize=FIG_FULL)
x = np.arange(len(categories))
n = len(methods)
width = 0.7 / n
for i, (method, scores) in enumerate(methods.items()):
    color = OUR_COLOR if method == "Ours" else COLORS[i % len(COLORS)]  # wrap around if there are more methods than colors
    offset = (i - n / 2 + 0.5) * width
    bars = ax.bar(x + offset, scores, width * 0.9, label=method, color=color,
                  edgecolor="white", linewidth=0.5)
    for bar, s in zip(bars, scores):
        ax.text(bar.get_x() + bar.get_width()/2, bar.get_height() + 0.3,
                f"{s:.1f}", ha="center", va="bottom", fontsize=7, color="#444")
ax.set_xticks(x)
ax.set_xticklabels(categories)
ax.set_ylabel("Score")
ax.legend(ncol=min(n, 4))
fig.savefig("figures/fig_ablation.pdf")
Heatmap, with a sequential colormap (YlOrRd) and clean borders:
python
import seaborn as sns
fig, ax = plt.subplots(figsize=(4, 3.5))
sns.heatmap(matrix, annot=True, fmt=".2f", cmap="YlOrRd", ax=ax,
            cbar_kws={"shrink": 0.75, "aspect": 20},
            linewidths=1.5, linecolor="white",
            annot_kws={"size": 8, "weight": "medium"})
ax.set_xlabel("Predicted")
ax.set_ylabel("Actual")
fig.savefig("figures/fig_confusion.pdf")
Horizontal bar (leaderboard) — with "our method" highlight:
python
fig, ax = plt.subplots(figsize=FIG_SINGLE)
y_pos = np.arange(len(models))
colors = [BASELINE_COLOR] * len(models)
colors[our_idx] = OUR_COLOR
bars = ax.barh(y_pos, scores, color=colors, height=0.55,
               edgecolor="white", linewidth=0.5)
ax.set_yticks(y_pos)
ax.set_yticklabels(models)
ax.set_xlabel("Accuracy (%)")
ax.invert_yaxis()
for bar, s in zip(bars, scores):
    ax.text(bar.get_width() + 0.3, bar.get_y() + bar.get_height()/2,
            f"{s:.1f}", va="center", fontsize=8, color="#444")
fig.savefig("figures/fig_leaderboard.pdf")
Full pattern library (scaling laws, violin plots, multi-panel, radar): See references/data-visualization.md
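The snippets above assume that variables such as `results` and `steps` already exist. For a quick end-to-end check, the line-plot pattern can be run on made-up data as sketched below. The curves are SYNTHETIC and only illustrate the plumbing; `synthetic_curve` is a hypothetical helper, and the `Agg` backend is assumed so the script runs without a display.

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # headless backend (assumption)
import matplotlib.pyplot as plt

COLORS = ["#264653", "#2A9D8F", "#E9C46A", "#F4A261"]
OUR_COLOR = "#E76F51"
FIG_SINGLE = (3.25, 2.5)

steps = np.arange(0, 1001, 50)


def synthetic_curve(final_acc, noise=1.5):
    """SYNTHETIC saturating curve, for illustration only (not real results)."""
    mean = final_acc * (1.0 - np.exp(-steps / 300.0))
    std = np.full(steps.shape, noise)
    return mean, std


results = {
    "Baseline": synthetic_curve(70.0),
    "Ours": synthetic_curve(80.0),
}

fig, ax = plt.subplots(figsize=FIG_SINGLE)
markers = ["o", "s", "^", "D", "v"]
for i, (method, (mean, std)) in enumerate(results.items()):
    color = OUR_COLOR if method == "Ours" else COLORS[i % len(COLORS)]
    ax.plot(steps, mean, label=method, color=color,
            marker=markers[i % len(markers)],
            markevery=max(1, len(steps) // 8), markersize=4, zorder=3)
    # Shaded band: mean plus/minus one standard deviation.
    ax.fill_between(steps, mean - std, mean + std, color=color, alpha=0.12)
ax.set_xlabel("Training Steps")
ax.set_ylabel("Accuracy (%)")
ax.legend(loc="lower right")
```

Swapping `results` for real per-step means and standard deviations should leave the rest of the pattern unchanged.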

Publication Style Quick Reference

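| Venue | Single Col | Full Width | Font |
| --- | --- | --- | --- |
| NeurIPS | 5.5 in | 5.5 in | Times |
| ICML | 3.25 in | 6.75 in | Times |
| ICLR | 5.5 in | 5.5 in | Times |
| ACL | 3.3 in | 6.8 in | Times |
| AAAI | 3.3 in | 7.0 in | Times |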
Always export PDF for vector quality. PNG only for AI-generated diagrams.
Venue-specific details, LaTeX integration, font matching, accessibility checklist: See references/style-guide.md
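The width table translates directly into a lookup for `figsize`. The following is a minimal sketch: `VENUE_WIDTHS_IN` and `figsize_for` are hypothetical names, and the default height-to-width ratio of 0.75 is an assumption of this example, not a venue requirement.

```python
# (single-column width, full width) in inches, copied from the table above.
VENUE_WIDTHS_IN = {
    "NeurIPS": (5.5, 5.5),
    "ICML": (3.25, 6.75),
    "ICLR": (5.5, 5.5),
    "ACL": (3.3, 6.8),
    "AAAI": (3.3, 7.0),
}


def figsize_for(venue, full_width=False, aspect=0.75):
    """Return (width, height) in inches; aspect is height divided by width."""
    single, full = VENUE_WIDTHS_IN[venue]
    width = full if full_width else single
    return (width, round(width * aspect, 2))


print(figsize_for("ICML"))
print(figsize_for("AAAI", full_width=True, aspect=0.5))
```

The returned tuple can be passed straight to `plt.subplots(figsize=...)`.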

Common Issues

| Issue | Solution |
| --- | --- |
| Fonts look wrong in LaTeX | Export PDF, set text.usetex=True, or use font.family=serif |
| Figure too large for column | Check venue width limits, use figsize in inches |
| Colors indistinguishable in print | Use colorblind-safe palette + different line styles/markers |
| Gemini misspells labels | Spell out every label exactly in prompt, add "SPELL EXACTLY" constraint |
| Gemini ignores style | Add more negative constraints, be more specific about hex colors |
| Blurry figures in PDF | Export as PDF (vector), not PNG; or use 300+ DPI for PNG |
| Legend overlaps data | Use bbox_to_anchor, loc="upper left", or external legend |
| Too many tick labels | Use ax.xaxis.set_major_locator(MaxNLocator(5)) |
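Three of these fixes (distinct line styles and markers, an external legend, and fewer tick labels) can be combined in one short script. This is a sketch on placeholder data, assuming the `Agg` backend; the anchor position `(1.02, 1.0)` is one reasonable choice for placing the legend just outside the axes, not a required value.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend (assumption)
import matplotlib.pyplot as plt
from matplotlib.ticker import MaxNLocator

fig, ax = plt.subplots(figsize=(3.25, 2.5))
x = list(range(0, 101, 5))

# Placeholder series: vary both line style and marker so the lines
# stay distinguishable in grayscale print.
ax.plot(x, [v * 0.8 for v in x], label="Method A",
        linestyle="-", marker="o", markevery=4)
ax.plot(x, [v * 0.6 for v in x], label="Method B",
        linestyle="--", marker="s", markevery=4)

# Limit the x axis to at most 5 tick intervals.
ax.xaxis.set_major_locator(MaxNLocator(5))

# Anchor the legend's upper-left corner just outside the right edge of the axes.
legend = ax.legend(loc="upper left", bbox_to_anchor=(1.02, 1.0), borderaxespad=0)
fig.canvas.draw()
```

When saving, `bbox_inches="tight"` (or the `savefig.bbox` rcParam) generally keeps an outside legend from being cropped.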

When to Use vs Alternatives

| Need | This Skill | Alternative |
| --- | --- | --- |
| Architecture diagrams | Gemini generation | TikZ (manual), draw.io (interactive), Mermaid (simple) |
| Data charts | matplotlib/seaborn | Plotly (interactive), R/ggplot2 (statistics-heavy) |
| Full paper writing | Use with ml-paper-writing | |
| Poster figures | Larger fonts, wider | latex-posters skill |
| Presentation figures | Larger text, fewer details | PowerPoint/Keynote export |

Quick Reference: File Naming Convention

figures/
├── gen_fig_<name>.py      # Generation script (always save for reproducibility)
├── fig_<name>.pdf         # Final vector output (for LaTeX)
├── fig_<name>.png         # Raster output (300 DPI, for AI-generated or fallback)
└── fig_<name>_attempt*.png # Gemini attempts (keep for comparison)
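To keep scripts consistent with this layout, the paths can be derived from the figure name in one place. The helper below is a hypothetical sketch using `pathlib`; the default of three attempt files mirrors the three-attempt rule for Gemini diagrams.

```python
from pathlib import Path


def figure_paths(name, root="figures", attempts=3):
    """Hypothetical helper: build the file names used by the naming convention."""
    root = Path(root)
    return {
        "script": root / f"gen_fig_{name}.py",
        "pdf": root / f"fig_{name}.pdf",
        "png": root / f"fig_{name}.png",
        "attempts": [root / f"fig_{name}_attempt{i}.png"
                     for i in range(1, attempts + 1)],
    }


paths = figure_paths("overview")
print(paths["script"])
print(paths["attempts"][0])
```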