lovstudio-document-illustrator

Document Illustrator Skill

An AI-powered tool that analyzes a document and generates illustrations for it. It plans all insertion points globally, generates the images in parallel, and inserts each one asynchronously as soon as it is ready.

Core Workflow (5 Steps)

Backup → Plan insertion points globally → Generate images in parallel → Insert asynchronously into the original document → Verify and clean up the backup

Step 0: Back Up the Original File

Create a backup before making any changes so the document can be rolled back safely:
python
import shutil

# doc_path: path of the document to be illustrated
backup_path = f"{doc_path}.illustrator-backup"
shutil.copy2(doc_path, backup_path)  # copy2 also preserves file metadata
All subsequent operations are performed directly on the original file.

Step 1: Plan All Insertion Positions Globally

Read the complete document and plan every image insertion position in a single pass:
  1. Use the Read tool to read the complete document
  2. Have the AI analyze the content structure and identify the core topics
  3. Determine a precise insertion anchor (line number + contextual text) for each topic
  4. Output an insertion plan:
Insertion Plan:
  [1] After line 15 | Anchor: "## Birth of Rules" | Topic: Evolution of Rules
  [2] After line 42 | Anchor: "## Commands Packaging" | Topic: Workflow packaging
  [3] After line 78 | Anchor: "## MCP Dynamic Capabilities" | Topic: Third-party integration
  ...
  [cover] Before line 1 | Cover image | Topic: Full document summary
Key: locate insertions by the anchor's contextual text rather than by line number alone. Earlier insertions shift the line numbers, but the anchor text still identifies the correct position.
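As a rough illustration of why the anchor text matters, the plan entries above could be held in a small data structure and re-located after the document has changed. This is only a sketch: `PlanItem` and `resolve_line` are hypothetical names, not part of the skill, and matching the anchor against a whole stripped line is an assumption.

```python
from dataclasses import dataclass


@dataclass
class PlanItem:
    label: str      # "1", "2", ... or "cover" (hypothetical field names)
    line_hint: int  # 1-based line number recorded at planning time
    anchor: str     # contextual text used to re-locate the position
    topic: str


def resolve_line(lines, item):
    """Return the current 1-based line number of the anchor, or None if absent."""
    matches = [i + 1 for i, line in enumerate(lines) if line.strip() == item.anchor]
    if not matches:
        return None
    # If the anchor text occurs more than once, prefer the match nearest the hint.
    return min(matches, key=lambda n: abs(n - item.line_hint))


doc = ["# Title", "intro", "## Birth of Rules", "text"]
item = PlanItem("1", 3, "## Birth of Rules", "Evolution of Rules")

# Simulate an earlier insertion (a cover image) that pushed everything down two lines.
shifted = ["![cover](images/cover.png)", ""] + doc
```

The stored line number goes stale after the cover image is inserted, while the anchor lookup still finds the heading.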

Step 2: Generate All Images in Parallel

Use the Agent tool to launch all image generation subtasks in parallel. For each item in the insertion plan, start one Agent, all at the same time:
  Agent 1: generate_single_image.py --title "..." --content "..." --output images/illustration-01.png
  Agent 2: generate_single_image.py --title "..." --content "..." --output images/illustration-02.png
  Agent 3: generate_single_image.py --title "..." --content "..." --output images/illustration-03.png
  ...
  • All Agents run concurrently and do not wait for each other
  • Each Agent returns an image path or an error message when it finishes
  • Expected total time ≈ the time for a single image (10-20s), not N × the single-image time
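The skill itself fans out through the Agent tool, which is not a Python API. Purely to illustrate the same pattern, the sketch below uses a thread pool instead. `build_command`, `run_all`, and `fake_run` are hypothetical helpers. Only the script name and the `--title`, `--content`, and `--output` flags come from the text above, and `fake_run` stands in for actually launching the script.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def build_command(index, title, content):
    """Build the argument list for one image, numbered like illustration-01.png."""
    output = f"images/illustration-{index:02d}.png"
    return ["python", "scripts/generate_single_image.py",
            "--title", title, "--content", content, "--output", output]


def run_all(jobs, run_one):
    """Run every job concurrently.

    jobs: list of (index, title, content) tuples.
    run_one(cmd): returns an image path, or raises on failure.
    Returns {index: ("ok", path)} or {index: ("error", message)} per job.
    """
    results = {}
    with ThreadPoolExecutor(max_workers=max(1, len(jobs))) as pool:
        futures = {pool.submit(run_one, build_command(*job)): job[0] for job in jobs}
        for future in as_completed(futures):  # handle each job as soon as it finishes
            index = futures[future]
            try:
                results[index] = ("ok", future.result())
            except Exception as exc:
                results[index] = ("error", str(exc))
    return results


def fake_run(cmd):
    # Stand-in for a real subprocess call: just echo back the --output path.
    return cmd[cmd.index("--output") + 1]


results = run_all([(1, "A", "a"), (2, "B", "b")], fake_run)
```

Because each job is handled as it completes, one failed image is recorded as an error without blocking the others.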

Step 3: Insert Asynchronously into the Original Document

Insert each image as soon as its Agent finishes, without waiting for the other Agents:
  1. The Agent finishes and returns the image path
  2. Locate the insertion position in the original document by anchor text (not by line number)
  3. Use the Edit tool to insert a Markdown image reference after the anchor:
    markdown
    ![Topic description](images/illustration-01.png)
  4. Because insertion matches on anchor text, earlier insertions do not affect later positioning
Position Shift Handling:
  • Each insertion increases the number of lines in the document
  • Use anchor text (e.g., ## Birth of Rules) instead of line numbers for positioning
  • Inserting from the end of the document toward the beginning also avoids offset problems
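The skill performs this step with the Edit tool, but the anchor-based insertion described above can be sketched in plain Python as follows. `insert_after_anchor` is a hypothetical helper, and placing a blank line before the image reference is an assumption about the desired Markdown layout.

```python
def insert_after_anchor(text, anchor, image_ref):
    """Insert a blank line and image_ref directly after the first line equal to anchor."""
    lines = text.split("\n")
    for i, line in enumerate(lines):
        if line.strip() == anchor:
            lines[i + 1:i + 1] = ["", image_ref]
            return "\n".join(lines)
    raise ValueError(f"anchor not found: {anchor!r}")


doc = "# Title\n\n## Birth of Rules\nBody A\n\n## Commands Packaging\nBody B"

# Images may finish in any order; here the second one arrives first.
doc = insert_after_anchor(
    doc, "## Commands Packaging", "![Workflow packaging](images/illustration-02.png)")
doc = insert_after_anchor(
    doc, "## Birth of Rules", "![Evolution of Rules](images/illustration-01.png)")
```

Each call searches for the anchor text again, so the order in which the images arrive does not matter.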

Step 4: Verification and Cleanup

After all images have been inserted:
  1. Verify that every planned ![...]() reference has been inserted into the original document
  2. Verify that every image file exists in the images/ directory
  3. On success → delete the backup file {doc_path}.illustrator-backup
  4. On failure → keep the backup file and report which images failed to generate or insert; the user can restore from the backup
Example report on success:
Completed: 6/6 illustrations inserted into the original document
Backup file cleaned up
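One possible shape for these checks is sketched below. `verify_and_cleanup` is a hypothetical helper, and it assumes that image paths are relative to the document's own directory. The demo runs in a temporary directory so that no real files are touched.

```python
import os
import tempfile


def verify_and_cleanup(doc_path, image_paths):
    """Return a list of problems; delete the backup only when the list is empty."""
    with open(doc_path, encoding="utf-8") as f:
        text = f.read()
    base = os.path.dirname(doc_path)
    problems = []
    for rel in image_paths:
        if f"]({rel})" not in text:  # is the Markdown reference present?
            problems.append(f"not referenced: {rel}")
        if not os.path.isfile(os.path.join(base, rel)):  # is the image on disk?
            problems.append(f"file missing: {rel}")
    backup = f"{doc_path}.illustrator-backup"
    if not problems and os.path.exists(backup):
        os.remove(backup)  # success: the backup is no longer needed
    return problems


with tempfile.TemporaryDirectory() as tmp:
    doc = os.path.join(tmp, "doc.md")
    os.makedirs(os.path.join(tmp, "images"))
    open(os.path.join(tmp, "images", "illustration-01.png"), "wb").close()
    with open(doc, "w", encoding="utf-8") as f:
        f.write("## Birth of Rules\n\n![Evolution of Rules](images/illustration-01.png)\n")
    open(doc + ".illustrator-backup", "w").close()

    ok = verify_and_cleanup(doc, ["images/illustration-01.png"])
    backup_removed = not os.path.exists(doc + ".illustrator-backup")
    failed = verify_and_cleanup(doc, ["images/illustration-02.png"])
```

If the returned list is non-empty, the backup is left in place and the list itself serves as the failure report.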

Configuration Options

Before execution, Claude asks for the following options (or infers them from the user's message):
| Option | Values | Default |
| --- | --- | --- |
| Image aspect ratio | 16:9 / 3:4 | 16:9 |
| Include cover image | Yes / No | No |
| Number of content illustrations | 3-10 | Recommended based on document length |
| Style | gradient-glass / ticket / vector-illustration | gradient-glass |
If the user has already specified options in the request (e.g., "portrait, ticket style, 8 images"), use them directly without asking.
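For illustration only, the options in the table could be modeled as a small validated record. `IllustratorOptions` and its field names are hypothetical. Mapping "portrait" to 3:4 is an assumption, and `count=None` stands for "recommend based on document length", since the source does not give a formula for that.

```python
from dataclasses import dataclass
from typing import Optional

RATIOS = ("16:9", "3:4")
STYLES = ("gradient-glass", "ticket", "vector-illustration")


@dataclass
class IllustratorOptions:
    aspect_ratio: str = "16:9"
    cover: bool = False
    count: Optional[int] = None  # None: recommend from document length
    style: str = "gradient-glass"

    def __post_init__(self):
        # Reject values outside the ranges listed in the options table.
        if self.aspect_ratio not in RATIOS:
            raise ValueError(f"unsupported aspect ratio: {self.aspect_ratio}")
        if self.style not in STYLES:
            raise ValueError(f"unsupported style: {self.style}")
        if self.count is not None and not 3 <= self.count <= 10:
            raise ValueError("count must be between 3 and 10")


defaults = IllustratorOptions()
# "portrait, ticket style, 8 images" (portrait -> 3:4 is an assumption)
requested = IllustratorOptions(aspect_ratio="3:4", style="ticket", count=8)
```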

Style Quick Reference

| Style | Keywords | Suitable For |
| --- | --- | --- |
| gradient-glass | Glassmorphism, aurora gradients, tech feel | Technical documents, product introductions |
| ticket | Black-and-white contrast, ticket layout, minimalism | Data reports, infographics |
| vector-illustration | Flat illustration, retro color palette, geometric shapes | Tutorials, stories, branding |
Style files are located in the styles/ directory.

Technical Details

| Item | Value |
| --- | --- |
| API model | Gemini 2.0 Flash Image Preview |
| 16:9 resolution | 2560x1440 (2K) / 3840x2160 (4K) |
| 3:4 resolution | 1920x2560 (2K) / 2880x3840 (4K) |
| Single image time | ~10-20s |
| Parallel processing time | ~10-20s (total, not multiplied by N) |
| Dependencies | pip install google-genai pillow python-dotenv |
| API key | GEMINI_API_KEY in .env or as an environment variable |
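A minimal sketch of how the key might be resolved is shown below, assuming it is read from the process environment after an optional `.env` load. `load_gemini_api_key` is a hypothetical helper and the skill's scripts may do this differently. The `dotenv` import is guarded so that the sketch still runs when python-dotenv is not installed.

```python
import os


def load_gemini_api_key():
    """Return GEMINI_API_KEY from the environment, loading a .env file if possible."""
    try:
        from dotenv import load_dotenv  # provided by the python-dotenv package
        load_dotenv()  # by default does not override variables that are already set
    except ImportError:
        pass  # fall back to the plain process environment
    key = os.environ.get("GEMINI_API_KEY")
    if not key:
        raise RuntimeError("GEMINI_API_KEY is not set in .env or the environment")
    return key
```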

Scripts

  • scripts/generate_single_image.py: single image generation (called by Agents in parallel)
  • scripts/generate_illustrations.py: legacy sequential batch generation (retained for compatibility)