lovstudio-document-illustrator
Document Illustrator Skill
An AI-powered tool that generates illustrations for documents based on intelligent content analysis. It plans all insertion points globally, generates images in parallel, and inserts them asynchronously, so documents get illustrated efficiently.
Core Workflow (5 Steps)
Backup → Global insertion point planning → Parallel image generation → Asynchronous insertion into the original document → Backup cleanup
Step 0: Backup Original File
Create a backup before modifying the document so that changes can be rolled back safely:
```python
import shutil

# doc_path is the path of the document being illustrated
backup_path = f"{doc_path}.illustrator-backup"
shutil.copy2(doc_path, backup_path)  # copy2 also preserves file metadata
```
All subsequent operations are performed directly on the original file.
Step 1: Globally Determine All Insertion Positions
Read the complete document and plan every image insertion position in a single pass:
- Use the Read tool to read the complete document
- The AI analyzes the content structure and identifies the core topics
- Determine a precise insertion anchor (line number + contextual text) for each topic
- Output an insertion plan:
```
Insertion Plan:
[1] After line 15 | Anchor: "## Birth of Rules" | Topic: Evolution of Rules
[2] After line 42 | Anchor: "## Commands Packaging" | Topic: Workflow packaging
[3] After line 78 | Anchor: "## MCP Dynamic Capabilities" | Topic: Third-party integration
...
[cover] Before line 1 | Cover image | Topic: Full document summary
```
Key: Use contextual text (rather than bare line numbers) as the insertion anchor. Even if earlier insertions shift the line numbers, later insertions can still be located through their anchors.
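The plan format above is regular enough to be parsed into structured entries. The following is a minimal sketch, assuming the plan lines look exactly like the English examples in this section; `PlanItem`, `PLAN_RE`, and `parse_plan_line` are hypothetical names introduced for illustration and are not part of the skill's scripts.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass
class PlanItem:
    """One entry of the insertion plan (hypothetical structure)."""
    label: str             # "1", "2", ... or "cover"
    position: str          # "after" or "before"
    line: int              # line number at planning time; only a hint
    anchor: Optional[str]  # contextual text; None for the cover image
    topic: str


# Assumed line shape: [label] After|Before line N | Anchor: "..." | Topic: ...
PLAN_RE = re.compile(
    r'^\[(?P<label>[^\]]+)\]\s+(?P<pos>After|Before) line (?P<line>\d+)\s*\|\s*'
    r'(?:Anchor: "(?P<anchor>[^"]*)"|Cover image)\s*\|\s*Topic: (?P<topic>.+)$'
)


def parse_plan_line(text: str) -> PlanItem:
    """Parse one plan line; raise ValueError if it does not match the assumed shape."""
    m = PLAN_RE.match(text.strip())
    if m is None:
        raise ValueError(f"unrecognized plan line: {text!r}")
    return PlanItem(
        label=m["label"],
        position=m["pos"].lower(),
        line=int(m["line"]),
        anchor=m["anchor"],  # None when the "Cover image" branch matched
        topic=m["topic"].strip(),
    )


item = parse_plan_line('[1] After line 15 | Anchor: "## Birth of Rules" | Topic: Evolution of Rules')
print(item.anchor, item.line)
```

Keeping the line number only as a hint and the anchor as the real locator mirrors the "Key" note above.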
Step 2: Generate All Images in Parallel
Use the Agent tool to launch all image generation subtasks in parallel. For each item in the insertion plan, start one Agent at the same time:
```
Agent 1: generate_single_image.py --title "..." --content "..." --output images/illustration-01.png
Agent 2: generate_single_image.py --title "..." --content "..." --output images/illustration-02.png
Agent 3: generate_single_image.py --title "..." --content "..." --output images/illustration-03.png
...
```
- All Agents run concurrently and do not wait for each other
- Each Agent returns the image path or an error message when it completes
- Expected total time is roughly the time for a single image (10-20s), not N × the single-image time
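The fan-out pattern described above can be sketched outside the Agent tool as well. The example below swaps in a plain thread pool for the Agents, so it is an analogy rather than the skill's actual mechanism. It uses only the `--title`, `--content`, and `--output` flags shown above; `build_command`, `run_generator`, and `generate_all` are hypothetical helpers, and it assumes the script exits with a non-zero status on failure.

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor, as_completed
from typing import Callable, Dict, List, Tuple

Job = Tuple[str, str, str]  # (title, content, output path)


def build_command(title: str, content: str, output: str) -> List[str]:
    """Command line for one image, using the flags shown in this section."""
    return [sys.executable, "scripts/generate_single_image.py",
            "--title", title, "--content", content, "--output", output]


def run_generator(job: Job) -> str:
    """Run the script for one job; raises CalledProcessError on a non-zero exit."""
    title, content, output = job
    subprocess.run(build_command(title, content, output), check=True)
    return output


def generate_all(jobs: List[Job], worker: Callable[[Job], str] = run_generator) -> Dict[str, str]:
    """Start every job at once and collect 'ok' or an error message per output path."""
    results: Dict[str, str] = {}
    with ThreadPoolExecutor(max_workers=max(1, len(jobs))) as pool:
        futures = {pool.submit(worker, job): job for job in jobs}
        # as_completed yields each future as soon as it finishes,
        # so one slow image never blocks handling of the others.
        for fut in as_completed(futures):
            output = futures[fut][2]
            try:
                fut.result()
                results[output] = "ok"
            except Exception as exc:  # record the failure instead of aborting the batch
                results[output] = f"error: {exc}"
    return results
```

With one worker per job, wall-clock time is bounded by the slowest single image rather than the sum, provided the image API does not throttle concurrent requests.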
Step 3: Asynchronously Insert into Original Document
Insert each image as soon as its Agent completes, without waiting for the other Agents:
- Agent completes → Obtain the image path
- Locate the insertion position in the original document via anchor text (not line numbers)
- Use the Edit tool to insert a Markdown image reference (`![...]()`) after the anchor
- Insertion relies on anchor text matching, so earlier insertions do not affect later positioning
Position Shift Handling:
- Each insertion increases the number of lines in the document
- Use anchor text (e.g., `## Birth of Rules`) instead of line numbers for positioning
- Inserting from the end of the document toward the beginning also avoids offset issues
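To illustrate why anchor matching is immune to line shifts, here is a minimal sketch using plain string manipulation rather than the Edit tool. `insert_after_anchor` is a hypothetical helper, and the alt text in the image references is an assumption, since this section does not show the exact reference format.

```python
def insert_after_anchor(text: str, anchor: str, image_ref: str) -> str:
    """Insert image_ref (surrounded by blank lines) after the first line equal to anchor."""
    lines = text.split("\n")
    for i, line in enumerate(lines):
        if line.strip() == anchor:
            # Slice assignment inserts without replacing any existing line.
            lines[i + 1:i + 1] = ["", image_ref, ""]
            return "\n".join(lines)
    raise ValueError(f"anchor not found: {anchor!r}")


doc = "# Title\n\n## Birth of Rules\nText A.\n\n## Commands Packaging\nText B."

# The first insertion pushes "## Commands Packaging" down by three lines...
doc = insert_after_anchor(doc, "## Birth of Rules",
                          "![Evolution of Rules](images/illustration-01.png)")
# ...but the second insertion still lands correctly, because it searches for text.
doc = insert_after_anchor(doc, "## Commands Packaging",
                          "![Workflow packaging](images/illustration-02.png)")
print(doc)
```

This sketch matches only the first occurrence of the anchor, so it assumes anchors are unique within the document; a duplicated heading would need extra context to disambiguate.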
Step 4: Verification and Cleanup
After all images are inserted:
- Verification: Check that every planned `![...]()` reference has been inserted into the original document
- Verification: Check that all image files exist in the `images/` directory
- Success → Delete the backup file `{doc_path}.illustrator-backup`
- Failure → Keep the backup file and report which images failed to generate or insert; the user can restore from the backup
```
Completed: 6/6 illustrations have been inserted into the original document
Backup file cleaned up
```
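The two checks and the conditional cleanup can be expressed as one function. This is a sketch under stated assumptions: image paths in the document are relative to the document's own directory, and `verify_and_cleanup` and `IMAGE_REF` are hypothetical names, not part of the skill's scripts.

```python
import os
import re
from typing import List

# Matches Markdown image references and captures the path: ![alt](path)
IMAGE_REF = re.compile(r"!\[[^\]]*\]\(([^)\s]+)\)")


def verify_and_cleanup(doc_path: str, planned: List[str]) -> List[str]:
    """Return the planned image paths that are missing; delete the backup only if none are."""
    with open(doc_path, encoding="utf-8") as fh:
        referenced = set(IMAGE_REF.findall(fh.read()))

    base = os.path.dirname(os.path.abspath(doc_path))
    missing = [
        path for path in planned
        # An image counts as done only if it is referenced AND the file exists.
        if path not in referenced or not os.path.isfile(os.path.join(base, path))
    ]

    backup = f"{doc_path}.illustrator-backup"
    if not missing and os.path.exists(backup):
        os.remove(backup)  # success: the backup is no longer needed
    return missing
```

An empty return value corresponds to the success message above; a non-empty list leaves the backup in place and names the images to report.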
Configuration Options
Before execution, Claude asks for the following options (or infers them from the user's message):
| Option | Values | Default |
|---|---|---|
| Image Aspect Ratio | 16:9 / 3:4 | 16:9 |
| Include Cover Image | Yes/No | No |
| Number of Content Illustrations | 3-10 | Recommended based on document length |
| Style | gradient-glass / ticket / vector-illustration | gradient-glass |

If the user has already specified options in the request (e.g., "portrait, ticket style, 8 images"), use them directly without asking.
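The option table can be captured as a small validated structure. The sketch below is illustrative only: `IllustratorConfig` and its field names are hypothetical, and a `count` of `None` stands for "recommend based on document length", since the recommendation rule itself is not specified here.

```python
from dataclasses import dataclass
from typing import Optional

ASPECT_RATIOS = ("16:9", "3:4")
STYLES = ("gradient-glass", "ticket", "vector-illustration")


@dataclass
class IllustratorConfig:
    """Options from the table above, with the documented defaults."""
    aspect_ratio: str = "16:9"
    cover: bool = False
    count: Optional[int] = None  # None: let the tool recommend a number
    style: str = "gradient-glass"

    def __post_init__(self) -> None:
        if self.aspect_ratio not in ASPECT_RATIOS:
            raise ValueError(f"aspect_ratio must be one of {ASPECT_RATIOS}")
        if self.style not in STYLES:
            raise ValueError(f"style must be one of {STYLES}")
        if self.count is not None and not 3 <= self.count <= 10:
            raise ValueError("count must be between 3 and 10")


# "portrait, ticket style, 8 images" maps to:
config = IllustratorConfig(aspect_ratio="3:4", style="ticket", count=8)
print(config)
```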
Style Quick Reference
| Style | Keywords | Suitable For |
|---|---|---|
| gradient-glass | Glassmorphism, aurora gradient, tech feel | Technical documents, product introductions |
| ticket | Black-and-white contrast, ticket structure, minimalism | Data reports, infographics |
| vector-illustration | Flat illustration, retro color scheme, geometric shapes | Tutorials, stories, branding |

Style files are located in the `styles/` directory.
Technical Details
| Item | Value |
|---|---|
| API Model | Gemini 2.0 Flash Image Preview |
| 16:9 Resolution | 2560x1440 (2K) / 3840x2160 (4K) |
| 3:4 Resolution | 1920x2560 (2K) / 2880x3840 (4K) |
| Single Image Time | ~10-20s |
| Parallel Processing Time | ~10-20s (total, not multiplied by N) |
| Dependencies | |
| API Key | |
Scripts
- `scripts/generate_single_image.py`: Single image generation (called by Agents in parallel)
- `scripts/generate_illustrations.py`: Legacy sequential batch generation (retained for compatibility)