lovstudio-document-illustrator

Document Illustrator Skill

An AI-powered tool that generates illustrations for documents. It plans all insertion points globally, generates images in parallel, and inserts them asynchronously, so illustrations are added to a document efficiently.

Core Workflow (5 Steps)

Backup → Plan insertion points globally → Generate images in parallel → Insert into the original document asynchronously → Verify and clean up the backup

Step 0: Back Up the Original File

Create a backup before making any changes so the document can be rolled back safely:

python

import shutil
backup_path = f"{doc_path}.illustrator-backup"
shutil.copy2(doc_path, backup_path)

All subsequent operations are performed directly on the original file.
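
The backup only helps if there is a matching way to roll back. The following is a minimal sketch of both halves, assuming the `.illustrator-backup` suffix shown above; `create_backup` and `restore_backup` are hypothetical helper names, not functions shipped with the skill.

```python
import os
import shutil

BACKUP_SUFFIX = ".illustrator-backup"  # suffix used in Step 0


def create_backup(doc_path: str) -> str:
    """Copy the document next to itself and return the backup path."""
    backup_path = f"{doc_path}{BACKUP_SUFFIX}"
    shutil.copy2(doc_path, backup_path)  # copy2 also copies file metadata
    return backup_path


def restore_backup(doc_path: str) -> bool:
    """Hypothetical rollback helper: overwrite the document with its backup.

    Returns False when no backup exists, True after a successful restore.
    """
    backup_path = f"{doc_path}{BACKUP_SUFFIX}"
    if not os.path.exists(backup_path):
        return False
    shutil.copy2(backup_path, doc_path)
    return True
```

The restore step leaves the backup file in place, so deleting it remains a separate, explicit decision.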

Step 1: Determine All Insertion Positions Globally

Read the complete document and plan the insertion positions for all images in a single pass:

Use the Read tool to read the complete document
Have the AI analyze the content structure and identify the core topics
Determine a precise insertion anchor (line number + contextual text) for each topic
Output an insertion plan:

Insertion Plan:
  [1] After line 15 | Anchor: "## Birth of Rules" | Topic: Evolution of Rules
  [2] After line 42 | Anchor: "## Commands Packaging" | Topic: Workflow packaging
  [3] After line 78 | Anchor: "## MCP Dynamic Capabilities" | Topic: Third-party integration
  ...
  [cover] Before line 1 | Cover image | Topic: Full document summary

Key point: anchor each insertion on contextual text rather than on a bare line number, so that later insertions can still be located after earlier ones have shifted the line numbers.
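
One way to represent a plan entry is a small record that keeps the line number only as a hint and treats the anchor text as the real locator. This is an illustrative sketch; `PlanItem` and `locate_anchor` are hypothetical names, and exact matching on the stripped line is an assumption about how anchors are compared.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class PlanItem:
    label: str      # "1", "2", ... or "cover"
    line_hint: int  # line number at planning time; may be stale later
    anchor: str     # text copied verbatim from the document
    topic: str      # what the illustration should depict


def locate_anchor(lines: List[str], item: PlanItem) -> int:
    """Return the current 0-based index of the anchor line, or -1 if absent."""
    for i, line in enumerate(lines):
        if line.strip() == item.anchor:
            return i
    return -1


plan = [
    PlanItem("1", 15, "## Birth of Rules", "Evolution of Rules"),
    PlanItem("2", 42, "## Commands Packaging", "Workflow packaging"),
]
```

Because `locate_anchor` searches the current text, it still finds the heading after earlier insertions have pushed it down.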

Step 2: Generate All Images in Parallel

Use the Agent tool to launch all image generation subtasks in parallel:

For each item in the insertion plan, start an Agent at the same time:
  Agent 1: generate_single_image.py --title "..." --content "..." --output images/illustration-01.png
  Agent 2: generate_single_image.py --title "..." --content "..." --output images/illustration-02.png
  Agent 3: generate_single_image.py --title "..." --content "..." --output images/illustration-03.png
  ...

All Agents run concurrently and do not wait for one another
Each Agent returns the image path, or an error message, when it finishes
Expected total time ≈ the time for a single image (10-20s), not N × the single-image time
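
The skill fans out through the Agent tool. As a rough stand-in for the same idea outside an agent runtime, the sketch below uses a thread pool instead, which is a swapped-in technique rather than the skill's own mechanism. The script path and the `--title`, `--content`, `--output` flags come from the commands above; `build_command`, `run_command`, and `generate_all` are hypothetical names.

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor

SCRIPT = "scripts/generate_single_image.py"


def build_command(index, title, content):
    """Build the command line for one illustration (numbered from 1)."""
    output = f"images/illustration-{index:02d}.png"
    return [sys.executable, SCRIPT,
            "--title", title, "--content", content, "--output", output]


def run_command(cmd):
    """Run one command; return (output_path, error_message_or_None)."""
    output = cmd[-1]
    result = subprocess.run(cmd, capture_output=True, text=True)
    if result.returncode != 0:
        return output, result.stderr.strip() or "generation failed"
    return output, None


def generate_all(items, runner=run_command):
    """items: list of (title, content) pairs. Every command starts at once."""
    commands = [build_command(i, t, c) for i, (t, c) in enumerate(items, start=1)]
    if not commands:
        return []
    # One worker per image, so no command waits for another to finish.
    with ThreadPoolExecutor(max_workers=len(commands)) as pool:
        return list(pool.map(runner, commands))
```

The `runner` parameter exists so the fan-out logic can be exercised with a stub instead of real API calls.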

Step 3: Insert into the Original Document Asynchronously

Insert each image as soon as its Agent finishes, without waiting for the other Agents:

Agent finishes → obtain the image path
Locate the insertion position in the original document by anchor text (not by line number)
Use the Edit tool to insert a Markdown image reference after the anchor:
markdown
```
![Topic description](images/illustration-01.png)
```
Because insertion matches on anchor text, earlier insertions do not affect where later ones are placed

Position Shift Handling:

Each insertion increases the number of lines in the document
Use anchor text (e.g., `## Birth of Rules`) rather than line numbers to locate each position
Inserting from the end of the document toward the beginning also avoids the offset problem
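
To make the anchor-based approach concrete, here is a small sketch of inserting an image reference after a matching line. It operates on a plain string instead of going through the Edit tool, and `insert_after_anchor` is a hypothetical helper; surrounding the image with blank lines is an assumption made to keep it out of the following paragraph.

```python
def insert_after_anchor(text: str, anchor: str, image_ref: str) -> str:
    """Insert image_ref on its own paragraph right after the anchor line.

    The anchor is found by searching the current text, so the result does
    not depend on how many lines earlier insertions have added.
    """
    lines = text.split("\n")
    for i, line in enumerate(lines):
        if line.strip() == anchor:
            lines[i + 1:i + 1] = ["", image_ref, ""]
            return "\n".join(lines)
    raise ValueError(f"anchor not found: {anchor!r}")
```

Since each call re-locates its anchor, the images can be inserted in whatever order the Agents finish and the final document comes out the same.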

Step 4: Verification and Cleanup

After all images are inserted:

Verify that every planned `![...]()` reference has been inserted into the original document
Verify that every image file exists in the `images/` directory
Success → delete the backup file `{doc_path}.illustrator-backup`
Failure → keep the backup file and report which images failed to generate or insert; the user can restore the document from the backup

Completed: 6/6 illustrations have been inserted into the original document
Backup file cleaned up
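
The two checks and the conditional cleanup could look roughly like the sketch below. `verify_and_cleanup` is a hypothetical helper, the regular expression is a simplified matcher for Markdown image references, and image paths are assumed to be relative to the document's directory.

```python
import os
import re

# Simplified pattern: captures the path inside ![alt](path)
IMAGE_REF = re.compile(r"!\[[^\]]*\]\(([^)]+)\)")


def verify_and_cleanup(doc_path, planned_images):
    """Return a list of problems; delete the backup only when it is empty."""
    with open(doc_path, encoding="utf-8") as f:
        referenced = set(IMAGE_REF.findall(f.read()))
    base = os.path.dirname(os.path.abspath(doc_path))

    problems = []
    for image in planned_images:
        if image not in referenced:
            problems.append(f"not inserted: {image}")
        if not os.path.exists(os.path.join(base, image)):
            problems.append(f"file missing: {image}")

    backup = f"{doc_path}.illustrator-backup"
    if not problems and os.path.exists(backup):
        os.remove(backup)  # success: the backup is no longer needed
    return problems
```

On failure the backup is left untouched, matching the rule that the user must still be able to restore the document.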

Configuration Options

Before running, Claude asks for the following options (or infers them from the user's message):

Option	Values	Default
Image Aspect Ratio	16:9 / 3:4	16:9
Include Cover Image	Yes/No	No
Number of Content Illustrations	3-10	Recommended based on document length
Style	gradient-glass / ticket / vector-illustration	gradient-glass

If the user has already specified options in the request (e.g., "portrait, ticket style, 8 images"), use those values directly without asking.
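
The option table can be captured as a small validated record. This is only an illustrative sketch: `IllustratorOptions` is a hypothetical name, and using `None` to mean "recommend a count from the document length" is an assumption about how the default might be encoded.

```python
from dataclasses import dataclass
from typing import Optional

ASPECT_RATIOS = ("16:9", "3:4")
STYLES = ("gradient-glass", "ticket", "vector-illustration")


@dataclass
class IllustratorOptions:
    aspect_ratio: str = "16:9"
    cover: bool = False
    count: Optional[int] = None  # None: recommend based on document length
    style: str = "gradient-glass"

    def __post_init__(self):
        # Reject values outside the ranges listed in the table above.
        if self.aspect_ratio not in ASPECT_RATIOS:
            raise ValueError(f"unsupported aspect ratio: {self.aspect_ratio}")
        if self.style not in STYLES:
            raise ValueError(f"unknown style: {self.style}")
        if self.count is not None and not 3 <= self.count <= 10:
            raise ValueError("count must be between 3 and 10")
```

For the example request above, the inferred options would be `IllustratorOptions(aspect_ratio="3:4", style="ticket", count=8)`.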

Style Quick Reference

Style	Keywords	Suitable For
gradient-glass	Glass morphism, aurora gradient, tech vibe	Technical documents, product introductions
ticket	Black-white contrast, ticket structure, minimalism	Data reports, infographics
vector-illustration	Flat illustration, retro color scheme, geometric	Tutorials, stories, branding

Style files are located in the `styles/` directory.

Technical Details

Item	Value
API Model	Gemini 2.0 Flash Image Preview
16:9 Resolution	2560x1440 (2K) / 3840x2160 (4K)
3:4 Resolution	1920x2560 (2K) / 2880x3840 (4K)
Single Image Time	~10-20s
Parallel Processing Time	~10-20s (total, not multiplied by N)
Dependencies	`pip install google-genai pillow python-dotenv`
API Key	`GEMINI_API_KEY` in `.env` or environment variable
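
As a sketch of how the table's values might be consumed, the snippet below keeps the resolutions in a lookup and reads `GEMINI_API_KEY` from the environment, loading a `.env` file first when `python-dotenv` is installed. `RESOLUTIONS` and `load_api_key` are hypothetical names, and the `environ` parameter is an assumption added to make the function easy to exercise.

```python
import os

# (aspect ratio, quality) -> (width, height), copied from the table above
RESOLUTIONS = {
    ("16:9", "2K"): (2560, 1440),
    ("16:9", "4K"): (3840, 2160),
    ("3:4", "2K"): (1920, 2560),
    ("3:4", "4K"): (2880, 3840),
}


def load_api_key(environ=None):
    """Return GEMINI_API_KEY, or raise if it is not configured."""
    if environ is None:
        try:
            from dotenv import load_dotenv  # provided by python-dotenv
            load_dotenv()  # loads a .env file into os.environ if one is found
        except ImportError:
            pass  # fall back to plain environment variables
        environ = os.environ
    key = environ.get("GEMINI_API_KEY")
    if not key:
        raise RuntimeError("GEMINI_API_KEY is not set")
    return key
```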

Scripts

`scripts/generate_single_image.py`: single-image generation (called by Agents in parallel)
`scripts/generate_illustrations.py`: legacy sequential batch generation (kept for compatibility)