codex-ppt

Codex PPT

Overview

This skill creates image-based PPT decks. Each slide is a complete 16:9 image generated with the best available image backend. The image contains the slide title, key points, and visual composition. The generated images are then assembled into a `.pptx` file with `scripts/assemble_ppt.py`.

Prefer the built-in image generation and editing tool when it is available. If it is unavailable, or if the user explicitly requests API/CLI mode, use this skill's local fallback CLI at `scripts/image_gen.py`.

Use When

Use this skill when the user asks to:

Turn an article, report, paper, document, course note, or rough outline into a PPT.
Create a visually consistent presentation deck.
Generate slides as full-page images.
Produce supporting `outline.md` and `speech.md` files.
Assemble generated slide images into a `.pptx`.

Do not use this skill for ordinary editable PowerPoint layouts where each textbox, chart, or shape must remain separately editable. This workflow prioritizes visual quality and consistency over editability.

Image Generation Backends

This skill supports two image backends:

Built-in image tool, preferred when available. Example tool names: Codex `image_gen`; OpenClaw `image_generate`.
Local API/CLI fallback, using `scripts/image_gen.py`.

Backend selection rules:

Prefer the built-in image tool when available. In Codex, this usually means the built-in `image_gen` tool. In OpenClaw, this may be `image_generate`. Resolution, quality, aspect ratio, slide-edit requests, or the user saying “use `gpt-image-2`” do not require CLI/API fallback.
In Codex, treat the built-in image tool as the preferred `gpt-image-2` path when it is available. If the user has a GPT subscription / Codex environment and asks for `gpt-image-2`, do not switch to `scripts/image_gen.py` only to satisfy the model name.
Use CLI/API fallback only when the built-in tool is unavailable, the user explicitly asks for API/CLI or a third-party OpenAI-compatible proxy, or the requested capability is unavailable in the built-in tool.
Before generating the first image, tell the user which backend you plan to use, why, and ask for confirmation. Do not treat being in a specific agent environment as proof that the built-in image tool is available.
CLI/API fallback loads `~/.codex-ppt-skill/.env` automatically. Run the CLI normally; do not manually parse `.env` or ask for configuration before an error.
Ask for `OPENAI_API_KEY` configuration only after you have intentionally selected CLI/API fallback and that fallback reports missing config, after authentication/base URL/model errors, or when the user explicitly wants to change API settings. Do not mention missing `OPENAI_API_KEY` while the Codex built-in image tool is available. Configure provided values with `scripts/codex_ppt_runtime.py config --api-key`.
For detailed fallback setup after an error, read `docs/image-model-configuration.md`.
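
The selection rules above can be read as a small decision function. The following is a minimal sketch, not part of the skill's scripts; the `BackendContext` class, its field names, and `select_backend` are hypothetical names introduced only for illustration.

```python
from dataclasses import dataclass


@dataclass
class BackendContext:
    # Hypothetical flags summarising what the agent has established so far.
    builtin_available: bool            # the built-in image tool can actually be called
    user_requested_cli: bool = False   # user explicitly asked for API/CLI or a proxy
    capability_missing: bool = False   # built-in tool lacks the requested capability


def select_backend(ctx: BackendContext) -> str:
    """Return "builtin" or "cli" following the backend selection rules."""
    if ctx.user_requested_cli:
        return "cli"
    if not ctx.builtin_available:
        return "cli"
    if ctx.capability_missing:
        return "cli"
    # Model name, resolution, quality, or aspect ratio alone never force the fallback.
    return "builtin"
```

Whatever the function returns, the chosen backend still has to be confirmed with the user before the first image is generated.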

CLI/API fallback commands use the shared runtime environment. Let `{skill_root}` mean the directory containing this `SKILL.md`:

bash

~/.codex-ppt-skill/.venv/bin/python {skill_root}/scripts/image_gen.py generate \
  --model gpt-image-2 \
  --prompt-file {prompt_file} \
  --size 2560x1440 \
  --quality medium \
  --out {base_dir}/{deck_name}/origin_image/slide_01.png

For CLI/API fallback, first make sure dependencies are installed:

bash

python3 {skill_root}/scripts/codex_ppt_runtime.py bootstrap

Use the shared runtime config for real API calls. The fallback CLI loads existing config automatically; only load `docs/image-model-configuration.md` after the CLI reports missing config, when the user explicitly wants to change API key, base URL, or model, or when a real API call reports authentication, permission, base URL, or model availability failure. The fallback CLI accepts model names containing `gpt-image-`, such as `gpt-image-2` or `openai/gpt-image-2`.

The fallback CLI supports:

`generate`: create one or more images from a prompt.
`edit`: edit one or more existing images, optionally with a mask.
`generate-batch`: generate many slide images from a JSONL prompt file.

The fallback CLI defaults to 2K 16:9 landscape output, `2560x1440`, because it keeps slide text clearer while staying below the `gpt-image-2` pixel limit. For 4K landscape slides, use `--size 3840x2160 --quality high` only when the user asks for 4K, text-heavy slides need sharper output, or the default result is blurry. For portrait assets, use `--size 2160x3840` only if the user requests portrait output.
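
As a rough illustration of the size rules above, the flag choice can be sketched as a helper that returns CLI arguments. The function name `size_args` and its parameters are hypothetical, and `--quality medium` for the default case is taken from the example command earlier rather than from a documented default.

```python
def size_args(needs_4k: bool = False, portrait: bool = False) -> list[str]:
    """Pick --size/--quality flags for scripts/image_gen.py (illustrative only)."""
    if portrait:
        # Only when the user explicitly requests portrait output.
        return ["--size", "2160x3840"]
    if needs_4k:
        # User asked for 4K, text-heavy slides, or the default looked blurry.
        return ["--size", "3840x2160", "--quality", "high"]
    # Default: 2K 16:9 landscape.
    return ["--size", "2560x1440", "--quality", "medium"]
```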

Transparent-background requests:

Built-in mode should use a flat chroma-key background and local removal when appropriate.
CLI/API fallback should also prefer chroma-key generation plus `scripts/remove_chroma_key.py` for simple opaque subjects.
`gpt-image-2` does not support `--background transparent`. If the user needs true model-native transparency, ask before switching to `--model gpt-image-1.5 --background transparent --output-format png`.
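
To make the chroma-key idea concrete, here is a minimal sketch of local background removal on raw RGB pixels. It is not the implementation of `scripts/remove_chroma_key.py`; the green key color, the tolerance value, and the hard alpha cutoff are all assumptions chosen for illustration.

```python
def remove_chroma_key(pixels, key=(0, 255, 0), tolerance=60):
    """Convert RGB pixels to RGBA, making pixels near the key color transparent.

    `pixels` is an iterable of (r, g, b) tuples. The key color and tolerance
    are illustrative defaults, not values taken from the skill.
    """
    result = []
    for r, g, b in pixels:
        # Squared Euclidean distance to the key color in RGB space.
        dist_sq = (r - key[0]) ** 2 + (g - key[1]) ** 2 + (b - key[2]) ** 2
        alpha = 0 if dist_sq <= tolerance ** 2 else 255
        result.append((r, g, b, alpha))
    return result
```

A hard cutoff like this only suits simple opaque subjects on a flat background, which matches the case the text recommends chroma-key removal for.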

Workflow

1. Understand Source Content

Read the user-provided content fully enough to identify:

Main topic and intended audience
Presentation goal
Required or implied page count
Required style or brand constraints
Any sections that must be included or excluded

If the user did not specify a page count, choose a practical count based on content length. Typical decks are 8-12 slides.

2. Plan The Deck Outline

Create a concise `outline.md` draft before generating images. For each slide, define:

Slide number
Slide title
3-5 key points
Optional visual idea
Layout role and intent, such as cover, agenda, section divider, concept explanation, process, comparison, timeline, data evidence, architecture, case study, summary, or Q&A

Save the draft to `{base_dir}/{deck_name}/outline.md` once the project directory is known. If the output directory is not known yet, show the outline in chat first and write it to `outline.md` immediately after creating the project directory.

Show the outline to the user for confirmation and wait for approval before moving to visual style selection or image generation, unless the user explicitly asked you to skip confirmation. If the user requests changes, update `outline.md` and ask for confirmation again.

Recommended structure:

text

Slide 1: Cover
Slide 2: Context / problem
Slide 3-7: Main argument or sections
Slide 8: Summary / recommendation / closing
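
One way to keep the outline consistent is to hold each slide as structured data and render `outline.md` from it. The sketch below is a hypothetical helper, not part of the skill; the `SlidePlan` fields mirror the per-slide items listed above, and the rendered text layout is an assumption.

```python
from dataclasses import dataclass


@dataclass
class SlidePlan:
    number: int
    title: str
    key_points: list[str]
    layout_role: str          # e.g. cover, process, comparison, summary
    visual_idea: str = ""     # optional


def render_outline(slides: list[SlidePlan]) -> str:
    """Render the deck plan as plain text suitable for outline.md."""
    lines = []
    for s in slides:
        if not 3 <= len(s.key_points) <= 5:
            raise ValueError(f"slide {s.number} needs 3-5 key points")
        lines.append(f"Slide {s.number}: {s.title}")
        lines.append(f"Layout role: {s.layout_role}")
        if s.visual_idea:
            lines.append(f"Visual idea: {s.visual_idea}")
        lines.extend(f"- {point}" for point in s.key_points)
        lines.append("")  # blank line between slides
    return "\n".join(lines)
```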

3. Confirm A Unified Visual Style

Before generating slide images, discuss the visual style with the user. Prefer a multiple-choice question: offer 2-3 concrete style directions and mark one as your recommendation.

Each style option should briefly specify:

Color palette
Layout system
Typography direction
Illustration or image treatment
Decorative elements
Density and whitespace rules

After the user chooses a style, create one final style direction and keep the visual identity consistent across all slide prompts. Keep color palette, typography, texture, icon/illustration language, and overall mood stable. Do not reuse the same layout on every page.

The `references/` directory contains optional style references. Use them as inspiration, not as rigid templates. Adapt the style to the topic and audience.

Important: a deck should have one coherent visual identity, not one repeated composition. Treat each reference as a style system: stable palette, typography, icon language, texture, and visual mood; variable page layout chosen from the slide's content role. `layout_blueprints` are candidate starting points only. Do not apply the same blueprint to every slide.

Available references:

`references/清爽专业风.md`
`references/创意杂志风.md`
`references/电子墨水杂志风.md`
`references/数据仪表盘风.md`
`references/科研答辩风.md`
`references/复古扁平插画风.md`
`references/手绘技术解释风.md`
`references/手绘白板风.md`
`references/温暖手工风.md`

Example style confirmation:

text

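I recommend A because it best fits this content's audience and communication goals.

A. 清爽专业风 (clean professional, recommended): light background, blue-green accent colors, clear structure; suited to reports, thesis defenses, and technical talks.
B. 创意杂志风 (creative magazine): large headlines, strong imagery, bolder whitespace; suited to sharing and publicity.
C. 数据仪表盘风 (data dashboard): metric cards and chart-like layouts; suited to data-heavy reports.

Which one do you choose? You can also name the colors, layout, or illustration direction you want adjusted, or upload an image of a PPT style you like for me to reference.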

4. Confirm Image Backend Before Generation

Before generating any slide image, ask the user to confirm the image backend. Keep the confirmation short and concrete:

text

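I plan to generate the sample with the built-in image generation tool: usually image_gen in Codex and image_generate in OpenClaw. The tool can be called directly in the current environment, so no third-party API configuration is needed. May I start generating one sample slide?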

If using CLI/API fallback, say that explicitly and name the configured target:

text

I plan to generate the sample with the local API/CLI fallback, reading the OPENAI_BASE_URL / CODEX_PPT_IMAGE_MODEL settings from ~/.codex-ppt-skill/.env. May I start generating one sample slide?

Wait for confirmation before generating the sample slide. If the user questions the backend, resolve that before continuing.

5. Generate One Sample Slide For Approval

After the outline, style, and image backend are confirmed, generate exactly one sample slide image before full production.

Sample slide requirements:

Use the confirmed style description.
Prefer a representative content slide over the cover when possible.
Demonstrate the intended deck rhythm: the sample should show how the chosen style adapts to a real content page, not just a generic fixed template.
Save it directly as the intended final slide filename, such as `{base_dir}/{deck_name}/origin_image/slide_08.png`. In CLI/API fallback mode, use `scripts/image_gen.py generate --out` for that exact path.
Show the sample image to the user.
Ask the user to confirm the visual style, typography, layout density, and Chinese text quality.

Do not generate the full deck until the user approves the sample slide. If the user requests changes, revise the style description and regenerate that same `slide_XX.png` file first. Once approved, keep that file as the final slide for its page. Do not create `sample_slide.png` in `origin_image/`, because the assembly step is designed around final `slide_XX` filenames.

6. Create The Project Directory

Use this output structure:

text

{base_dir}/{deck_name}/
├── origin_image/
│   ├── slide_01.png
│   ├── slide_02.png
│   └── ...
├── outline.md
├── speech.md
└── {deck_name}.pptx

If the user did not specify a destination, use the current working directory or the directory that contains the source file.

You may initialize the directory structure with:

bash

~/.codex-ppt-skill/.venv/bin/python {skill_root}/scripts/assemble_ppt.py {base_dir} {deck_name}.pptx --init
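
If the directory has to be prepared by hand, the structure above can be created with a few lines of Python. This is a minimal sketch under the assumption that only the project folder and `origin_image/` need to exist up front; it does not claim to reproduce what `--init` does internally, and `init_project` is a hypothetical name.

```python
from pathlib import Path


def init_project(base_dir: str, deck_name: str) -> Path:
    """Create {base_dir}/{deck_name}/origin_image/ and return the project path."""
    project = Path(base_dir) / deck_name
    # parents=True also creates the project folder; exist_ok makes reruns safe.
    (project / "origin_image").mkdir(parents=True, exist_ok=True)
    return project
```

`outline.md`, `speech.md`, and the `.pptx` are written into the returned directory by later steps.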

7. Generate All Slide Images

Generate one image per slide with the selected image backend. Every final `slide_XX.png` must be produced by the built-in image tool or by `scripts/image_gen.py`; programmatic rendering or hybrid text overlay is not acceptable for slide image creation.

Use a structured visual brief for each slide. Image generation works best when the prompt separates canvas, style, layout, text, visual elements, and constraints instead of relying only on a long style paragraph.

Keep the deck visually coherent but vary slide layouts according to page semantics. Treat style references and `layout_blueprints` as candidate patterns, not fixed templates. Across a normal deck, deliberately mix suitable page types such as:

cover / section divider
context or problem framing
process or timeline
comparison or tradeoff
data / evidence / KPI
architecture or workflow diagram
summary / conclusion / next steps

Avoid generating every slide as the same three-card layout. For each slide, choose a layout that fits its content and explain that choice in the `layout.intent` field.

json

{
  "type": "16:9 full-slide PowerPoint image",
  "language": "Chinese",
  "canvas": {
    "aspect_ratio": "16:9",
    "use_full_canvas": true,
    "slide_number": "do not render a slide number"
  },
  "style": {
    "name": "{confirmed style name}",
    "visual_direction": "{same final style description for every slide}",
    "color_palette": "{main colors and accent colors}",
    "typography": "{font personality, hierarchy, weight, text alignment}",
    "texture_and_finish": "{flat, paper, dashboard, editorial, whiteboard, etc.}",
    "deck_consistency": "same palette, typography, icon language, texture, and mood across all slides"
  },
  "layout": {
    "role": "{cover, agenda, section divider, concept, process, comparison, timeline, data evidence, architecture, case study, summary, Q&A, etc.}",
    "intent": "{why this page uses this layout: cover, comparison, timeline, data evidence, workflow, summary, etc.}",
    "composition": "{specific layout for this slide}",
    "content_zones": "{title zone, body zone, visual zone, footer or callout zones}",
    "variation_rule": "same style identity as the deck, but vary composition by slide role; do not repeat the same blueprint on adjacent slides unless the content is part of a deliberate repeated sequence",
    "relationship_to_previous_slide": "{new layout, continuation layout, mirrored layout, or deliberate repeated sequence}",
    "spacing": "clear hierarchy, coherent alignment, no overlapping elements"
  },
  "text": {
    "title": "{slide title}",
    "key_points": ["{point 1}", "{point 2}", "{point 3}"],
    "text_quality": "render all Chinese text exactly, clearly, and without garbled characters"
  },
  "visual_elements": {
    "main_visual": "{icons, diagram, chart, illustration, dashboard cards, collage, or other content-specific visual idea}",
    "supporting_elements": "{arrows, cards, callouts, decorative elements, labels}"
  },
  "constraints": [
    "The final image itself must contain the title and key points.",
    "All text must be readable and correctly spelled.",
    "Keep the confirmed style consistent with the rest of the deck.",
    "No watermark, no unrelated logo, no extra slide number."
  ]
}
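
The brief above can be filled in programmatically so that the `style` block stays identical across slides while the `layout` and `text` blocks vary. The following is a minimal sketch with a hypothetical `build_brief` helper; it covers only a subset of the fields in the template.

```python
import json


def build_brief(style: dict, title: str, key_points: list[str],
                role: str, intent: str, composition: str) -> dict:
    """Build one slide brief; `style` is shared by every slide in the deck."""
    return {
        "type": "16:9 full-slide PowerPoint image",
        "language": "Chinese",
        "canvas": {"aspect_ratio": "16:9", "use_full_canvas": True},
        "style": dict(style),  # copied unchanged so the deck identity stays fixed
        "layout": {"role": role, "intent": intent, "composition": composition},
        "text": {"title": title, "key_points": list(key_points)},
    }


def brief_to_prompt(brief: dict) -> str:
    """Serialize the brief; ensure_ascii=False keeps Chinese text readable."""
    return json.dumps(brief, ensure_ascii=False, indent=2)
```

Passing the same `style` dictionary to every call is what keeps palette, typography, and mood stable while each slide gets its own layout.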

Save images as:

text

{base_dir}/{deck_name}/origin_image/slide_01.png
{base_dir}/{deck_name}/origin_image/slide_02.png
...

After each image is generated, copy or move it into `{base_dir}/{deck_name}/origin_image/` immediately. Do not leave final slide images only in a temporary or default generated-images directory.

In CLI/API fallback mode, you may generate slides one at a time or use `generate-batch`. For batch generation, create a JSONL file where each job has a distinct prompt and an `out` value such as `slide_01.png`, then run:

bash

~/.codex-ppt-skill/.venv/bin/python {skill_root}/scripts/image_gen.py generate-batch \
  --input {base_dir}/{deck_name}/image_prompts.jsonl \
  --out-dir {base_dir}/{deck_name}/origin_image \
  --size 2560x1440 \
  --quality medium \
  --concurrency 5

Remove the temporary JSONL prompt file before final delivery unless the user asks to keep it.
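
A JSONL job file for `generate-batch` could be produced along these lines. The `out` key comes from the description above; the `prompt` key name is an assumption about the job schema and should be checked against `scripts/image_gen.py`, and `write_batch_jobs` is a hypothetical helper.

```python
import json


def write_batch_jobs(prompts: list[str], jsonl_path: str) -> int:
    """Write one JSON object per line, one job per slide, in slide order."""
    with open(jsonl_path, "w", encoding="utf-8") as fh:
        for index, prompt in enumerate(prompts, start=1):
            job = {
                "prompt": prompt,                  # assumed key name
                "out": f"slide_{index:02d}.png",   # zero-padded final filename
            }
            fh.write(json.dumps(job, ensure_ascii=False) + "\n")
    return len(prompts)
```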

Final slide image naming rules:

Rename final slide images strictly by slide order: `slide_01.png`, `slide_02.png`, `slide_03.png`, ...
Use zero-padded two-digit numbers for normal decks.
The approved sample slide should already have the correct `slide_XX.png` filename and should be reused directly.
Keep rejected variants, drafts, or reference images out of `origin_image/`. If you need to preserve them, place them in the project root or a separate `drafts/` directory.
Before assembling, verify every expected `slide_XX.png` exists in `origin_image/` and that there are no missing or extra final slide images.
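
The pre-assembly check in the last rule can be automated. This is a minimal sketch assuming the expected slide count is known from the outline; `check_slides` is a hypothetical helper, not one of the skill's scripts.

```python
from pathlib import Path


def check_slides(origin_image: str, expected_count: int) -> tuple[list[str], list[str]]:
    """Return (missing, extra) filenames relative to slide_01.png..slide_NN.png."""
    expected = {f"slide_{n:02d}.png" for n in range(1, expected_count + 1)}
    present = {p.name for p in Path(origin_image).iterdir() if p.is_file()}
    missing = sorted(expected - present)
    extra = sorted(present - expected)
    return missing, extra
```

Both lists should be empty before the deck is assembled; anything in `extra` (drafts, samples, rejected variants) belongs outside `origin_image/`.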

For Chinese decks, explicitly ask the image backend to render Chinese text accurately and avoid garbled characters.

8. Quality Check And Repair

Before assembling the PPT, inspect every slide image. Check:

Text is readable and not garbled.
Slide content matches the outline.
Title and key points are not truncated.
Visual style is consistent across slides.
No page number appears unless the user requested one.
Important elements do not overlap.

If a slide has severe text or layout issues, regenerate it with a more constrained prompt. If a slide is mostly correct but has a localized issue, use the selected backend's edit capability when available. In CLI/API fallback mode, use `scripts/image_gen.py edit --image {slide_path} --prompt ... --out {new_slide_path}` and replace the final slide only after validating the edited output.

9. Write Speaker Notes

Make sure `outline.md` reflects the final confirmed deck outline from step 2. Do not recreate it from scratch here.

Create `speech.md` with speaker notes. Keep it useful and concise: 1-3 short paragraphs per slide is usually enough.

Use headings that the assembly script can map back to slide numbers:

markdown

Slide 1: {Title}

{Speaker notes for slide 1}

Slide 2: {Title}

{Speaker notes for slide 2}
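
To illustrate how `Slide N` headings can be mapped back to slide numbers, here is a minimal parsing sketch. It is not the assembly script's actual parser; the regular expression, which accepts an optional run of leading `#` characters and requires the number to be followed by a colon or end of line, is an assumption.

```python
import re

# Matches "Slide 3: Title", "## Slide 3: Title", or a bare "Slide 3".
HEADING_RE = re.compile(r"^#*\s*Slide\s+(\d+)\s*(?::.*)?$")


def parse_speech(text: str) -> dict[int, str]:
    """Map slide number -> speaker notes found under that slide's heading."""
    notes: dict[int, list[str]] = {}
    current = None
    for line in text.splitlines():
        match = HEADING_RE.match(line)
        if match:
            current = int(match.group(1))
            notes[current] = []
        elif current is not None:
            notes[current].append(line)
    return {n: "\n".join(lines).strip() for n, lines in notes.items()}
```

Requiring the colon keeps an ordinary sentence such as "Slide 3 shows the timeline" from being mistaken for a heading.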

10. Assemble The PPT

Run:

bash

~/.codex-ppt-skill/.venv/bin/python {skill_root}/scripts/assemble_ppt.py {base_dir} {deck_name}.pptx --aspect-ratio 16:9

Important:

`{base_dir}` is the parent directory of `{deck_name}/`.
`{deck_name}.pptx` must match the project folder name.
The script reads images from `{base_dir}/{deck_name}/origin_image/`.
The script only reads final images named like `slide_01.png`, `slide_02.png`, etc.; drafts and sample files are ignored.
If `{base_dir}/{deck_name}/speech.md` exists and uses `Slide N` headings, the script writes those notes into the corresponding PPT speaker notes.
The script writes `{base_dir}/{deck_name}/{deck_name}.pptx`.
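
When the assembly command is launched from Python, building the argument list in one place helps keep the `.pptx` name tied to the project folder name. The helper below is a hypothetical sketch that only constructs the arguments shown in the command above; it does not run anything.

```python
from pathlib import Path


def assemble_command(skill_root: str, base_dir: str, deck_name: str,
                     aspect_ratio: str = "16:9") -> list[str]:
    """Build the argv for scripts/assemble_ppt.py using the shared runtime."""
    if aspect_ratio not in {"16:9", "4:3"}:
        raise ValueError(f"unsupported aspect ratio: {aspect_ratio}")
    python = str(Path.home() / ".codex-ppt-skill" / ".venv" / "bin" / "python")
    script = str(Path(skill_root) / "scripts" / "assemble_ppt.py")
    # The .pptx name is derived from deck_name so it always matches the folder.
    return [python, script, base_dir, f"{deck_name}.pptx",
            "--aspect-ratio", aspect_ratio]
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` once the slide images have been verified.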

11. Final Report

Report:

Project directory
PPT file path
Slide image directory
`outline.md` path
`speech.md` path
Number of slides
Confirm which image backend was used: built-in image tool or CLI/API fallback.
Confirm that speaker notes from `speech.md` were written into the PPT, if applicable
Any slides that were regenerated or still have known limitations

Local Script Dependencies

Before running `scripts/assemble_ppt.py` or the CLI/API fallback scripts, make sure the shared runtime exists. If `~/.codex-ppt-skill/.venv/bin/python` is missing, or if importing script dependencies fails, create or refresh the environment:

bash

python3 {skill_root}/scripts/codex_ppt_runtime.py bootstrap

This is an internal setup step for the skill. Do not ask the user to run these commands unless dependency installation fails and user approval or troubleshooting is required.

`assemble_ppt.py` supports `16:9` and `4:3`. Use `16:9` unless the user requests otherwise.

`image_gen.py` loads `~/.codex-ppt-skill/.env` automatically for `OPENAI_API_KEY`, `OPENAI_BASE_URL`, and `CODEX_PPT_IMAGE_MODEL`. Run `python3 {skill_root}/scripts/codex_ppt_runtime.py doctor --check-api` when troubleshooting API access.

Prompting Principles

Keep one global visual style fixed across the deck.
Vary slide composition by page role; style consistency does not mean repeating the same layout.
Use `layout_blueprints` as candidate patterns, not mandatory templates.
Generate one slide per image request.
Prefer concrete visual direction over generic words like "beautiful" or "professional".
For dense content, split across more slides instead of crowding one slide.
Prioritize clarity over decoration.
