Knowledge Comic Creator
Adapted from baoyu-comic for Hermes Agent's tool ecosystem.
Create original knowledge comics with flexible art style × tone combinations.
When to Use
Trigger this skill when the user asks to create a knowledge/educational comic, biography comic, tutorial comic, or uses terms like "知识漫画", "教育漫画", or "Logicomix-style". The user provides content (text, file path, URL, or topic) and optionally specifies art style, tone, layout, aspect ratio, or language.
Reference Images
Hermes' image-generation tool is prompt-only: it accepts a text prompt and an aspect ratio, and returns an image URL. It does NOT accept reference images. When the user supplies a reference image, use it to extract traits in text that get embedded in every page prompt:
Intake: Accept file paths when the user provides them (or pastes images in conversation).
- File path(s) → copy alongside the comic output for provenance
- Pasted image with no path → ask the user for the path, or extract style traits verbally as a text fallback
- No reference → skip this section
Usage modes (per reference):
| Usage | Effect |
|---|---|
| | Extract style traits (line treatment, texture, mood) and append to every page's prompt body |
| | Extract hex colors and append to every page's prompt body |
| | Extract scene composition or subject notes and append to the relevant page(s) |
Record in each page's prompt frontmatter when refs exist:
yaml
references:
  - ref_id: 01
    filename: 01-ref-scene.png
    usage: style
    traits: "muted earth tones, soft-edged ink wash, low-contrast backgrounds"
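A minimal sketch of how extracted traits could be appended to a page's prompt body. The `Reference` class and `append_reference_traits` function are hypothetical names introduced here for illustration; only the field names mirror the frontmatter shown above, and the appended line format is an assumption.

```python
from dataclasses import dataclass


@dataclass
class Reference:
    # Field names mirror the frontmatter keys; the class itself is hypothetical.
    ref_id: str
    filename: str
    usage: str   # e.g. "style"
    traits: str  # text the agent extracted from the image


def append_reference_traits(prompt_body: str, references: list[Reference]) -> str:
    """Return the prompt body with one trait line appended per reference."""
    if not references:
        return prompt_body
    lines = [f"Reference {r.ref_id} ({r.usage}): {r.traits}" for r in references]
    return prompt_body.rstrip() + "\n\n" + "\n".join(lines) + "\n"
```

The image itself never reaches the image model in this sketch; only the text in `traits` does.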
Character consistency is driven by text descriptions in the character definitions file (written in Step 3) that get embedded inline in every page prompt (Step 5). The optional PNG character sheet generated in Step 7.1 is a human-facing review artifact, not an input to the image-generation tool.
Options
Visual Dimensions
| Option | Values | Description |
|---|---|---|
| Art | ligne-claire (default), manga, realistic, ink-brush, chalk, minimalist | Art style / rendering technique |
| Tone | neutral (default), warm, dramatic, romantic, energetic, vintage, action | Mood / atmosphere |
| Layout | standard (default), cinematic, dense, splash, mixed, webtoon, four-panel | Panel arrangement |
| Aspect | 3:4 (default, portrait), 4:3 (landscape), 16:9 (widescreen) | Page aspect ratio |
| Language | auto (default), zh, en, ja, etc. | Output language |
| Refs | File paths | Reference images used for style / palette trait extraction (not passed to the image model). See Reference Images above. |
Partial Workflow Options
| Option | Description |
|---|---|
| Storyboard only | Generate storyboard only, skip prompts and images |
| Prompts only | Generate storyboard + prompts, skip images |
| Images only | Generate images from existing prompts directory |
| Regenerate N | Regenerate specific page(s) only |
Details: references/partial-workflows.md
Art, Tone & Preset Catalogue
- Art styles (6): ligne-claire, manga, realistic, ink-brush, chalk, minimalist. Full definitions at references/art-styles/<style>.md.
- Tones (7): neutral, warm, dramatic, romantic, energetic, vintage, action. Full definitions at references/tones/<tone>.md.
- Presets (5) with special rules beyond plain art+tone:
| Preset | Equivalent | Hook |
|---|---|---|
| ohmsha | manga + neutral | Visual metaphors, no talking heads, gadget reveals |
| wuxia | ink-brush + action | Qi effects, combat visuals, atmospheric |
| shoujo | manga + romantic | Decorative elements, eye details, romantic beats |
| concept-story | manga + warm | Visual symbol system, growth arc, dialogue+action balance |
| four-panel | minimalist + neutral + four-panel layout | 起承转合 structure, B&W + spot color, stick-figure characters |
Full rules at references/presets/<preset>.md; load the file when a preset is picked.
- Compatibility matrix and content-signal → preset table live in references/auto-selection.md. Read it before recommending combinations in Step 2.
File Structure
- Slug: 2-4 words kebab-case from topic
- Conflict: append timestamp (e.g., turing-story-20260118-143052)
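The slug and conflict rules above can be sketched as follows. This is an illustration under stated assumptions: `make_slug` and `resolve_output_dir` are hypothetical helpers, the slug keeps only ASCII letters and digits (a non-English topic would need translating first), and the timestamp format is inferred from the example `turing-story-20260118-143052`.

```python
import re
from datetime import datetime
from pathlib import Path
from typing import Optional


def make_slug(topic: str, max_words: int = 4) -> str:
    """Lowercase the topic and join its first max_words ASCII words with hyphens."""
    words = re.findall(r"[a-z0-9]+", topic.lower())
    return "-".join(words[:max_words])


def resolve_output_dir(base: Path, slug: str, now: Optional[datetime] = None) -> Path:
    """Return base/slug, or base/slug-YYYYMMDD-HHMMSS when base/slug already exists."""
    target = base / slug
    if not target.exists():
        return target
    # Assumed format, inferred from the example in the text above.
    stamp = (now or datetime.now()).strftime("%Y%m%d-%H%M%S")
    return base / f"{slug}-{stamp}"
```

Passing `now` explicitly keeps the function deterministic, which helps when checking the conflict path by hand.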
Contents:
| File | Description |
|---|---|
| | Saved source content (kebab-case slug matches the output directory) |
| | Content analysis |
| | Storyboard with panel breakdown |
| | Character definitions |
| characters/characters.png | Character reference sheet (downloaded from the URL returned by the image-generation tool) |
| prompts/NN-{cover\|page}-[slug].md | Generation prompts |
| NN-{cover\|page}-[slug].png | Generated images (downloaded from the URL returned by the image-generation tool) |
| | User-supplied reference images (optional, for provenance) |
Language Handling
Detection Priority:
- User-specified language (explicit option)
- User's conversation language
- Source content language
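The detection priority above amounts to a first-match lookup. A small sketch, assuming each signal arrives as a language code or `None`; the function name and the `fallback` parameter are hypothetical, since the text does not say what happens when no signal is available.

```python
from typing import Optional


def resolve_language(
    explicit: Optional[str],
    conversation: Optional[str],
    source: Optional[str],
    fallback: str = "en",  # assumption: not specified in the text above
) -> str:
    """Pick the first usable language in priority order: explicit, conversation, source."""
    for candidate in (explicit, conversation, source):
        # "auto" means the user did not choose, so fall through to the next signal.
        if candidate and candidate != "auto":
            return candidate
    return fallback
```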
Rule: Use user's input language for ALL interactions:
- Storyboard outlines and scene descriptions
- Image generation prompts
- User selection options and confirmations
- Progress updates, questions, errors, summaries
Technical terms remain in English.
Workflow
Progress Checklist
Comic Progress:
- [ ] Step 1: Setup & Analyze
  - [ ] 1.1 Analyze content
  - [ ] 1.2 Check existing directory
- [ ] Step 2: Confirmation - Style & options ⚠️ REQUIRED
- [ ] Step 3: Generate storyboard + characters
- [ ] Step 4: Review outline (conditional)
- [ ] Step 5: Generate prompts
- [ ] Step 6: Review prompts (conditional)
- [ ] Step 7: Generate images
  - [ ] 7.1 Generate character sheet (if needed) → characters/characters.png
  - [ ] 7.2 Generate pages (with character descriptions embedded in prompt)
- [ ] Step 8: Completion report
Flow
Input → Analyze → [Check Existing?] → [Confirm: Style + Reviews] → Storyboard → [Review?] → Prompts → [Review?] → Images → Complete
Step Summary
| Step | Action | Key Output |
|---|---|---|
| 1.1 | Analyze content | Saved source content, content analysis |
| 1.2 | Check existing directory | Handle conflicts |
| 2 | Confirm style, focus, audience, reviews | User preferences |
| 3 | Generate storyboard + characters | Storyboard, character definitions |
| 4 | Review outline (if requested) | User approval |
| 5 | Generate prompts | Prompt files under prompts/ |
| 6 | Review prompts (if requested) | User approval |
| 7.1 | Generate character sheet (if needed) | characters/characters.png |
| 7.2 | Generate pages | Page PNG files |
| 8 | Completion report | Summary |
User Questions
Use the clarification tool to confirm options. Since it handles one question at a time, ask the most important question first and proceed sequentially. See references/workflow.md for the full Step 2 question set.
Timeout handling (CRITICAL): the tool can return "The user did not provide a response within the time limit. Use your best judgement to make the choice and proceed." This is NOT user consent to default everything.
- Treat it as a default for that one question only. Continue asking the remaining Step 2 questions in sequence; each question is an independent consent point.
- Surface the default to the user visibly in your next message so they have a chance to correct it, e.g. "Style: defaulted to ohmsha preset (clarify timed out). Say the word to switch." An unreported default is indistinguishable from never having asked.
- Do NOT collapse Step 2 into a single "use all defaults" pass after one timeout. If the user is genuinely absent, they will be equally absent for all five questions — but they can correct visible defaults when they return, and cannot correct invisible ones.
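The timeout rules above can be sketched as a loop that keeps asking after a timeout and collects a visible notice for every default it applies. Everything named here is hypothetical: `Question`, `ask_all`, and the `ask` callable, which is assumed to wrap the question tool and return `None` when the tool reports a timeout.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Question:
    key: str
    text: str
    default: str


def ask_all(
    questions: list[Question],
    ask: Callable[[str], Optional[str]],
) -> tuple[dict[str, str], list[str]]:
    """Ask every question in order; a timeout defaults that one question only."""
    answers: dict[str, str] = {}
    notices: list[str] = []
    for q in questions:
        reply = ask(q.text)  # assumed to return None on timeout
        if reply is None:
            answers[q.key] = q.default
            # Record the default so it can be surfaced in the next message.
            notices.append(
                f"{q.key}: defaulted to {q.default} (question timed out). "
                "Say the word to switch."
            )
        else:
            answers[q.key] = reply
    return answers, notices
```

The loop never short-circuits: a timeout on the first question still leaves the later questions asked, and every applied default ends up in `notices` for the user to correct.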
Step 7: Image Generation
Use Hermes' built-in image-generation tool for all image rendering. Its schema accepts only a text prompt and an aspect ratio; it returns a URL, not a local file. Every generated page or character sheet must therefore be downloaded to the output directory.
Prompt file requirement (hard): write each image's full, final prompt to a standalone file under prompts/ (naming: NN-{cover|page}-[slug].md) BEFORE calling the image-generation tool. The prompt file is the reproducibility record.
Aspect ratio mapping: the storyboard's aspect ratio field maps to the image-generation tool's format as follows:
| Storyboard ratio | format |
|---|
| , , | |
| , , | |
| |
Download step, after every image-generation call:
- Read the URL from the tool result
- Fetch the image bytes using an absolute output path, e.g. curl -fsSL "<url>" -o /abs/path/to/comic/<slug>/NN-page-<slug>.png
- Verify the file exists and is non-empty at that exact path before proceeding to the next page
Never rely on shell CWD persistence for paths. The terminal tool's persistent-shell CWD can change between batches (session expiry, TERMINAL_LIFETIME_SECONDS, a failed directory change that leaves you in the wrong directory). curl -o relative/path.png is a silent footgun: if CWD has drifted, the file lands somewhere else with no error.
Always pass a fully-qualified absolute path to curl -o, or pass an explicit working directory to the terminal tool. Incident Apr 2026: pages 06-09 of a 10-page comic landed at the repo root instead of the comic output directory because batch 3 inherited a stale CWD from batch 2 and curl -o 06-page-skills.png wrote to the wrong directory. The agent then spent several turns claiming the files existed where they didn't.
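For comparison with the curl command, here is a sketch of the same download-and-verify step in Python's standard library. The function name `download_image` is hypothetical. It refuses relative destinations outright, which is one way to make the CWD problem described above fail loudly instead of silently.

```python
import urllib.request
from pathlib import Path


def download_image(url: str, dest: Path, timeout: float = 60.0) -> Path:
    """Fetch url into dest and verify a non-empty file exists at that exact path."""
    if not dest.is_absolute():
        # A relative path would depend on the current working directory.
        raise ValueError(f"destination must be an absolute path: {dest}")
    dest.parent.mkdir(parents=True, exist_ok=True)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        data = response.read()
    if not data:
        raise RuntimeError(f"empty response from {url}")
    dest.write_bytes(data)
    # Re-check on disk before moving on to the next page.
    if not dest.is_file() or dest.stat().st_size == 0:
        raise RuntimeError(f"no non-empty file at {dest}")
    return dest
```

Raising on a relative path is a design choice of this sketch, not something the tool enforces; the source only requires that the path passed to the download command be absolute.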
7.1 Character sheet: generate it (to characters/characters.png) when the comic is multi-page with recurring characters. Skip for simple presets (e.g., four-panel minimalist) or single-page comics. The character sheet's prompt file must exist before invoking the image-generation tool. The rendered PNG is a human-facing review artifact (so the user can visually verify character design) and a reference for later regenerations or manual prompt edits; it does not drive Step 7.2. Page prompts are already written in Step 5 from the text descriptions in the character definitions file; the image-generation tool cannot accept images as visual input.
7.2 Pages: each page's prompt MUST already be at prompts/NN-{cover|page}-[slug].md before invoking the image-generation tool. Because the tool is prompt-only, character consistency is enforced by embedding character descriptions (sourced from the character definitions file) inline in every page prompt during Step 5. The embedding is done uniformly whether or not a PNG sheet is produced in 7.1; the PNG is only a review/regeneration aid.
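A sketch of the two rules in this step: character descriptions are embedded as text in the prompt, and the prompt is written to its file before any image call. The helper names and the exact wording of the embedded section are assumptions; the file naming follows the prompts/NN-{cover|page}-[slug].md pattern, with NN assumed to be a zero-padded two-digit number.

```python
from pathlib import Path


def render_page_prompt(scene: str, characters: dict[str, str]) -> str:
    """Return the scene text followed by one inline description per character."""
    lines = [scene.strip()]
    if characters:
        lines.append("")
        lines.append("Characters (keep identical on every page):")
        lines.extend(f"- {name}: {desc}" for name, desc in characters.items())
    return "\n".join(lines) + "\n"


def write_prompt_file(prompts_dir: Path, number: int, kind: str, slug: str, prompt: str) -> Path:
    """Write the final prompt to prompts_dir/NN-kind-slug.md and return the path."""
    if kind not in ("cover", "page"):
        raise ValueError(f"kind must be 'cover' or 'page', got {kind!r}")
    prompts_dir.mkdir(parents=True, exist_ok=True)
    path = prompts_dir / f"{number:02d}-{kind}-{slug}.md"
    path.write_text(prompt, encoding="utf-8")
    return path
```

Calling `write_prompt_file` first and passing the returned file's contents to the image tool keeps the saved prompt and the rendered prompt identical.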
Backup rule: existing prompt and image files → rename with a backup suffix before regenerating.
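The backup rule might look like the sketch below. The suffix format did not survive in the text above, so `-backup-YYYYMMDD-HHMMSS` is purely an assumption for illustration, as is the function name `backup_existing`.

```python
from datetime import datetime
from pathlib import Path
from typing import Optional


def backup_existing(path: Path, now: Optional[datetime] = None) -> Optional[Path]:
    """Rename path to a timestamped backup; return the new path, or None if absent."""
    if not path.exists():
        return None
    # Hypothetical suffix format; substitute whatever the workflow reference specifies.
    stamp = (now or datetime.now()).strftime("%Y%m%d-%H%M%S")
    backup = path.with_name(f"{path.stem}-backup-{stamp}{path.suffix}")
    path.rename(backup)
    return backup
```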
Full step-by-step workflow (analysis, storyboard, review gates, regeneration variants): references/workflow.md.
References
Core Templates:
- analysis-framework.md - Deep content analysis
- character-template.md - Character definition format
- storyboard-template.md - Storyboard structure
- ohmsha-guide.md - Ohmsha manga specifics
Style Definitions:
- references/art-styles/<style>.md - Art styles (ligne-claire, manga, realistic, ink-brush, chalk, minimalist)
- references/tones/<tone>.md - Tones (neutral, warm, dramatic, romantic, energetic, vintage, action)
- references/presets/<preset>.md - Presets with special rules (ohmsha, wuxia, shoujo, concept-story, four-panel)
- Layouts (standard, cinematic, dense, splash, mixed, webtoon, four-panel)
Workflow:
- workflow.md - Full workflow details
- auto-selection.md - Content signal analysis
- partial-workflows.md - Partial workflow options
Page Modification
| Action | Steps |
|---|---|
| Edit | Update prompt file FIRST → regenerate image → download new PNG |
| Add | Create prompt at position → generate with character descriptions embedded → renumber subsequent → update storyboard |
| Delete | Remove files → renumber subsequent → update storyboard |
IMPORTANT: When updating pages, ALWAYS update the prompt file (prompts/NN-{cover|page}-[slug].md) FIRST before regenerating. This ensures changes are documented and reproducible.
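The "renumber subsequent" step for an added page can be sketched as a pure function that plans the renames without touching the disk. The function name and the regular expression are assumptions based on the NN-{cover|page}-[slug] naming pattern, with NN taken to be two digits.

```python
import re

# Matches names such as 03-page-intro.md or 03-page-intro.png.
PAGE_NAME = re.compile(r"^(\d{2})-(cover|page)-(.+)\.(md|png)$")


def renumber_for_insert(filenames: list[str], position: int) -> list[tuple[str, str]]:
    """Return (old, new) renames shifting every file numbered >= position up by one."""
    plan = []
    for name in filenames:
        match = PAGE_NAME.match(name)
        if match and int(match.group(1)) >= position:
            number, kind, slug, ext = match.groups()
            plan.append((name, f"{int(number) + 1:02d}-{kind}-{slug}.{ext}"))
    # Highest numbers first, so no rename overwrites a file that has not moved yet.
    plan.sort(key=lambda pair: pair[0], reverse=True)
    return plan
```

Applying the plan in the returned order frees the target number before the new page's prompt file is created at that position. Files that do not match the pattern are left out of the plan.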
Pitfalls
- Image generation: 10-30 seconds per page; auto-retry once on failure
- Always download the URL returned by the image-generation tool to a local PNG: downstream tooling (and the user's review) expects files in the output directory, not ephemeral URLs
- Use absolute paths for curl -o; never rely on persistent-shell CWD across batches. Silent footgun: files land in the wrong directory and a subsequent listing of the intended path shows nothing. See Step 7 "Download step".
- Use stylized alternatives for sensitive public figures
- Step 2 confirmation required - do not skip
- Steps 4/6 conditional - only if user requested in Step 2
- Step 7.1 character sheet - recommended for multi-page comics, optional for simple presets. The PNG is a review/regeneration aid; page prompts (written in Step 5) use the text descriptions in the character definitions file, not the PNG. The image-generation tool does not accept images as visual input
- Strip secrets — scan source content for API keys, tokens, or credentials before writing any output file
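A rough sketch of the secret-stripping pass mentioned in the last item. The two patterns are illustrative assumptions, not a complete detector: one matches strings shaped like AWS access key IDs, the other matches simple `name = value` assignments for a few common credential names. The names `SECRET_PATTERNS` and `redact_secrets` are hypothetical.

```python
import re

# Illustrative patterns only; a real scan would need a broader, maintained list.
SECRET_PATTERNS = [
    re.compile(r"AKIA[0-9A-Z]{16}"),
    re.compile(r"(?i)\b(api[_-]?key|token|secret|password)\b\s*[:=]\s*\S+"),
]


def redact_secrets(text: str, placeholder: str = "[REDACTED]") -> tuple[str, int]:
    """Replace every pattern match with the placeholder; return text and match count."""
    count = 0
    for pattern in SECRET_PATTERNS:
        text, replaced = pattern.subn(placeholder, text)
        count += replaced
    return text, count
```

A non-zero count is a signal to stop and review the source content before writing any output file.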