image-generator-sd-webui


Image Generator (sd-webui API)

Overview

Drive a Stable Diffusion WebUI / Forge server through its REST API to enumerate available resources, run `txt2img`, poll progress, and interrupt jobs. All scripts under `scripts/` are thin `curl` wrappers; they print JSON or extracted fields to stdout so the agent can pipe or parse them.

Server connection

Before doing anything, confirm the server URL (and optional HTTP Basic Auth) with the user. Pass them as environment variables to every script:
bash
export SD_WEBUI_URL="http://localhost:7860"   # required, no trailing slash
export SD_WEBUI_USER=""                       # optional, HTTP Basic Auth
export SD_WEBUI_PASS=""                       # optional
If unset, scripts default to `http://localhost:7860` with no auth.
Quick connectivity test (returns `OK <url>` on success, exits non-zero on failure):
bash
scripts/probe.sh
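The connection settings above could be resolved inside a script roughly as follows. This is a minimal sketch, not the actual contents of the skill's scripts: the function names `sd_base_url` and `sd_auth_args` are hypothetical, and it assumes that an empty `SD_WEBUI_USER` means no authentication.

```bash
# Hypothetical helpers, not copied from scripts/.

# Print the base URL: fall back to the documented default and strip one trailing slash.
sd_base_url() {
  url="${SD_WEBUI_URL:-http://localhost:7860}"
  printf '%s\n' "${url%/}"
}

# Print curl's Basic Auth arguments (one per line) only when a user name is set.
sd_auth_args() {
  if [ -n "${SD_WEBUI_USER:-}" ]; then
    printf '%s\n' "-u" "${SD_WEBUI_USER}:${SD_WEBUI_PASS:-}"
  fi
}
```

Stripping the slash in one place means a URL entered with a trailing slash still produces well-formed endpoint paths such as `/sdapi/v1/samplers`.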

Workflow

  1. Probe: Verify the server is reachable (`scripts/probe.sh`). On failure, ask the user for the correct URL / credentials.
  2. Enumerate & choose: List the resources to pick from (models, modules, samplers, schedulers, styles) and ask the user to choose. Capture their choice verbatim as the API's English `name` / `title` / `model_name`; sd-webui matches these exactly, so do not translate or rename them.
  3. Prompt: Obtain the positive prompt, negative prompt, and any extra params (steps, CFG, size). See "Prompt engineering" for sourcing these.
  4. Generate: Call `scripts/generate.sh` with a request JSON. It returns a JSON object containing the base64 PNG image and the generation `info`.
  5. (Optional) Track progress: While generation is running (in another shell / background), call `scripts/progress.sh` to print `progress` (0–1), `eta_relative`, and `state`.
  6. (Optional) Cancel: Call `scripts/cancel.sh` to interrupt the current job.

Tasks

Listing available resources

| User wants | Command | API endpoint |
| --- | --- | --- |
| Checkpoints (models) | `scripts/list.sh models` | `GET /sdapi/v1/sd-models` → array of `{title, model_name, hash, ...}` |
| Extra modules (TE / VAE, Forge-only) | `scripts/list.sh modules` | `GET /sdapi/v1/sd-modules` → array of `{model_name, ...}` |
| Samplers | `scripts/list.sh samplers` | `GET /sdapi/v1/samplers` → array of `{name, aliases}` |
| Schedulers | `scripts/list.sh schedulers` | `GET /sdapi/v1/schedulers` → array of `{name, label}` |
| Style presets | `scripts/list.sh styles` | `GET /sdapi/v1/prompt-styles` → array of `{name, prompt, negative_prompt}` |
| Upscalers | `scripts/list.sh upscalers` | `GET /sdapi/v1/upscalers` |
| LoRAs | `scripts/list.sh loras` | `GET /sdapi/v1/loras` |
| Embeddings | `scripts/list.sh embeddings` | `GET /sdapi/v1/embeddings` |
`scripts/list.sh <kind>` prints the canonical English identifier for each entry, one per line, ready to pipe to `column`, `fzf`, etc. Add `--json` for the raw JSON.
After listing, present the options to the user (use `ask_user` with an enum if the list is short). For models, prefer the full `title` (which embeds the hash suffix, e.g. `anima/animaika_v36.safetensors [d50fb5b9a0]`) over `model_name` because the title is unambiguous. If the user supplies a bare filename without the hash, verify it via `list.sh models` and substitute the exact title before sending it to the API. For schedulers, `list.sh schedulers` prints the human-readable `label` (e.g. `Beta`); both `label` and the lowercase `name` (`beta`) are accepted by the txt2img `scheduler` field.
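The filename-to-title substitution described above might look like the sketch below. The helper name `resolve_model_title` is hypothetical, and the sketch assumes that `list.sh models` prints one full title per line in the `path/file.safetensors [hash]` form, which the text does not confirm.

```bash
# Hypothetical helper: map a bare filename (or an exact title) to the exact checkpoint title.
# Reads candidate titles on stdin, one per line (assumed `list.sh models` output format).
# The body runs in a subshell, so its variables do not leak into the caller.
resolve_model_title() (
  wanted="$1"
  count=0
  match=""
  while IFS= read -r title; do
    base=${title% \[*\]}   # drop the trailing " [hash]" suffix
    base=${base##*/}       # drop any directory prefix
    if [ "$title" = "$wanted" ] || [ "$base" = "$wanted" ]; then
      match="$title"
      count=$((count + 1))
    fi
  done
  # Succeed only on exactly one match, so ambiguous names are rejected.
  [ "$count" -eq 1 ] && printf '%s\n' "$match"
)
```

A call such as `scripts/list.sh models | resolve_model_title animaika_v36.safetensors` would then either print the single matching title or fail, in which case the agent should go back to the user instead of guessing.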

Generating an image (txt2img)

  1. Build a JSON request. Required field: `prompt`. Recommended: `negative_prompt`, `steps`, `cfg_scale`, `width`, `height`, `sampler_name`, `scheduler`, `styles` (array of style names), and `override_settings.sd_model_checkpoint` (model title) / `override_settings.forge_additional_modules` (array of module names, Forge only). See `references/txt2img-parameters.md` for every field.
  2. Run:
    bash
    scripts/generate.sh request.json > result.json
    # or pipe:
    cat request.json | scripts/generate.sh - > result.json
  3. Extract the image (base64 PNG):
    bash
    jq -r '.images[0]' result.json | base64 -d > out.png
  4. The `info` field is a JSON string with `seed`, `all_prompts`, `sampler_name`, etc. Parse it with `jq -r '.info | fromjson' result.json`.
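As an illustration of step 1, a request file might be written like this. The field names come from the list above; every value is a placeholder chosen for the example (the prompt text, step count, CFG scale, size, and the `Euler a` sampler are assumptions, and the checkpoint title is the sample title used earlier), so substitute whatever the user actually selected.

```bash
# Write an example txt2img request. All values are illustrative placeholders.
cat > request.json <<'JSON'
{
  "prompt": "a lighthouse on a cliff at sunset",
  "negative_prompt": "blurry, low quality",
  "steps": 25,
  "cfg_scale": 7,
  "width": 1024,
  "height": 1024,
  "sampler_name": "Euler a",
  "scheduler": "Beta",
  "override_settings": {
    "sd_model_checkpoint": "anima/animaika_v36.safetensors [d50fb5b9a0]"
  }
}
JSON
```

The quoted heredoc delimiter (`'JSON'`) keeps the shell from expanding `$` or backticks inside the prompt text. The resulting file is what step 2 passes to `scripts/generate.sh`.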
Important behaviour notes:
  • `samples_format` pre-pin: sd-webui/Forge validates `samples_format` before applying `override_settings`, so if the server's persistent value is unsupported (e.g. `avif`), txt2img fails. `generate.sh` preemptively `POST`s `samples_format=png` to `/sdapi/v1/options` and redundantly injects `override_settings.samples_format=png`. ⚠️ The pre-pin mutates the server's persistent default to `"png"`; `override_settings_restore_afterwards` cannot undo it. If the user shares the server with clients expecting a different default, restore it manually afterwards: `scripts/options.sh set samples_format '"webp"'`. Convert locally if you need non-PNG output (see "Converting to another format" below).
  • `override_settings_restore_afterwards: true` is forced on by `generate.sh` so the other `override_settings` keys (model checkpoint, modules, VAE) do not stick.
  • Generation is synchronous: the POST blocks until the image is ready. The script uses a 600s curl timeout; override with `SD_WEBUI_TIMEOUT=900 scripts/generate.sh ...`.
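On a shared server, the manual restore mentioned in the first note could be wrapped so it is not forgotten. The sketch below is one possible approach, not part of the skill: `with_samples_format_restored` and the `OPTIONS_CMD` variable are hypothetical, and it assumes that `options.sh get samples_format` prints the value as JSON (quotes included) so that it can be passed back to `options.sh set` unchanged.

```bash
# Hypothetical wrapper: remember samples_format, run a command, then restore the saved value.
# Assumes `options.sh get KEY` prints the value as JSON (e.g. "webp", quotes included).
# OPTIONS_CMD exists only so the sketch can be dry-run against a stub.
with_samples_format_restored() {
  opts="${OPTIONS_CMD:-scripts/options.sh}"
  previous="$("$opts" get samples_format)" || return 1
  "$@"
  status=$?
  "$opts" set samples_format "$previous" >/dev/null
  return "$status"
}
```

Usage would be along the lines of `with_samples_format_restored scripts/generate.sh request.json > result.json`. The wrapper returns the wrapped command's exit status, so a failed generation is still reported after the setting is put back.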

Converting to another format

If the user wants the output in a non-PNG format (WebP, AVIF, JPEG, etc.), do not try to re-enable a different `samples_format` on the server. Instead, convert locally while preserving the embedded sd-webui generation metadata:
  1. Check whether both `format-converter.sh` and `copy-info.sh` are available on `PATH` (e.g. `command -v format-converter.sh && command -v copy-info.sh`).
  2. If both are present, run `format-converter.sh` on the PNG; it calls `copy-info.sh` internally to carry the parameters over. Run `format-converter.sh -h` to see the current usage.
  3. If either is missing, guide the user to install the helper project once: https://github.com/jim60105/sd-image-format-converter. It has system dependencies that must be set up manually, so it can't be auto-installed. After install, both scripts should be on `PATH` and `-h` will show usage.
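The availability check in steps 1 to 3 can be sketched as a small function that reports which helpers are missing. The function name `missing_helpers` is hypothetical; only `command -v` and the two script names come from the text above.

```bash
# Print the name of every listed command that is not on PATH, one per line.
missing_helpers() {
  for cmd in "$@"; do
    command -v "$cmd" >/dev/null 2>&1 || printf '%s\n' "$cmd"
  done
}

missing="$(missing_helpers format-converter.sh copy-info.sh)"
if [ -z "$missing" ]; then
  echo "Both helpers found; run 'format-converter.sh -h' for usage."
else
  echo "Missing helper(s): $(printf '%s' "$missing" | tr '\n' ' ')"
  echo "Install once from https://github.com/jim60105/sd-image-format-converter"
fi
```

Reporting the missing names, instead of a bare pass/fail, lets the agent tell the user exactly what still needs installing.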

Tracking progress

Call from another terminal (or background the `generate.sh` call with `&` first):
bash
scripts/progress.sh                  # one-shot, prints JSON
scripts/progress.sh --watch          # poll every 1s until progress reaches 1.0 or state.job is empty
scripts/progress.sh --watch --interval 2
scripts/progress.sh --field progress # just the numeric 0..1 value
scripts/progress.sh --field state.job
Endpoint: `GET /sdapi/v1/progress?skip_current_image=true`. Key response fields:
  • `progress`: float 0..1, fraction of the current job complete.
  • `eta_relative`: estimated seconds remaining.
  • `state.job`: current job name (empty string when idle).
  • `state.sampling_step` / `state.sampling_steps`: current step index / total.
  • `current_image`: base64 PNG preview of the in-progress image (omitted by the script via `skip_current_image=true` to keep responses small; fetch raw with `curl` if needed).
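The stop rule described for `--watch` (progress reaches 1.0, or `state.job` is empty) can be expressed as a small predicate. This is a sketch of the idea, not the script's actual implementation: `job_finished` and `watch_progress` are hypothetical names, and the loop relies only on the documented `--field` flag.

```bash
# job_finished PROGRESS JOB -> exit status 0 when polling can stop.
job_finished() {
  [ -z "$2" ] && return 0
  # awk does the floating-point comparison that `test` cannot.
  awk -v p="$1" 'BEGIN { exit !(p + 0 >= 1.0) }'
}

# Hypothetical polling loop; the optional argument is the interval in seconds.
watch_progress() {
  interval="${1:-1}"
  while :; do
    progress="$(scripts/progress.sh --field progress)" || return 1
    job="$(scripts/progress.sh --field state.job)" || return 1
    printf 'progress=%s job=%s\n' "$progress" "$job"
    job_finished "$progress" "$job" && break
    sleep "$interval"
  done
}
```

Because the loop makes two separate requests per iteration, the two values can come from slightly different moments. A single JSON fetch parsed once would avoid that, at the cost of a `jq` dependency in the loop.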

Cancelling

bash
scripts/cancel.sh         # POST /sdapi/v1/interrupt — stop current job, return current partial result
scripts/cancel.sh --skip  # POST /sdapi/v1/skip — skip current job in a batch
Note: `interrupt` is cooperative; it tells the sampler to stop at the next step. The pending `generate.sh` call will return with whatever the model produced so far (often a usable but partial image). It does not raise an HTTP error on the txt2img call.
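Since the interrupted call still returns normally, a deadline can be enforced by running the generation in the background and interrupting it once time runs out. The sketch below shows one way to do this; `generate_with_deadline`, `GENERATE_CMD`, and `CANCEL_CMD` are hypothetical names, with the two variables defaulting to the skill's scripts and existing only to allow a dry run against stubs.

```bash
# Hypothetical helper: generate_with_deadline SECONDS REQUEST OUTPUT
# Runs the generation in the background and interrupts it once SECONDS have elapsed.
# The body runs in a subshell, so its variables do not leak into the caller.
generate_with_deadline() (
  deadline="$1"; request="$2"; output="$3"
  "${GENERATE_CMD:-scripts/generate.sh}" "$request" > "$output" &
  pid=$!
  elapsed=0
  while kill -0 "$pid" 2>/dev/null; do
    if [ "$elapsed" -ge "$deadline" ]; then
      "${CANCEL_CMD:-scripts/cancel.sh}" >/dev/null
      break
    fi
    sleep 1
    elapsed=$((elapsed + 1))
  done
  # interrupt is cooperative, so the generate call still exits on its own with the partial result.
  wait "$pid"
)
```

For example, `generate_with_deadline 120 request.json result.json` would leave either the finished or the partial result in `result.json`. The `info` field can then be inspected to decide whether the image is usable.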

Global options (advanced)

`scripts/options.sh` wraps `GET /sdapi/v1/options` and `POST /sdapi/v1/options`:
bash
scripts/options.sh get                                # print all options as JSON
scripts/options.sh get sd_model_checkpoint            # print one key
scripts/options.sh set sd_model_checkpoint '"<title>"' # set one key (value is JSON; string must be quoted)
scripts/options.sh set-json '{"k1":"v1","k2":"v2"}'   # set multiple keys
scripts/options.sh refresh-checkpoints                # POST /sdapi/v1/refresh-checkpoints
Prefer `override_settings` inside the `txt2img` request over `options set`: `override_settings` is request-scoped and reverts after the call, while `options set` persists globally and affects every other client.

Prompt engineering

This skill does not generate or refine prompts. When the user asks for prompt help:
  1. Check whether another agent skill is available for prompt engineering (search by name, e.g. `sd-prompt-builder`, `danbooru-prompt`, `image-prompt-*`). If so, delegate to it.
  2. Otherwise, ask the user for the prompt explicitly, or accept a natural-language description and pass it through verbatim as the `prompt` field. Do not invent Danbooru tags or stylistic modifiers on your own.

References

  • `references/api-endpoints.md`: full sd-webui / Forge endpoint reference with request / response shapes for every endpoint this skill uses, plus useful adjacent ones (`/sdapi/v1/memory`, `/sdapi/v1/png-info`, etc.).
  • `references/txt2img-parameters.md`: every `txt2img` request field including HiRes-fix, refiner, Forge-specific extensions (`forge_additional_modules`, `forge_inference_memory`, `forge_preset`), and `override_settings` keys.
Read these only when constructing a non-trivial request or hitting an error that needs deeper investigation.