kling-3-0

Kling 3.0 - Pro Pack on RunComfy


Kling 3.0 is Kuaishou Technology's third-generation cinematic video model. This skill covers all six Kling 3.0 rendering endpoints on RunComfy: three quality tiers (Standard, Pro, 4K) across two modes (text-to-video and image-to-video).

What Kling 3.0 is


Kling 3.0 is the V3 generation of the Kling video model. It produces multi-shot cinematic video with synchronized native audio, consistent character identity across shots, and physics-aware motion. Compared to Kling 2.x, Kling 3.0 supports longer clips (up to 15 seconds), native 4K output on the 4K tier, and a unified multi-prompt segment system that lets one Kling 3.0 generation contain several distinct scenes with controlled transitions.
Kling 3.0 ships in three rendering tiers on RunComfy, each available as text-to-video or image-to-video:
  • Standard - cheapest tier, up to 1080p output. Use Kling 3.0 Standard for fast iteration, previews, A/B variants, social shorts.
  • Pro - highest fidelity at 1080p. Use Kling V3.0 Pro for hero-quality 1080p clips where motion realism and identity preservation matter most.
  • 4K - native 3840x2160 output. Use Kling V3.0 4K for high-resolution brand films, big-screen cinematic sequences, and finished masters at native resolution.
All three tiers share the same Kling 3.0 multi-shot architecture. Tiers differ in resolution ceiling, motion-fidelity budget, and pricing.

The 6 Kling 3.0 endpoints


Each endpoint corresponds to one (tier, mode) pair. All six endpoints share the same Kling 3.0 base model.
| Endpoint | Anchor | Resolution | Rate (no audio) | Rate (with audio) |
|---|---|---|---|---|
| kling/kling-3.0/standard/text-to-video | Kling 3.0 Standard t2v | up to 1080p | $0.084/s | $0.126/s |
| kling/kling-3.0/standard/image-to-video | Kling 3.0 Standard Image to Video | up to 1080p | $0.084/s | $0.126/s |
| kling/kling-3.0/pro/text-to-video | Kling V3.0 Pro Text-to-Video | 1080p | $0.112/s | $0.168/s |
| kling/kling-3.0/pro/image-to-video | Kling V3.0 Pro Image-to-Video | 1080p | $0.112/s | $0.168/s |
| kling/kling-3.0/4k/text-to-video | Kling V3.0 4K Text-to-Video | 3840x2160 | $0.42/s flat | $0.42/s flat |
| kling/kling-3.0/4k/image-to-video | Kling V3.0 4K Image-to-Video | 3840x2160 | $0.42/s flat | $0.42/s flat |
The 4K tier is priced the same with or without audio. Standard and Pro charge 50% more per second when audio is enabled ($0.126 vs $0.084 and $0.168 vs $0.112).
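Because every rate is per second, the cost of one generation is simply rate times duration. The sketch below works that arithmetic out in shell; `kling_cost` is a hypothetical helper name (not part of the `runcomfy` CLI), and the rates are copied from the table above.

```bash
# Estimate the cost of one Kling 3.0 generation in US dollars.
# Hypothetical helper; rates are taken from the endpoint table.
# Usage: kling_cost <standard|pro|4k> <duration_seconds> <true|false>
kling_cost() {
  tier=$1 duration=$2 audio=$3
  case "$tier:$audio" in
    standard:false)   rate=0.084 ;;
    standard:true)    rate=0.126 ;;
    pro:false)        rate=0.112 ;;
    pro:true)         rate=0.168 ;;
    4k:true|4k:false) rate=0.42 ;;   # flat rate, audio makes no difference
    *) echo "unknown tier or audio flag: $tier $audio" >&2; return 64 ;;
  esac
  # The shell has no floating point, so awk does the multiplication.
  awk -v r="$rate" -v d="$duration" 'BEGIN { printf "%.3f\n", r * d }'
}

kling_cost standard 5 false   # 0.420
kling_cost pro 8 true         # 1.344
kling_cost 4k 15 true         # 6.300
```

Running an estimate like this before a batch makes the gap between tiers concrete: the same 15-second clip is $1.260 on Standard without audio and $6.300 on 4K.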

When to pick which Kling 3.0 tier


Pick a Kling 3.0 tier based on the output's role in the pipeline.
  • Drafts, previews, social shorts, A/B variants: Kling 3.0 Standard. Cheapest. Quality is fine for everything except hero shots.
  • Hero 1080p clips, ad creative, talking heads with high motion fidelity: Kling V3.0 Pro. About 33% more expensive than Standard for noticeably tighter motion and identity hold at the same resolution.
  • 4K brand films, big-screen cinematic, finished masters: Kling V3.0 4K. Native 3840x2160 (no upscale step). Flat $0.42/s makes budgeting predictable. Use only when the output truly needs 4K - it is roughly 5x the cost of Standard.
Pick the mode based on whether you have a source image:
  • Text-to-Video (t2v): prompt only, Kling 3.0 generates the look from scratch. Use Kling 3.0 t2v for novel scenes, brand new compositions, environments without an existing reference.
  • Image-to-Video (i2v): prompt + source image, Kling 3.0 animates the image. Use Kling 3.0 i2v when you have an exact reference (face, product, scene) that must survive into the output.
If the user explicitly asked for Kling 3.0, Kling V3.0, Kling Pro, or Kling 4K, route to this skill regardless.
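The tier and mode choice above reduces to building one of six endpoint paths. A minimal sketch, assuming hypothetical helper names `kling_endpoint` and `kling_mode_for` (neither is part of the CLI); only the path layout comes from the endpoint table.

```bash
# Map a tier and mode to the endpoint path passed to `runcomfy run`.
# Hypothetical helper; accepts short (t2v/i2v) or long mode names.
kling_endpoint() {
  case "$1" in
    standard|pro|4k) tier=$1 ;;
    *) echo "unknown tier: $1" >&2; return 64 ;;
  esac
  case "$2" in
    t2v|text-to-video)  mode=text-to-video ;;
    i2v|image-to-video) mode=image-to-video ;;
    *) echo "unknown mode: $2" >&2; return 64 ;;
  esac
  printf 'kling/kling-3.0/%s/%s\n' "$tier" "$mode"
}

# Pick the mode from whether a source image URL was supplied.
kling_mode_for() {
  if [ -n "$1" ]; then echo i2v; else echo t2v; fi
}

kling_endpoint standard "$(kling_mode_for "")"
# kling/kling-3.0/standard/text-to-video
kling_endpoint 4k "$(kling_mode_for "https://example.com/source.jpg")"
# kling/kling-3.0/4k/image-to-video
```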

Prerequisites


  1. RunComfy CLI: `npm i -g @runcomfy/cli`
  2. RunComfy account: `runcomfy login` opens a browser device-code flow.
  3. CI / containers: set `RUNCOMFY_TOKEN=<token>` instead of running `runcomfy login`.
  4. For i2v endpoints: a publicly fetchable source image URL (HTTPS, JPEG/PNG/WebP).
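These prerequisites can be checked up front in a script. The following is a sketch under assumptions: `kling_have_cmd` and `kling_preflight` are hypothetical helpers, and the token file path is the one this document lists under Security & Privacy. It only checks that a token source exists, not that the token is valid.

```bash
# Return success if the named command is on PATH.
kling_have_cmd() {
  command -v "$1" >/dev/null 2>&1
}

# Preflight: the CLI must be installed, and either RUNCOMFY_TOKEN or a
# saved login token must be present. The CLI name is overridable for testing.
kling_preflight() {
  cli=${1:-runcomfy}
  if ! kling_have_cmd "$cli"; then
    echo "missing CLI: install with npm i -g @runcomfy/cli" >&2
    return 1
  fi
  if [ -z "${RUNCOMFY_TOKEN:-}" ] && [ ! -f "$HOME/.config/runcomfy/token.json" ]; then
    echo "not signed in: run runcomfy login or set RUNCOMFY_TOKEN" >&2
    return 1
  fi
  return 0
}
```

Calling `kling_preflight || exit 1` at the top of a batch script fails fast instead of discovering a missing login partway through.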

Input schema (shared across all 6 Kling 3.0 endpoints)


| Field | Type | Required | Default | Notes |
|---|---|---|---|---|
| prompt | string | yes | - | Text description of scene, motion, camera, atmosphere. Multi-segment prompts supported via `prompt_segments` for scene transitions in one Kling 3.0 generation. |
| image_url | string | yes (i2v only) | - | Source image for Kling 3.0 i2v. HTTPS URL. JPEG/PNG/WebP. |
| tail_image_url | string | no (i2v only) | - | Optional ending image for controlled start-to-end frame transition on Kling 3.0 i2v. |
| negative_prompt | string | no | - | Elements to exclude from the Kling 3.0 output. |
| duration | int | no | 5 | 3-15 seconds per Kling 3.0 generation. |
| aspect_ratio | enum | no | 16:9 | 16:9, 9:16, 1:1, 4:3, 3:4, 21:9. |
| cfg_scale | float | no | 0.5 | Prompt guidance strength. Higher = stricter adherence to prompt. |
| generate_audio | bool | no | false | Enable Kling 3.0 in-pass synchronized audio. Adds cost on Standard and Pro tiers; flat-rate on 4K. |
| seed | int | no | - | Reproducibility for Kling 3.0 variant testing. |
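Several of these constraints can be checked locally before a request is submitted. A minimal sketch, assuming a hypothetical helper `kling_validate`; the limits are the ones in the table above, and returning 65 is a choice made here to mirror the CLI's "bad input" exit code, not documented CLI behavior.

```bash
# Validate duration, aspect ratio, and (for i2v) the image URL scheme.
# Usage: kling_validate <t2v|i2v> <duration> <aspect_ratio> [image_url]
kling_validate() {
  mode=$1 duration=$2 ratio=$3 image_url=${4:-}
  case "$duration" in
    ''|*[!0-9]*) echo "duration must be an integer" >&2; return 65 ;;
  esac
  if [ "$duration" -lt 3 ] || [ "$duration" -gt 15 ]; then
    echo "duration must be 3-15 seconds" >&2; return 65
  fi
  case "$ratio" in
    16:9|9:16|1:1|4:3|3:4|21:9) ;;
    *) echo "unsupported aspect_ratio: $ratio" >&2; return 65 ;;
  esac
  if [ "$mode" = i2v ]; then
    case "$image_url" in
      https://*) ;;
      *) echo "i2v needs an HTTPS image_url" >&2; return 65 ;;
    esac
  fi
  return 0
}

kling_validate t2v 5 16:9 && echo "t2v input ok"
kling_validate i2v 8 9:16 "https://example.com/subject.jpg" && echo "i2v input ok"
```

The check does not fetch the image, so a URL that passes here can still fail server-side if it is not publicly reachable.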

How to invoke each Kling 3.0 endpoint


Kling 3.0 Standard text-to-video (cheapest 1080p draft):

```bash
runcomfy run kling/kling-3.0/standard/text-to-video \
  --input '{
    "prompt": "<Kling 3.0 prompt>",
    "duration": 5,
    "aspect_ratio": "16:9"
  }' \
  --output-dir <absolute/path>
```

Kling 3.0 Standard image-to-video (animate a still):

```bash
runcomfy run kling/kling-3.0/standard/image-to-video \
  --input '{
    "prompt": "<motion description for Kling 3.0 i2v>",
    "image_url": "https://.../source.jpg",
    "duration": 5
  }' \
  --output-dir <absolute/path>
```

Kling V3.0 Pro text-to-video (highest 1080p fidelity):

```bash
runcomfy run kling/kling-3.0/pro/text-to-video \
  --input '{
    "prompt": "<Kling 3.0 Pro prompt>",
    "duration": 8,
    "aspect_ratio": "16:9",
    "generate_audio": true
  }' \
  --output-dir <absolute/path>
```

Kling V3.0 Pro image-to-video (hero animation from source image):

```bash
runcomfy run kling/kling-3.0/pro/image-to-video \
  --input '{
    "prompt": "<motion description for Kling V3.0 Pro i2v>",
    "image_url": "https://.../subject.jpg",
    "duration": 8,
    "generate_audio": true
  }' \
  --output-dir <absolute/path>
```

Kling V3.0 4K text-to-video (native 4K cinematic):

```bash
runcomfy run kling/kling-3.0/4k/text-to-video \
  --input '{
    "prompt": "<Kling V3.0 4K prompt>",
    "duration": 10,
    "aspect_ratio": "16:9",
    "generate_audio": true
  }' \
  --output-dir <absolute/path>
```

Kling V3.0 4K image-to-video (4K animation of a reference image):

```bash
runcomfy run kling/kling-3.0/4k/image-to-video \
  --input '{
    "prompt": "<motion description for Kling V3.0 4K i2v>",
    "image_url": "https://.../source-4k.jpg",
    "duration": 10,
    "generate_audio": true
  }' \
  --output-dir <absolute/path>
```
The CLI submits the Kling 3.0 request, polls every 2s, fetches the result, and downloads any `*.runcomfy.net` / `*.runcomfy.com` URL into `--output-dir`.

Prompting Kling 3.0 - what works


Kling 3.0 responds to specific prompting patterns better than naive prose.
Lead with motion and camera language. Kling 3.0 reads "wide shot, slow push-in", "tracking shot, low angle", "handheld follow" as real directives. Front-load these.
Multi-shot in one Kling 3.0 generation. A single Kling 3.0 prompt can describe a sequence of shots. Number them: "Shot 1: wide of the cafe at dusk. Shot 2: medium close-up of the barista. Shot 3: tight on the espresso pour." Kling 3.0 will preserve identity (face, wardrobe, props) across the shots.
Identity anchors for i2v. When using Kling 3.0 i2v, restate what should remain stable: "preserve the subject's face, pose, and clothing; only the camera moves and the background changes."
`tail_image_url` for controlled endings. On Kling 3.0 i2v, supply a tail image to lock the final frame. Kling 3.0 will interpolate motion from source to tail.
`generate_audio: true` for one-pass dialogue. Describe what Kling 3.0 should produce in audio: "warm friendly tone, English voiceover" or "city ambience, distant traffic, no dialogue." Audio adds cost on Standard / Pro; flat on 4K.
`cfg_scale` tuning. Default 0.5 works for most Kling 3.0 prompts. Raise to 0.7-0.9 for strict prompt adherence on stylized output. Lower to 0.3-0.4 for natural motion when the prompt is loose.
Anti-patterns:
  • Conflicting style cues in one Kling 3.0 prompt -> simplify, pick one or two style anchors.
  • Asking for more than 15 seconds in one Kling 3.0 call -> 422 error; segment the script and stitch.
  • Aspect ratios outside the supported set -> rejected.
  • For Kling V3.0 4K, demanding an aggressive multi-shot story plus 15s plus dialogue plus 6 cuts -> Kling 3.0 will deliver, but cost climbs to about $6.30 per generation (15 s at $0.42/s). Validate with Standard first.
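For scripts longer than the 15-second cap, the "segment and stitch" advice needs a way to choose segment lengths that all stay inside the 3-15 second range. One possible approach, sketched here with a hypothetical helper `kling_segments`, is to use the fewest calls that fit and spread the seconds evenly, which avoids a short leftover segment below the 3-second minimum. It expects a whole number of seconds.

```bash
# Split a total duration into near-equal segments of 3-15 s each.
# Prints one segment duration per line.
kling_segments() {
  total=$1
  if [ "$total" -lt 3 ]; then
    echo "total must be at least 3 seconds" >&2; return 65
  fi
  n=$(( (total + 14) / 15 ))   # fewest calls that fit under the 15 s cap
  base=$(( total / n ))
  extra=$(( total % n ))       # the first $extra segments get one more second
  i=0
  while [ "$i" -lt "$n" ]; do
    if [ "$i" -lt "$extra" ]; then echo $(( base + 1 )); else echo "$base"; fi
    i=$(( i + 1 ))
  done
}

kling_segments 40   # prints 14, 13, 13 on separate lines
```

A naive split of 31 seconds into 15 + 15 + 1 would leave a final segment below the minimum; the even split gives 11 + 10 + 10 instead.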

Where Kling 3.0 shines


| Use case | Best Kling 3.0 endpoint |
|---|---|
| Cinematic 1080p brand stories with consistent characters | Kling V3.0 Pro (t2v or i2v) |
| Native 4K hero films and big-screen cinematic | Kling V3.0 4K (t2v or i2v) |
| Cheap iteration, social-first shorts, A/B variants | Kling 3.0 Standard t2v |
| Animating brand assets, product photos, character art | Kling 3.0 Standard i2v or Kling V3.0 Pro i2v |
| Multi-shot ads with synchronized dialogue in one pass | Kling V3.0 Pro with `generate_audio: true` |
| Premium 4K finished masters with native audio | Kling V3.0 4K with `generate_audio: true` (flat rate) |

Sample Kling 3.0 prompts


Kling 3.0 cinematic multi-shot (Pro tier recommended):

```text
Cinematic multi-shot of a young American couple celebrating their
anniversary at a candlelit rooftop restaurant. Shot 1: wide of the
city skyline at golden hour. Shot 2: medium two-shot, the couple
toasting. Shot 3: tight on the woman's smile, soft bokeh, warm fill
light. Subtle ambient string music, gentle wind, distant traffic.
```

Kling 3.0 i2v (animate a portrait, 4K tier):

```text
Gentle camera dolly-in on the subject from the source image. Subtle
breathing motion, identity-stable features, soft natural light,
shallow depth of field. Background: warm golden-hour glow with a
slow drift of dust motes. No dialogue, only ambient room tone.
```

Kling 3.0 vertical short (Standard tier, 9:16):

```text
9:16 vertical. A barista in a black apron pulls a single espresso
shot, steam rising into morning sun, rich crema slowly forming.
Close-up handheld, shallow depth of field, warm cafe ambience and
the hiss of the steam wand.
```

Kling 3.0 FAQ


What is the maximum duration of a Kling 3.0 clip? 15 seconds per generation across all three tiers. For longer narratives, segment the script into multiple Kling 3.0 calls and stitch.
How is Kling V3.0 4K priced compared to Standard and Pro? Kling V3.0 4K is a flat $0.42 per second whether or not audio is enabled. Standard is $0.084/s without audio (cheapest). Pro is $0.112/s without audio. The 4K tier costs roughly 5x Standard for the resolution upgrade.
Does Kling 3.0 support multi-shot in a single generation? Yes. All Kling 3.0 endpoints accept multi-segment prompts. Number the shots ("Shot 1:", "Shot 2:", etc.) and Kling 3.0 will preserve character identity across them.
Can Kling 3.0 generate audio? Yes. Set `generate_audio: true`. Kling 3.0 produces synchronized dialogue, ambient sound, and music in the same generation pass. On 4K the price stays flat at $0.42/s; on Standard / Pro the rate jumps about 50% with audio.
What aspect ratios does Kling 3.0 support? 16:9, 9:16, 1:1, 4:3, 3:4, 21:9. The 4K tier renders 21:9 as wide cinema crops at native 3840x2160.
Does Kling 3.0 i2v support a tail image? Yes. `tail_image_url` locks the final frame; Kling 3.0 interpolates motion from source to tail.
How is Kling 3.0 different from Kling 2.x? Kling 3.0 has stronger multi-shot identity preservation, longer max duration (15s vs 10s on the 2.x flagship), native 4K on the 4K tier, and unified multi-prompt segment input across all tiers.

Limitations


  • Per-call duration cap 15 seconds on every Kling 3.0 tier.
  • Maximum 6 continuous shots in one Kling 3.0 4K generation.
  • i2v requires a publicly fetchable HTTPS image URL. Local files are not supported.
  • Aspect ratios are fixed to the documented six. Other ratios get cropped or rejected.
  • 4K output files are large. Plan disk and bandwidth before batch Kling V3.0 4K runs.

Exit codes


The `runcomfy` CLI uses sysexits-style codes:

| Code | Meaning |
|---|---|
| 0 | Kling 3.0 generation succeeded |
| 64 | bad CLI args |
| 65 | bad input JSON for Kling 3.0 / schema mismatch |
| 69 | upstream 5xx |
| 75 | retryable: timeout / 429 |
| 77 | not signed in or token rejected |
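A wrapper script can branch on these codes, retrying only the one marked retryable. The sketch below assumes hypothetical helpers `kling_exit_action` and `kling_run_with_retry`, plus a hypothetical `KLING_RETRY_DELAY` variable; the three-attempt limit and the linear backoff are illustrative choices, not documented CLI behavior.

```bash
# Translate a runcomfy exit code into a next step, per the table above.
kling_exit_action() {
  case "$1" in
    0)  echo done ;;
    64) echo fix-cli-args ;;
    65) echo fix-input-json ;;
    69) echo upstream-error ;;
    75) echo retry ;;
    77) echo sign-in ;;
    *)  echo unknown ;;
  esac
}

# Run a command, retrying only on exit code 75 (timeout / 429),
# for at most 3 attempts with a growing pause between them.
kling_run_with_retry() {
  delay=${KLING_RETRY_DELAY:-5}
  attempt=1
  while :; do
    code=0
    "$@" || code=$?
    if [ "$code" -ne 75 ] || [ "$attempt" -ge 3 ]; then
      return "$code"
    fi
    sleep "$(( attempt * delay ))"
    attempt=$(( attempt + 1 ))
  done
}

kling_exit_action 75   # retry
kling_exit_action 77   # sign-in
```

Usage would look like `kling_run_with_retry runcomfy run kling/kling-3.0/standard/text-to-video --input "$json" --output-dir "$out"`, where `$json` and `$out` are your own variables. Codes 64, 65, and 77 are returned immediately because retrying cannot fix them.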

How it works


  1. The skill picks one of six Kling 3.0 endpoints based on the user's tier (Standard / Pro / 4K) and mode (t2v / i2v) intent.
  2. It invokes `runcomfy run kling/kling-3.0/<tier>/<mode>` with a JSON body matching the schema.
  3. The CLI POSTs to the RunComfy Model API with the user's bearer token.
  4. The Model API returns a `request_id`; the CLI polls every 2 seconds until the Kling 3.0 generation finishes.
  5. On terminal status, the CLI fetches the Kling 3.0 result and downloads any `*.runcomfy.net` / `*.runcomfy.com` URL into `--output-dir`.
  6. `Ctrl-C` cancels the in-flight Kling 3.0 request before billing.

Security & Privacy


  • Token storage: `runcomfy login` writes the API token to `~/.config/runcomfy/token.json` with mode 0600. Set the `RUNCOMFY_TOKEN` env var in CI / containers.
  • Input boundary: the Kling 3.0 prompt is passed as JSON via `--input`. The CLI does not shell-expand. No shell-injection surface.
  • Third-party content: image URLs you pass are fetched by the RunComfy server, not by the CLI on your machine. Treat external URLs as untrusted; image-based prompt injection is a known risk for any video model that accepts image inputs.
  • Outbound endpoints: only `model-api.runcomfy.net` (request submission) and `*.runcomfy.net` / `*.runcomfy.com` (download whitelist).
  • Generated-file size cap: the CLI aborts any single download greater than 2 GiB to prevent disk-fill from a runaway Kling 3.0 4K output.
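If you post-process result URLs yourself, the same download whitelist can be applied in a script. This is a sketch of one way to do that, not the CLI's own code: `kling_url_allowed` is a hypothetical helper, and requiring HTTPS is an assumption added here. It strips the path, userinfo, and port before matching, so look-alike hosts are rejected.

```bash
# Return success only for HTTPS URLs whose host is under
# runcomfy.net or runcomfy.com.
kling_url_allowed() {
  case "$1" in
    https://*) ;;
    *) return 1 ;;
  esac
  host=${1#https://}      # drop the scheme
  host=${host%%[/?#]*}    # drop path, query, fragment
  host=${host##*@}        # drop any userinfo
  host=${host%%:*}        # drop any port
  case "$host" in
    *.runcomfy.net|*.runcomfy.com) return 0 ;;
    *) return 1 ;;
  esac
}

kling_url_allowed "https://cdn.runcomfy.net/out/clip.mp4" && echo allowed
kling_url_allowed "https://runcomfy.net.evil.example/x" || echo blocked
```

The `cdn` subdomain in the example is illustrative only; the source names the wildcard patterns but no specific download hosts.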