luma-workflow-viral-remix

Viral Remix (爆款仿写) Workflow

Use this skill when the user wants an agent to turn a topic, account role, or viral reference into a spoken short video.

Read these first when needed:

- `../luma-shared/SKILL.md` for common project, artifact, auth, and failure rules.
- `../luma-content-research/SKILL.md` for topic search and keyword tables.
- `../luma-material/SKILL.md` for local material groups and PIP matching.
- `../luma-digital-human/SKILL.md` for voice, TTS, avatar, and lip-sync.
- `../luma-subtitle/SKILL.md` for subtitle rendering.

When To Use

- The user asks for 爆款仿写 (viral remix), 对标视频 (benchmark-video imitation), 口播短视频 (spoken short video), 种草视频 (product-recommendation video), or a complete video production run.
- The user wants all intermediate files so the agent can inspect and iterate.
- The expected output is a produced video plus cover, not just a script.

Do not use this workflow when the user only asks for one atomic operation such as TTS, subtitle, or material search.

The Script Is the Foundation: Never Skip Research

The core of 爆款仿写 is imitation, not original creation. The script's topic, structure, and pacing must be based on real viral-video data; the AI must never make them up on its own.

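Why Step 0 (Research) Cannot Be Skipped

- A script with no data behind it is a blind guess. You do not know which topics are trending, which structures audiences respond to, or which hooks earn a high click-through rate.
- Imitation requires a benchmark. Step 0 outputs trending keywords, benchmark video links, viral titles, like counts, and a spoken/non-spoken classification. This information determines what Step 1 writes.
- Skipping Step 0 and writing from scratch turns imitation into blind writing. However well TTS, lip-sync, subtitles, and BGM are done afterwards, the effort is wasted if the direction is wrong.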

Agent Execution Rules (Mandatory)

- Step 0 Research cannot be skipped. Whether or not the user explicitly asks for it, run `research run` first. If the user says "just write anything" (随便写一个), refuse and explain that the topic needs data to back it.
- Step 1 must be based on transcripts of selected benchmark videos. Research only discovers candidate viral videos; its output cannot be used directly as the source script. Filter the candidates first, then download and run ASR on at most 3 videos that are highly discussed, highly liked, strongly relevant, and very likely spoken/narrated. The source script must be traceable to these selected transcripts.
- The AI must not invent the source script. Do not write `source_script.txt` from titles, like counts, and keywords alone. If a benchmark video cannot be downloaded or ASR fails, switch to another reference video; if no transcript can be obtained even then, pause and tell the user that there is no basis for the imitation.
- Show the research results to the user. After Step 0, list the keywords found and the top 3 benchmark videos (title plus like count) so the user knows what the topic choice is based on.

Standard Files

- `step0_content_research.json`
- `step0_content_research.csv`
- `step0_keywords.json`
- `step0_keywords.csv`
- `references/ref_01.mp4`
- `references/ref_01_asr.json`
- `references/ref_02.mp4`
- `references/ref_02_asr.json`
- `source_reference_bundle.md`
- `source_script.txt`
- `step1_rewrite.json`
- `transcript.txt`
- `step2_tts.wav`
- `step3_lipsync.mp4`
- `step4_segments.json`
- `step4_scene_units.json`
- `step4_materials_enriched.json`
- `step4_material_matches.json`
- `step4_picture_in_picture_plan.json`
- `step4_picture_in_picture.mp4`
- `step5_subtitle.mp4`
- `step6_bgm.mp4`
- `step7_covers/cover_manifest.json`
- `step7_covers/cover_01.jpg`

Flow

Create or select a project:

```bash
luma-cli project create viral-remix
luma-cli project use viral-remix
```

Research references:

```bash
luma-cli research run --role "<role_or_topic>" --mode precise --date-range 7d --output step0_content_research.json
luma-cli research export --input step0_content_research.json --output step0_content_research.csv
luma-cli research keywords --input step0_content_research.json --output step0_keywords.json --csv step0_keywords.csv
```

Inspect the JSON/CSV and build a shortlist before spending ASR credits. Do not download every result.

Selection rules:
- Choose at most 3 references.
- Prefer videos with high likes, strong discussion potential, clear controversy/curiosity, and close fit to the user's topic/persona.
- Prefer `content_type=口播` (spoken to camera) or videos whose title/description suggests spoken explanation, opinion, review, teaching, story, or analysis.
- Skip pure music or beat-synced (卡点) edits, scenery montages, product-only showcases, dance clips, meme clips, or any video likely to have little reusable spoken copy.
- If all high-like videos are non-spoken, pick a topic cluster only and rerun research with a more 口播-oriented role/query; do not ASR weak references just to fill the quota.
- Record the shortlist and rejection reasons in `source_reference_bundle.md` before downloading.
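
The mechanical part of these rules (spoken content only, most-liked first, at most 3) can be sketched as a small filter. This is a minimal illustration, not a `luma-cli` feature: it assumes the research results have already been reduced to a tab-separated file with the columns like count, `content_type`, and link, which is a hypothetical layout (the real export columns may differ). The function name `shortlist_references` is invented for this example.

```bash
# Hypothetical helper, not a luma-cli command.
# Assumed input ($1): a TSV file with rows "<likes><TAB><content_type><TAB><link>".
shortlist_references() {
  # Keep spoken (口播) rows, sort by like count descending, cap at 3 references.
  awk -F '\t' '$2 == "口播"' "$1" | sort -t "$(printf '\t')" -k1,1nr | head -n 3
}
```

A shortlist produced this way still needs the manual checks above (topic fit, discussion potential, likely spoken copy) before anything is downloaded.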
Download and transcribe the chosen references:

```bash
mkdir -p references
luma-cli --json social download "<reference_1_link>" --output references/ref_01.mp4
luma-cli asr references/ref_01.mp4 --language zh --output references/ref_01_asr.json
luma-cli --json social download "<reference_2_link>" --output references/ref_02.mp4
luma-cli asr references/ref_02.mp4 --language zh --output references/ref_02_asr.json
```

If using a third reference, save it as `references/ref_03.mp4` and `references/ref_03_asr.json`.

`asr` accepts video directly, so a separate local audio-extraction step is not required unless ASR fails on the video file.

After each ASR result, check whether the transcript is actually useful. If the transcript is empty, mostly music/noise, too short to reveal structure, or unrelated to the title, discard that reference and choose another shortlisted video. Stop once 1-3 strong transcripts are available; do not keep spending ASR on weak candidates.
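
The "empty or too short" part of this check can be given a rough first pass in code. The sketch below assumes the transcript text has already been copied out of the ASR JSON into a plain-text file (the JSON schema is not described here, so that extraction is left out), and the 100-character default is an arbitrary placeholder. It cannot detect music/noise or off-topic transcripts; those still need a read-through.

```bash
# Hypothetical helper: succeeds only if the transcript has enough non-whitespace text.
# $1: plain-text transcript file; $2: minimum character count (placeholder default: 100).
transcript_long_enough() {
  min="${2:-100}"
  [ -f "$1" ] || return 1
  # Strip all whitespace, then count what is left.
  # Note: wc -m counts characters in a UTF-8 locale and bytes in the C locale.
  chars=$(( $(tr -d '[:space:]' < "$1" | wc -m) ))
  [ "$chars" -ge "$min" ]
}
```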
Build the source material for rewrite:
- Read each `references/ref_XX_asr.json`.
- Extract the original transcript, hook, argument structure, emotional turn, punchlines, and CTA.
- Write `source_reference_bundle.md` with the selected references, original titles/links, transcript excerpts, and the reason each reference is worth copying.
- Write `source_script.txt` as a grounded source brief: include the user's persona/positioning, the 2-3 reference viewpoints to fuse, reusable structures/hooks from the transcripts, and explicit constraints. Do not invent claims that are absent from the reference transcripts or user brief.
Rewrite the grounded source script:

```bash
luma-cli script rewrite --input source_script.txt --length short --output step1_rewrite.json
```

Save the rewritten text as `transcript.txt` for later subtitle steps (avoids redundant ASR).

Generate speech from the rewritten text:

```bash
luma-cli --json tts --file transcript.txt --voice 男声3 --speech-rate 1.1 --output step2_tts.wav
```

The `--json` flag outputs `audio_object_key`, which can be passed directly to lipsync, avoiding a redundant upload.

Generate digital-human video (use `--audio-key` to reference the cloud audio directly):

```bash
luma-cli lipsync --avatar 数字人男 --audio-key <audio_object_key> --random-start --output step3_lipsync.mp4
```

If `--audio-key` is omitted, lipsync falls back to the project's `latest_tts_key`, then to `--audio` file upload.
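
The fallback order described here can be pictured as a simple precedence chain. The function below only mirrors that documented order for illustration; it is not how `luma-cli` is implemented, the name `resolve_lipsync_audio` is invented, and the error branch for "no source at all" is an assumption.

```bash
# Illustrative only: mirrors the documented precedence, not luma-cli's actual code.
# $1: value given to --audio-key, $2: project's latest_tts_key, $3: path given to --audio.
resolve_lipsync_audio() {
  if [ -n "$1" ]; then
    echo "audio-key: $1"          # explicit key wins
  elif [ -n "$2" ]; then
    echo "latest_tts_key: $2"     # project's most recent TTS output
  elif [ -n "$3" ]; then
    echo "upload: $3"             # last resort: upload a local file
  else
    echo "no audio source available" >&2   # assumed behavior, not documented
    return 1
  fi
}
```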

Segment text and build scene units:

```bash
luma-cli subtitle transcript.txt --text --segments-output step4_segments.json --no-effects --no-highlight
luma-cli pip scene --segments step4_segments.json --output step4_scene_units.json
```

Prepare and match local PIP materials:

```bash
luma-cli material group describe vlm_ai --output step4_materials_enriched.json
luma-cli pip match --scenes step4_scene_units.json --materials step4_materials_enriched.json --mode auto --output step4_material_matches.json
```

Plan and render PIP:

```bash
luma-cli pip plan --segments step4_segments.json --materials step4_materials_enriched.json --match-mode auto --output step4_picture_in_picture_plan.json
luma-cli pip render step3_lipsync.mp4 --plan step4_picture_in_picture_plan.json --output step4_picture_in_picture.mp4
```

If no insert is matched, continue with `step3_lipsync.mp4` as the subtitle input.
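
This fallback can be made explicit so that later steps do not hard-code a file that may not exist. The helper below is a minimal sketch: its name is invented, and it assumes that a skipped PIP step leaves no `step4_picture_in_picture.mp4` behind.

```bash
# Hypothetical helper: print the video that should feed the subtitle step.
# $1: project directory (defaults to the current directory).
pick_subtitle_input() {
  dir="${1:-.}"
  if [ -s "$dir/step4_picture_in_picture.mp4" ]; then
    echo "$dir/step4_picture_in_picture.mp4"   # PIP was rendered
  elif [ -s "$dir/step3_lipsync.mp4" ]; then
    echo "$dir/step3_lipsync.mp4"              # no insert matched; fall back
  else
    echo "neither PIP nor lipsync output found in $dir" >&2
    return 1
  fi
}
```

The printed path can then be used as the first argument of `luma-cli subtitle` in place of a fixed file name.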

Add subtitles (uses `--transcript` to skip ASR since we already have the exact script):

```bash
luma-cli subtitle step4_picture_in_picture.mp4 --transcript transcript.txt --output step5_subtitle.mp4
```

Add BGM:

```bash
luma-cli bgm mix step5_subtitle.mp4 --output step6_bgm.mp4
```

Create a cover:

```bash
luma-cli cover generate step4_picture_in_picture.mp4 --title "<cover_title>" --subtitle "<cover_subtitle>" --count 12 --output-dir step7_covers
```

Cover source rule: use a clean visual video before burned subtitles and BGM. Prefer `step4_picture_in_picture.mp4`; if PIP was skipped, use `step3_lipsync.mp4`. Never use `step5_subtitle.mp4` or `step6_bgm.mp4` as the cover source, because burned subtitles will become part of the cover background.
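
A small guard can enforce the cover source rule before `cover generate` is called. This is a sketch with an invented function name; treating any unrecognized file as unsafe is a design choice of the example, not something the workflow specifies.

```bash
# Hypothetical guard: reject cover sources that already contain burned subtitles.
# $1: path of the video about to be passed to `cover generate`.
is_clean_cover_source() {
  case "$(basename "$1")" in
    step5_subtitle.mp4|step6_bgm.mp4) return 1 ;;                # subtitles are burned in
    step4_picture_in_picture.mp4|step3_lipsync.mp4) return 0 ;;  # clean frames
    *) return 1 ;;                                               # unknown file: refuse rather than guess
  esac
}
```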

Agent Rules

- Keep every intermediate file; do not collapse the flow into one hidden step.
- `source_script.txt` is not the final script and not AI-written from memory. It is the grounded rewrite brief produced from downloaded reference transcripts plus the user's persona/angle.
- If `step0_content_research.json` only has titles/links and no transcript, do not proceed to rewrite until reference videos have been downloaded and ASR has produced usable text.
- ASR is an expensive validation step, not a bulk-processing step. Never ASR more than 3 reference videos for one remix unless the user explicitly approves it.
- Use the rewritten script as the single source for TTS, segmentation, subtitles, and cover text extraction.
- Covers must use clean visual frames, not videos that already contain burned subtitles.
- If the material library does not fit the script, skip PIP instead of forcing weak matches.
- Use `project artifact list` before resuming or rerunning a partial workflow.
- Report exact output paths for all generated files.
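
When resuming, `project artifact list` remains the authoritative view; its output format is not described here. As a local complement, the sketch below checks the working directory for the standard files in step order and prints the first one that is missing. The function name is invented, and leaving the `step4_*` PIP files out of the list is an assumption based on PIP being skippable.

```bash
# Hypothetical helper: print the first expected output that is missing, in step order.
# PIP files are left out because the PIP step may be skipped.
# $1: project directory (defaults to the current directory).
first_missing_artifact() {
  dir="${1:-.}"
  for f in step0_content_research.json source_reference_bundle.md source_script.txt \
           step1_rewrite.json transcript.txt step2_tts.wav step3_lipsync.mp4 \
           step5_subtitle.mp4 step6_bgm.mp4 step7_covers/cover_manifest.json; do
    if [ ! -e "$dir/$f" ]; then
      echo "$f"    # resume from the step that produces this file
      return 0
    fi
  done
  return 0         # nothing printed: every listed file exists
}
```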
