voice-batch-runner
Voice Batch Runner
Follow the shared release-shell rules in:
- postplus-shared release-shell rules
Use this skill after persona, concept, and script work already exists.
This skill is for:
- designing an initial voice from persona traits
- generating script-specific audio takes
- storing reusable voice profiles for later videos
- preparing for future voice-identity capture or timbre-preserving generation
This skill is not for unconstrained voice casting.
Core Idea
Voice should be treated as a first-class persona asset, not a one-off byproduct of one script.
That means the system should separate:
- voice profile: how this persona should sound
- voice identity: the reusable voice source or captured timbre, if available
- voice take: one concrete audio file generated for one script
The script can change every time. The persona voice should remain stable.
Hosted Boundary Rule
- keep request files, raw provider responses, and run manifests under <work-folder>/.postplus/voice-batch-runner/ when they are internal execution state
- keep only final user-facing audio exports outside .postplus/
- if hosted voice capability is unavailable, unauthorized, or returns a persistent network error, stop immediately instead of switching to ad hoc shell glue
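The placement rule above could be expressed as a small path helper. This is a minimal sketch, not the skill's actual implementation: the function name `resolveArtifactPath`, the `kind` values, and the `exports/` folder are all assumptions made for illustration.

```javascript
// Hypothetical helper: decides where an artifact belongs under the boundary rule.
// Internal execution state goes under <work-folder>/.postplus/voice-batch-runner/;
// only final user-facing audio lives outside .postplus/.
const INTERNAL_KINDS = new Set(["request", "response", "manifest"]);

function resolveArtifactPath(workFolder, kind, fileName) {
  const base = workFolder.replace(/\/+$/, ""); // drop trailing slashes
  if (INTERNAL_KINDS.has(kind)) {
    return `${base}/.postplus/voice-batch-runner/${fileName}`;
  }
  if (kind === "final-audio") {
    // "exports" is an assumed folder name, not something the skill defines.
    return `${base}/exports/${fileName}`;
  }
  throw new Error(`Unknown artifact kind: ${kind}`);
}

console.log(resolveArtifactPath("/work/demo", "request", "request.json"));
console.log(resolveArtifactPath("/work/demo", "final-audio", "take-001.wav"));
```

Rejecting unknown kinds, rather than guessing a location, keeps internal state from leaking outside `.postplus/` by accident.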
Skill Family Direction
This skill is the first member of a future voice skill family.
The family can naturally expand into:
- voice-batch-runner: current skill; orchestrates voice generation and persistence
- voice-identity-capture: future skill; captures or normalizes a reusable voice identity from approved reference audio
- voice-review: future skill; audits realism, pacing, and persona fit
For now, keep everything in voice-batch-runner, but design the data model so these can split later.
Fact Rule
Voice generation should be grounded in persona and content evidence.
Required upstream inputs:
- approved persona registry
- script text
- persona voice baseline
- video purpose or lane if it changes delivery style
Do not let the TTS model invent:
- a totally different age or authority level
- ad-like delivery when the persona is a work-friend creator
- high-drama acting not supported by benchmark tone
Source Selection Rule
Use persona and script inputs from the active project context.
If the task clearly belongs to one client or campaign folder, read from that context first.
Do not assume one client directory is the default source base for all voice work.
Voice Objects
This workflow should distinguish three object types.
1. Voice Profile
The durable description of how the persona should sound.
Should include:
- voiceProfileId
- personaId
- style
- pace
- tone
- language
- forbiddenTraits
- sourceBasis
2. Voice Identity
An optional reusable voice source.
This may later point to:
- a provider voice id
- a designed seed voice
- a captured timbre from approved reference audio
Should include:
- voiceIdentityId
- voiceProfileId
- provider
- providerVoiceId or equivalent
- referenceAudioPaths
- status
3. Voice Take
One concrete generated audio output for one script.
Should include:
- voiceTakeId
- voiceProfileId
- voiceIdentityId if used
- scriptId or source path
- audioPath
- requestPath
- responsePath
- manifestPath
- reviewStatus
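To make the three object types concrete, the sketch below shows one record of each and a check that a take points back at its profile and identity. The field names come from the lists above. Every value, including the `status` and `reviewStatus` strings and the helper name `takeIsLinked`, is invented for illustration.

```javascript
// Illustrative records only; all values are made up for this example.
const voiceProfile = {
  voiceProfileId: "vp-001",
  personaId: "persona-001",
  style: "practical, lightly nerdy",
  pace: "medium",
  tone: "friendly and efficient",
  language: "en",
  forbiddenTraits: ["ad-like delivery", "high-drama acting"],
  sourceBasis: "persona registry voice baseline",
};

const voiceIdentity = {
  voiceIdentityId: "vi-001",
  voiceProfileId: voiceProfile.voiceProfileId,
  provider: "example-provider", // placeholder, not a real provider name
  providerVoiceId: "voice-abc",
  referenceAudioPaths: ["voices/reference/approved-01.wav"],
  status: "approved", // assumed status value
};

const voiceTake = {
  voiceTakeId: "vt-001",
  voiceProfileId: voiceProfile.voiceProfileId,
  voiceIdentityId: voiceIdentity.voiceIdentityId, // optional: only if an identity was used
  scriptId: "script-042",
  audioPath: "voices/vt-001/audio/take.wav",
  requestPath: "voices/vt-001/request.json",
  responsePath: "voices/vt-001/response.json",
  manifestPath: "voices/vt-001/manifest.json",
  reviewStatus: "pending", // assumed status value
};

// Returns true when the take references the given profile and, if it names
// an identity, that identity also belongs to the same profile.
function takeIsLinked(take, profile, identity) {
  if (take.voiceProfileId !== profile.voiceProfileId) return false;
  if (take.voiceIdentityId === undefined) return true;
  return (
    identity !== undefined &&
    take.voiceIdentityId === identity.voiceIdentityId &&
    identity.voiceProfileId === profile.voiceProfileId
  );
}

console.log(takeIsLinked(voiceTake, voiceProfile, voiceIdentity));
```

Keeping the profile, identity, and take as separate records joined by ids is what would let the skill family split later without reshaping stored data.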
Default Workflow
1. Start from persona registry
Before generating audio, confirm the persona registry contains:
- voice baseline
- approved image anchor
- intended use cases
If voice baseline is missing, write it first.
2. Create or refine the voice profile
Translate persona traits into a provider-ready voice description.
Example dimensions:
- calm vs energetic
- practical vs polished
- lightly nerdy vs polished professional
- medium pace vs brisk pace
- friendly and efficient vs authoritative
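One way to turn those dimensions into a provider-ready description is to assemble a short sentence from the profile fields. This is only a sketch: the function name `buildVoiceDescription` and the sentence template are assumptions, and a real provider may expect different wording.

```javascript
// Hypothetical mapping from voice-profile fields to a description string.
function buildVoiceDescription(profile) {
  const parts = [
    `${profile.tone} voice`,
    `${profile.style} style`,
    `${profile.pace} pace`,
  ];
  let description = parts.join(", ");
  // Forbidden traits are spelled out so the description states what to avoid.
  if (profile.forbiddenTraits && profile.forbiddenTraits.length > 0) {
    description += `. Avoid: ${profile.forbiddenTraits.join("; ")}`;
  }
  return description + ".";
}

console.log(
  buildVoiceDescription({
    tone: "calm",
    style: "practical",
    pace: "medium",
    forbiddenTraits: ["ad-like delivery"],
  })
);
// prints "calm voice, practical style, medium pace. Avoid: ad-like delivery."
```

Deriving the description from stored fields, instead of writing it by hand for each run, keeps the wording stable across scripts.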
3. Generate an initial voice design
Use a voice-design model to generate a reference voice or first take from:
- text
- voice_description
- language
This first result should be reviewed before being treated as reusable.
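A request for this step might be assembled as follows. The three field names come from the list above; the function name `buildDesignRequest` and the validation behaviour are assumptions, and the real request schema may carry more fields (see references/tool-contracts.md).

```javascript
// Hypothetical builder for a voice-design request.
// It only checks that the three inputs named in this step are non-empty strings.
function buildDesignRequest({ text, voice_description, language }) {
  const fields = { text, voice_description, language };
  for (const [name, value] of Object.entries(fields)) {
    if (typeof value !== "string" || value.trim() === "") {
      throw new Error(`Missing required field: ${name}`);
    }
  }
  return fields;
}

const request = buildDesignRequest({
  text: "Here is the one shortcut I use every morning.",
  voice_description: "calm voice, practical style, medium pace.",
  language: "en",
});
console.log(JSON.stringify(request, null, 2));
```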
4. Generate script-specific voice takes
Once a voice profile or voice identity exists:
- keep the voice stable
- swap in a new script text
- generate a new take for each new video
The text changes. The voice continuity should not.
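The "stable voice, changing text" idea can be sketched as a planner that reuses one voice reference for every script. The function name `planTakes`, the take id format, and the input shapes are assumptions for illustration.

```javascript
// Hypothetical planner: one take per script, all sharing the same voice reference.
function planTakes(voiceRef, scripts) {
  return scripts.map((script) => ({
    voiceTakeId: `take-${script.scriptId}`, // assumed id format
    ...voiceRef, // same voiceProfileId / voiceIdentityId for every take
    scriptId: script.scriptId,
    text: script.text, // only the text changes between takes
  }));
}

const takes = planTakes(
  { voiceProfileId: "vp-001", voiceIdentityId: "vi-001" },
  [
    { scriptId: "s1", text: "First video script." },
    { scriptId: "s2", text: "Second video script." },
  ]
);
console.log(takes.map((t) => `${t.voiceTakeId} -> ${t.voiceIdentityId}`));
```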
5. Review and iterate
Voice assets need structured review, not vague opinions.
Common review categories:
- voice_too_salesy
- voice_too_slow
- voice_too_fast
- voice_not_young_enough
- voice_not_professional_enough
- voice_too_flat
- voice_too_broadcast
- voice_persona_drift
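A structured review record could restrict findings to the categories above. In this sketch the category names are taken from the list. The function name `makeReview`, the record shape, and the `reviewStatus` values are assumptions.

```javascript
// Category names copied from the review list; everything else is illustrative.
const REVIEW_CATEGORIES = new Set([
  "voice_too_salesy",
  "voice_too_slow",
  "voice_too_fast",
  "voice_not_young_enough",
  "voice_not_professional_enough",
  "voice_too_flat",
  "voice_too_broadcast",
  "voice_persona_drift",
]);

// Builds a review record and rejects free-form categories.
function makeReview(voiceTakeId, categories, note) {
  const unknown = categories.filter((c) => !REVIEW_CATEGORIES.has(c));
  if (unknown.length > 0) {
    throw new Error(`Unknown review categories: ${unknown.join(", ")}`);
  }
  return {
    voiceTakeId,
    categories,
    note,
    // Assumed status values: no findings means approved.
    reviewStatus: categories.length === 0 ? "approved" : "needs_revision",
  };
}

console.log(makeReview("vt-001", ["voice_too_fast"], "Rushed in the last sentence."));
```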
Path Selection Rule
Store outputs under the active project's voice asset structure when one already exists.
If no such structure exists yet, use a clear workspace output path and state where files were written.
If the output will become a durable client asset, prefer confirming the destination with the user.
Example Persistence Convention
One possible project-local layout is:
```text
voices/<voice-take-id>/
  request.json
  response.json
  manifest.json
  audio/
  review.json
```
Keep internal request files, raw provider responses, and run manifests under <work-folder>/.postplus/voice-batch-runner/ when they are execution artifacts rather than the final handoff.
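The project-local layout described in this section could be derived from a take id with a helper like the one below. The function name `takeLayout` and the returned property names are assumptions. Only the folder and file names follow the example layout.

```javascript
// Hypothetical helper that maps a take id to the example layout:
// voices/<voice-take-id>/{request,response,manifest,review}.json and audio/
function takeLayout(voiceTakeId, root = "voices") {
  const dir = `${root}/${voiceTakeId}`;
  return {
    dir,
    requestPath: `${dir}/request.json`,
    responsePath: `${dir}/response.json`,
    manifestPath: `${dir}/manifest.json`,
    reviewPath: `${dir}/review.json`,
    audioDir: `${dir}/audio`,
  };
}

console.log(takeLayout("vt-001"));
```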
Tool Contract
This skill expects these tool adapters:
- design_voice
- clone_voice_take
clone_voice_take takes a referenceAudioPath.
Future extension:
- capture_voice_identity
See references/tool-contracts.md.
Core Scripts
- scripts/design_voice.mjs
- scripts/clone_voice_take.mjs
These scripts take normalized request JSON files and write:
- request.json
- response.json
- manifest.json
- review.json
- downloaded audio under audio/
Current Provider Direction
First likely provider path:
- hosted voice design capability
Use it for initial voice design or first-pass takes.
Also relevant:
- hosted voice clone capability
Use voice clone when:
- you already have approved reference audio for a persona
- later scripts need new text but should preserve the same timbre and speaking style
- you can provide the reference transcript for better matching
This fits the future requirement of "script changes, persona voice stays stable" better than voice-design alone.
Read the provider notes before implementing:
- references/hosted-tts-voice-design.md
- references/hosted-tts-voice-clone.md
Future provider path:
- a second model that preserves an approved voice timbre while reading new text
That future step should not change the outer workflow. It should only swap the tool adapter or voice identity backend.
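The adapter swap can be illustrated with an outer workflow that depends only on a small adapter interface. This is a sketch under assumptions: the `generateTake` method, the `runTake` function, and both stub adapters are hypothetical and call no real provider.

```javascript
// Outer workflow: unchanged no matter which adapter is plugged in.
// Assumes each adapter exposes a name and a generateTake(request) method.
function runTake(adapter, request) {
  const response = adapter.generateTake(request);
  return { request, response, adapterName: adapter.name };
}

// Stub adapters standing in for hosted capabilities; they only fabricate a path.
const designAdapter = {
  name: "design_voice",
  generateTake: (req) => ({ audioPath: `audio/${req.scriptId}-design.wav` }),
};
const cloneAdapter = {
  name: "clone_voice_take",
  generateTake: (req) => ({ audioPath: `audio/${req.scriptId}-clone.wav` }),
};

const takeRequest = { scriptId: "s1", text: "Same workflow, different backend." };
console.log(runTake(designAdapter, takeRequest).response.audioPath);
console.log(runTake(cloneAdapter, takeRequest).response.audioPath);
```

Because `runTake` never inspects which backend it received, a future timbre-preserving model would only need a new adapter object.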
Review Rule
Before generating a take, verify:
- persona registry exists
- voice baseline exists
- script text is finalized enough for review
- output path is explicit
After generating a take, review:
- realism
- persona fit
- pacing
- whether it sounds too much like an ad
- whether it is reusable across many scripts
When reviewing cloned voice output, also check:
- how well it preserves the target timbre
- whether accent and speaking style drift from the reference
- whether the reference audio quality is limiting the result
Example Commands
Design an initial persona-aligned voice:
```bash
node "${CLAUDE_SKILL_DIR}/scripts/design_voice.mjs" \
  --request /path/to/request.json
```
Generate a new take from approved reference audio:
```bash
node "${CLAUDE_SKILL_DIR}/scripts/clone_voice_take.mjs" \
  --request /path/to/request.json
```
Failure Mode
Stop and state the gap if:
- no persona registry exists
- no voice baseline exists
- the script is still too unstable
- the request does not specify whether this is voice design or a script-specific take
Do not solve missing voice strategy by randomly changing the TTS description.
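The stop conditions above could be checked before any generation call. This is a minimal sketch: the function name `findGaps`, the context shape, and the `mode` values `"voice-design"` and `"script-take"` are assumptions, not part of the skill's contract.

```javascript
// Hypothetical pre-flight check: returns a list of gaps; an empty list means proceed.
function findGaps(ctx) {
  const gaps = [];
  if (!ctx.personaRegistry) {
    gaps.push("no persona registry exists");
  } else if (!ctx.personaRegistry.voiceBaseline) {
    // Only reported when a registry exists but lacks the baseline.
    gaps.push("no voice baseline exists");
  }
  if (!ctx.scriptStable) {
    gaps.push("the script is still too unstable");
  }
  if (ctx.mode !== "voice-design" && ctx.mode !== "script-take") {
    gaps.push("the request does not specify voice design or a script-specific take");
  }
  return gaps;
}

const gaps = findGaps({
  personaRegistry: { voiceBaseline: null },
  scriptStable: true,
  mode: "script-take",
});
if (gaps.length > 0) {
  // Stop and state the gap instead of tweaking the TTS description.
  console.log(`Stopping: ${gaps.join("; ")}`);
}
```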