Generate Chinese / Japanese speech with StepFun's stepaudio-2.5-tts, a contextual TTS model that replaces step-tts-2's `voice_label` with a natural-language `instruction` (≤200 chars) plus inline `()` parentheses for intra-sentence prosody. Use when the user wants emotional or prosodic control over voice synthesis (whisper, pause, stress, mid-sentence mood pivot), batch-generates game / app voice lines, migrates from `step-tts-2` (the `voice_label → instruction` breaking change), or hits StepFun's stricter 2.5-era censorship (死 "death", 消失 "disappear", political terms). Triggers on 阶跃 TTS, StepAudio 合成, 语音合成, 配音, 文本转语音, TTS 升级, 迁移 step-tts-2. For transcription with the sibling stepaudio-2.5-asr model, use the stepfun-asr skill instead.
```bash
npx skill4agent add daymade/claude-code-skills stepfun-tts
```

`stepaudio-2.5-tts`

Companion: for transcription with `stepaudio-2.5-asr` (the sibling model), use the `stepfun-asr` skill; they share an API key but live on different endpoints with different body shapes.

Key identifiers: `stepaudio-2.5-tts`, `voice_label`, `instruction`, `()`, `censorship_block`. Migration notes: `references/migration_from_v2.md`.

API key: `$STEPFUN_API_KEY` or `${CLAUDE_PLUGIN_DATA}/config.json`. To create the config file:

```bash
mkdir -p "${CLAUDE_PLUGIN_DATA}" && cat > "${CLAUDE_PLUGIN_DATA}/config.json" <<EOF
{"api_key": "<paste key here>"}
EOF
```

| User wants... | Script | Key detail |
|---|---|---|
| Synthesize 1–500 char Chinese with emotion | | Use |
| Synthesize long text (500–1000 char) | | 1000 char is the hard cap; split at semantic boundaries above that |
| Batch-generate game/app voice lines | | Handle |
| A/B compare two TTS models | | Compares duration/size across two directories |
| Migrate from `step-tts-2` | see `references/migration_from_v2.md` | |
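
The table's advice to split long text at semantic boundaries can be sketched as below. This is a minimal illustration, not the skill's own script: `split_for_tts` is a hypothetical helper, and treating sentence-ending punctuation as the "semantic boundary" is an assumption.

```python
import re

MAX_CHARS = 1000  # hard cap stated in the table above

# Assumption: a "semantic boundary" is the position right after
# sentence-ending punctuation (Chinese or ASCII) or a newline.
_SENTENCE_END = re.compile(r"(?<=[。!?!?;;\n])")


def split_for_tts(text: str, max_chars: int = MAX_CHARS) -> list[str]:
    """Greedily pack whole sentences into chunks of at most max_chars."""
    sentences = [s for s in _SENTENCE_END.split(text) if s]
    chunks: list[str] = []
    current = ""
    for sentence in sentences:
        # A single sentence longer than the cap offers no boundary,
        # so hard-split that sentence only.
        while len(sentence) > max_chars:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(sentence[:max_chars])
            sentence = sentence[max_chars:]
        # Start a new chunk when the next sentence would overflow.
        if current and len(current) + len(sentence) > max_chars:
            chunks.append(current)
            current = ""
        current += sentence
    if current:
        chunks.append(current)
    return chunks
```

Each chunk can then be synthesized separately and the audio concatenated; the chunks rejoin to the original text, so nothing is dropped.
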
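
```bash
python3 scripts/tts_generate.py --text "你好" --out /tmp/hello.mp3 --instruction "温暖的希望感"
```

Here `你好` is "hello" and the instruction reads "a warm sense of hope".

Migrating from `step-tts-2` to `stepaudio-2.5-tts`: see `references/migration_from_v2.md` and `INSTRUCTION_MAP`.

With `stepaudio-2.5-tts`, `instruction` sets the overall delivery, and `()` cues inside `input` steer prosody within a sentence. Examples:

- `instruction: "克制的悲伤,语气低沉柔弱,像快要消失一样"` (restrained sadness, voice low and frail, as if about to fade away)
- `input: "(试探着问)你好吗?(开心地)太好了!(突然沉下来)不过...我快要消失了。"` ((asking tentatively) How are you? (happily) Great! (suddenly somber) But... I'm about to disappear.)
- `instruction: "活泼俏皮,像是在撒娇,带点嘴硬"` (lively and playful, as if acting coy, a little stubborn)
- `instruction: "耳语声,气声很重,几乎听不清"` (whispering, very breathy, barely audible)
- `input: "你好(停顿一下)我是蕾格(轻声)今天(加重)的天气真不错。"` (Hello (pause) I'm 蕾格 (softly) today's (stressed) weather is really nice.)

`stepaudio-2.5-tts` and `voice_label`: the error text is `voice_label is not supported for v2 models`.

| Error response | Actual cause | Fix |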
|---|---|---|
| `voice_label is not supported for v2 models` | Sent `voice_label` | Remove `voice_label` |
| `censorship_block` | Sensitive word (死 / 消失 / etc.) | Rewrite the phrase OR fall back to |
| Silent audio truncation (input > 1000 chars) | Hard cap exceeded | Split at semantic boundaries; don't truncate mid-sentence |
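
The three failure modes in the table can be caught before a request is sent. The sketch below is a hedged illustration rather than part of the skill: `preflight` is a hypothetical helper, the `model` field and the shape of `voice_label` are assumptions, and the word list holds only the two terms this document names. Checking `instruction` as well as `input` for those terms is a conservative assumption.

```python
MAX_INPUT_CHARS = 1000        # hard cap from the table above
MAX_INSTRUCTION_CHARS = 200   # instruction limit stated in the description
# Illustrative only: the two censorship triggers named in this document.
SENSITIVE_WORDS = ("死", "消失")


def preflight(body: dict) -> tuple[dict, list[str]]:
    """Return a cleaned copy of a request body plus readable warnings."""
    cleaned = dict(body)  # shallow copy; the caller's dict is not mutated
    warnings: list[str] = []

    # voice_label is the field the 2.5 model no longer accepts.
    if "voice_label" in cleaned:
        cleaned.pop("voice_label")
        warnings.append("removed voice_label; describe the style in instruction")

    instruction = cleaned.get("instruction", "")
    if len(instruction) > MAX_INSTRUCTION_CHARS:
        warnings.append(f"instruction is {len(instruction)} chars (limit 200)")

    text = cleaned.get("input", "")
    if len(text) > MAX_INPUT_CHARS:
        warnings.append(f"input is {len(text)} chars (limit 1000); split it")

    hits = [w for w in SENSITIVE_WORDS if w in text or w in instruction]
    if hits:
        warnings.append("possible censorship trigger: " + ", ".join(hits))

    return cleaned, warnings
```

For example, a body that still carries `voice_label` and contains 消失 in `input` comes back without `voice_label` and with two warnings, so the caller can rephrase before spending an API call.
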

- `references/known_issues.md`
- `references/api_reference.md`: `/v1/audio/speech`
- `references/migration_from_v2.md`: `stepaudio-2.5-tts`

Related identifiers: `references/known_issues.md`, `voice/zh_v25/`, `censorship_block`, `instruction`, `()`, `voice_label`, `stepaudio-2.5-tts`.