music
When to Use
- User wants to generate original AI music from a prompt
- User wants to create a cover from reference audio
- User says "音乐", "music", "生成音乐", "generate music", "翻唱", "cover", "作曲", "compose", "create a song", or "做一首歌"
When NOT to Use
- User wants text-to-speech reading (use /speech)
- User wants a podcast discussion (use /podcast)
- User wants an explainer video with narration (use /explainer)
- User wants to transcribe audio to text (use /asr)
Purpose
Generate original AI music from text prompts, or create cover versions from reference audio. Two modes:
- Generate (original): Create a new song from a text prompt, with optional style, title, and instrumental-only options.
- Cover: Transform a reference audio file into a new version, with optional style modifications.
Hard Constraints
- Always read config following shared/config-pattern.md before any interaction
- Follow shared/cli-patterns.md for execution modes, error handling, and interaction patterns
- Always follow shared/cli-authentication.md for auth checks
- Never save files to ~/Downloads/ or .listenhub/; save artifacts to the current working directory with friendly topic-based names (see shared/config-pattern.md § Artifact Naming)
- No speakers involved: music generation does not use speaker selection
- Audio file constraints for cover mode: mp3, wav, flac, m4a, ogg, aac; max 20MB
- Long timeout: 600s default. Use run_in_background: true with timeout: 660000
Step -1: CLI Auth Check
Follow shared/cli-authentication.md. If the CLI is not installed or the user is not logged in, auto-install and auto-login; never ask the user to run commands manually.
Step 0: Config Setup
Follow shared/config-pattern.md Step 0 (Zero-Question Boot).
If the file doesn't exist, silently create it with defaults and proceed:
bash
mkdir -p ".listenhub/music"
echo '{"outputMode":"download","language":null}' > ".listenhub/music/config.json"
CONFIG_PATH=".listenhub/music/config.json"
CONFIG=$(cat "$CONFIG_PATH")
Do NOT ask any setup questions. Proceed directly to the Interaction Flow.
If the file exists, read the config silently and proceed:
bash
CONFIG_PATH=".listenhub/music/config.json"
[ ! -f "$CONFIG_PATH" ] && CONFIG_PATH="$HOME/.listenhub/music/config.json"
CONFIG=$(cat "$CONFIG_PATH")
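Later steps read single settings such as outputMode out of CONFIG. A minimal sketch of that lookup, assuming jq is available (the document already relies on it). The helper name config_get is hypothetical and is not defined by the CLI or by the shared pattern files:

```shell
# Hypothetical helper (not part of the CLI): print one key from the JSON
# held in $CONFIG, or the given default when the key is missing or null.
config_get() {
  printf '%s' "$CONFIG" | jq -r --arg k "$1" --arg d "$2" '.[$k] // $d'
}

# Example usage (assumes CONFIG was loaded as shown above):
#   OUTPUT_MODE=$(config_get outputMode download)
#   LANGUAGE=$(config_get language "")
```

Note that jq's `//` operator also falls back on a literal false, which is harmless here because both settings are strings or null.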
Setup Flow (user-initiated reconfigure only)
Only run when the user explicitly asks to reconfigure. Display current settings:
当前配置 (music):
输出方式:{inline / download / both}
语言偏好:{zh / en / 未设置}
Then ask:
- outputMode: Follow shared/output-mode.md § Setup Flow Question.
- Language (optional): "默认语言?"
  - "中文 (zh)"
  - "English (en)"
  - "每次手动选择" → keep null
After collecting answers, save immediately:
bash
NEW_CONFIG=$(echo "$CONFIG" | jq --arg m "$OUTPUT_MODE" '. + {"outputMode": $m}')
if [ "$LANGUAGE" != "null" ]; then
NEW_CONFIG=$(echo "$NEW_CONFIG" | jq --arg lang "$LANGUAGE" '. + {"language": $lang}')
fi
echo "$NEW_CONFIG" > "$CONFIG_PATH"
CONFIG=$(cat "$CONFIG_PATH")
Interaction Flow
Step 1: Mode
Ask the user which mode they want, unless the intent is already clear from their message (e.g., "翻唱" or "cover" implies cover mode; "作曲" or "compose" implies generate mode).
Question: "选择音乐生成模式:"
Options:
- "原创 (Generate)": 从文字描述生成全新歌曲
- "翻唱 (Cover)": 基于参考音频生成新版本
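The intent shortcut in this step can be sketched as a keyword check. This is only an illustration: the function name detect_music_mode is hypothetical, it covers just the four keywords named above, and plain substring matching will misfire on words such as "discover", so anything ambiguous should still fall back to asking the question.

```shell
# Hypothetical helper: prints "cover", "generate", or "unknown" for a message.
detect_music_mode() {
  local msg
  # Lowercase ASCII letters only; other bytes (e.g. Chinese text) pass through.
  msg=$(printf '%s' "$1" | LC_ALL=C tr 'A-Z' 'a-z')
  case "$msg" in
    *翻唱*|*cover*)   echo "cover" ;;
    *作曲*|*compose*) echo "generate" ;;
    *)                echo "unknown" ;;   # intent unclear: ask the mode question
  esac
}
```

A result of "unknown" means the mode question is asked as usual.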
Step 2a: Prompt (generate mode)
If the user chose Generate, ask for the song description:
"请描述你想要的歌曲(主题、情绪、歌词片段等):"
Accept free text. This maps to --prompt.
Step 2b: Reference Audio (cover mode)
If the user chose Cover, ask for the reference audio:
"请提供参考音频文件路径或 URL:"
Accept a local file path or URL. This maps to --audio.
Validate the input:
- If a local path: verify the file exists and check the extension is one of: mp3, wav, flac, m4a, ogg, aac
- If a URL: accept as-is (the CLI will validate)
- Check file size does not exceed 20 MB for local files:
bash
# Try BSD/macOS stat first, then fall back to GNU/Linux stat
FILE_SIZE=$(stat -f%z "{path}" 2>/dev/null || stat -c%s "{path}" 2>/dev/null)
# Treat an unreadable size as 0 so the comparison never sees an empty string
if [ "${FILE_SIZE:-0}" -gt 20971520 ]; then
  echo "File exceeds 20 MB limit"
fi
If validation fails, inform the user and re-ask.
Optionally, the user may also provide a prompt to guide the cover style.
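The checks in this step can be combined into one function. A minimal sketch under stated assumptions: the name validate_cover_audio is hypothetical, a URL is recognized simply by an http:// or https:// prefix, and the extension is compared case-insensitively, none of which the source specifies.

```shell
# Hypothetical helper: returns 0 when the reference audio passes the checks,
# otherwise prints a reason to stderr and returns 1.
validate_cover_audio() {
  local input="$1" ext size
  # URLs are accepted as-is; the CLI validates them.
  case "$input" in
    http://*|https://*) return 0 ;;
  esac
  if [ ! -f "$input" ]; then
    echo "File not found: $input" >&2
    return 1
  fi
  # Compare the extension case-insensitively (assumption).
  ext=$(printf '%s' "${input##*.}" | LC_ALL=C tr 'A-Z' 'a-z')
  case "$ext" in
    mp3|wav|flac|m4a|ogg|aac) ;;
    *) echo "Unsupported format: $ext" >&2; return 1 ;;
  esac
  # BSD/macOS stat first, then GNU/Linux stat.
  size=$(stat -f%z "$input" 2>/dev/null || stat -c%s "$input" 2>/dev/null)
  if [ "${size:-0}" -gt 20971520 ]; then
    echo "File exceeds 20 MB limit" >&2
    return 1
  fi
}
```

On a non-zero return, tell the user the reason and ask for the audio again.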
Step 3: Style (optional)
Ask for an optional style descriptor:
"指定音乐风格?(如 pop、rock、jazz、电子、古风等,留空则由 AI 自动选择)"
Accept free text or empty. This maps to --style.
Step 4: Title (optional)
Ask for an optional title:
"歌曲标题?(留空则自动生成)"
Accept free text or empty. This maps to --title.
Step 5: Instrumental
Question: "是否纯音乐(无人声)?"
Options:
- "否,带人声(默认)"
- "是,纯音乐"
Default is "no" (with vocals). If the user selects "是", add the --instrumental flag.
Step 6: Confirm & Generate
Summarize all choices:
Generate mode:
准备生成音乐:
模式:原创 (Generate)
描述:{prompt}
风格:{style / 自动}
标题:{title / 自动}
人声:{带人声 / 纯音乐}
确认?
Cover mode:
准备生成音乐:
模式:翻唱 (Cover)
参考音频:{path-or-url}
描述:{prompt / 无}
风格:{style / 自动}
标题:{title / 自动}
人声:{带人声 / 纯音乐}
确认?
Wait for explicit confirmation before running any CLI command.
Workflow
- Submit (background): Run the CLI command with run_in_background: true and timeout: 660000:
  Generate mode:
  bash
  listenhub music generate \
    --prompt "{prompt}" \
    --style "{style}" \
    --title "{title}" \
    --instrumental \
    --json
  Cover mode:
  bash
  listenhub music cover \
    --audio "{path-or-url}" \
    --prompt "{prompt}" \
    --style "{style}" \
    --title "{title}" \
    --instrumental \
    --json
  Flag notes:
  - --prompt: text description of the music (required for generate, optional for cover)
  - --audio: reference audio file path or URL (cover mode only, required)
  - --style: optional style/genre hint; omit if not provided
  - --title: optional track title; omit if not provided
  - --instrumental: add this flag for instrumental-only (no vocals); omit if not selected
  - Omit --prompt in cover mode if not provided
  The CLI handles polling internally. Music generation takes up to 10 minutes.
- Tell the user the task is submitted and that they will be notified when it finishes.
- When notified of completion, present the result:
  Parse the CLI JSON output for key fields:
  bash
  AUDIO_URL=$(echo "$RESULT" | jq -r '.audioUrl')
  TITLE=$(echo "$RESULT" | jq -r '.title // "Untitled"')
  DURATION=$(echo "$RESULT" | jq -r '.duration // empty')
  CREDITS=$(echo "$RESULT" | jq -r '.credits // empty')
  Read OUTPUT_MODE from config. Follow shared/output-mode.md for behavior.
  inline or both: Display audio URL as a clickable link.
  音乐已生成!
  标题:{title}
  在线收听:{audioUrl}
  时长:{duration}s
  消耗积分:{credits}
  download or both: Also download the file. Generate a slug from the title following shared/config-pattern.md § Artifact Naming.
  bash
  SLUG="{slug}"  # e.g. "summer-breeze"
  NAME="${SLUG}.mp3"
  # Dedup: if file exists, append -2, -3, etc.
  BASE="${NAME%.*}"; EXT="${NAME##*.}"; i=2
  while [ -e "$NAME" ]; do NAME="${BASE}-${i}.${EXT}"; i=$((i+1)); done
  curl -sS -o "$NAME" "{audioUrl}"
  Present:
  已保存到当前目录: {NAME}
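Because optional flags must be omitted rather than passed empty, the submit command is easiest to assemble conditionally. A minimal sketch, assuming the answers from Steps 1 to 5 are already in hand. The function name run_music, its argument order, and the DRY_RUN switch are hypothetical; only the flags listed in the flag notes are used.

```shell
# Hypothetical wrapper. Arguments: mode prompt audio style title instrumental
# Pass "" for any optional value the user left empty.
run_music() {
  local mode="$1" prompt="$2" audio="$3" style="$4" title="$5" instrumental="$6"
  # Rebuild the positional parameters as the CLI argument list.
  set -- music "$mode"
  if [ "$mode" = "cover" ]; then set -- "$@" --audio "$audio"; fi
  if [ -n "$prompt" ]; then set -- "$@" --prompt "$prompt"; fi
  if [ -n "$style" ]; then set -- "$@" --style "$style"; fi
  if [ -n "$title" ]; then set -- "$@" --title "$title"; fi
  if [ "$instrumental" = "yes" ]; then set -- "$@" --instrumental; fi
  set -- "$@" --json
  if [ -n "${DRY_RUN:-}" ]; then
    printf '%s\n' "$@"   # DRY_RUN set: print one argument per line, run nothing
  else
    listenhub "$@"
  fi
}
```

For example, `run_music cover "" demo.mp3 jazz "" no` would run `listenhub music cover --audio demo.mp3 --style jazz --json`, leaving out --prompt, --title, and --instrumental.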
After Successful Generation
Update config with the language used this session if the user explicitly specified one:
bash
if [ -n "$LANGUAGE" ]; then
NEW_CONFIG=$(echo "$CONFIG" | jq --arg lang "$LANGUAGE" '. + {"language": $lang}')
echo "$NEW_CONFIG" > "$CONFIG_PATH"
fi
Estimated times:
- Music generation: 5-10 minutes
Resources
- CLI authentication: shared/cli-authentication.md
- CLI patterns: shared/cli-patterns.md
- Config pattern: shared/config-pattern.md
- Output mode: shared/output-mode.md
Composability
- Invokes: nothing
- Invoked by: content-planner (Phase 3)
Examples
Generate original:
"帮我做一首关于夏天海边的歌"
- Detect: generate mode ("做一首歌")
- Read config (first run: create defaults with outputMode: "download")
- Infer: mode = generate, prompt = "关于夏天海边的歌"
- Ask: style? title? instrumental?
- Confirm summary → user confirms
bash
listenhub music generate \
--prompt "关于夏天海边的歌" \
--json
Wait for the CLI to return the result, then download {slug}.mp3 to cwd.
Cover from file:
"用这个音频翻唱一下 demo.mp3,jazz 风格"
- Detect: cover mode ("翻唱")
- Validate: demo.mp3 exists, is a supported format, under 20 MB
- Infer: style = "jazz" from user input
- Ask: title? instrumental?
- Confirm summary → user confirms
bash
listenhub music cover \
--audio "demo.mp3" \
--style "jazz" \
--json
Wait for the CLI to return the result, then download {slug}.mp3 to cwd.
Generate instrumental:
"Create an instrumental electronic track for a game intro"
- Detect: generate mode ("Create ... track")
- Infer: style = "electronic", instrumental = yes
- Ask: title?
- Confirm summary → user confirms
bash
listenhub music generate \
--prompt "instrumental electronic track for a game intro" \
--style "electronic" \
--instrumental \
--json
Wait for the CLI to return the result, then download {slug}.mp3 to cwd.