${CLAUDE_PLUGIN_DATA}/config.json
cat "${CLAUDE_PLUGIN_DATA}/config.json" 2>/dev/null
First-time setup for ASR transcription.
I need to know where your ASR service is running so I can send audio to it.
RECOMMENDATION: Use the defaults below if you have Qwen3-ASR on a 4090 via Tailscale.
Q1: ASR Endpoint URL?
A) http://workstation-4090-wsl:8002/v1/audio/transcriptions (Default — Qwen3-ASR vLLM via Tailscale)
B) http://localhost:8002/v1/audio/transcriptions (Local machine)
C) Let me enter a custom URL
Q2: Does your network have an HTTP proxy that might intercept LAN/Tailscale traffic?
A) Yes: add --noproxy to bypass it (recommended if you use Shadowrocket, Clash, or a corporate proxy)
B) No: direct connection is fine
mkdir -p "${CLAUDE_PLUGIN_DATA}"
python3 -c "
import json
config = {
    'endpoint': 'USER_PROVIDED_ENDPOINT',
    'model': 'USER_PROVIDED_MODEL_OR_DEFAULT',
    'noproxy': True,  # or False based on user answer
    'max_timeout': 900
}
with open('${CLAUDE_PLUGIN_DATA}/config.json', 'w') as f:
    json.dump(config, f, indent=2)
print('Config saved.')
"
python3 -c "
import json, subprocess, sys
with open('${CLAUDE_PLUGIN_DATA}/config.json') as f:
    cfg = json.load(f)
base = cfg['endpoint'].rsplit('/audio/', 1)[0]
noproxy = ['--noproxy', '*'] if cfg.get('noproxy', True) else []
result = subprocess.run(
    ['curl', '-s', '--max-time', '10'] + noproxy + [f'{base}/models'],
    capture_output=True, text=True
)
if result.returncode != 0 or not result.stdout.strip():
    print('HEALTH CHECK FAILED', file=sys.stderr)
    print(f'Endpoint: {base}/models', file=sys.stderr)
    print(f'stdout: {result.stdout[:200]}', file=sys.stderr)
    print(f'stderr: {result.stderr[:200]}', file=sys.stderr)
    sys.exit(1)
else:
    print(f'Service healthy: {base}')
    print(f'Model: {cfg[\"model\"]}')
"
ASR service at [endpoint] is not responding.
Options:
A) Diagnose: check network, Tailscale, and service status step by step
B) Reconfigure: the endpoint URL might be wrong, let me re-enter it
C) Try anyway: send the transcription request and see what happens
D) Abort: I'll fix the service manually and come back later
ping -c 1 HOST
tailscale status | grep HOST
tailscale ssh USER@HOST "curl -s localhost:PORT/v1/models"
--noproxy '*'
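The HOST and PORT placeholders in the diagnostic commands, and the /models URL probed by the health check, can all be derived from the configured endpoint. The following is a minimal sketch of that derivation, not part of the plugin itself: the function names endpoint_parts and diagnostic_commands are hypothetical, and the fallback to port 80 or 443 when the URL names no port is an assumption.

```python
from urllib.parse import urlsplit


def endpoint_parts(endpoint):
    """Split a transcription endpoint into the pieces the checks need."""
    # Same rule as the health check: drop everything from '/audio/' onward.
    base = endpoint.rsplit("/audio/", 1)[0]
    parts = urlsplit(endpoint)
    # Assumption: use the scheme's default port when the URL does not name one.
    port = parts.port or (443 if parts.scheme == "https" else 80)
    return {"models_url": f"{base}/models", "host": parts.hostname, "port": port}


def diagnostic_commands(endpoint, user="USER"):
    """Fill HOST and PORT into the diagnostic commands; USER stays a placeholder."""
    p = endpoint_parts(endpoint)
    host, port = p["host"], p["port"]
    return [
        f"ping -c 1 {host}",
        f"tailscale status | grep {host}",
        f'tailscale ssh {user}@{host} "curl -s localhost:{port}/v1/models"',
    ]


if __name__ == "__main__":
    for cmd in diagnostic_commands("http://workstation-4090-wsl:8002/v1/audio/transcriptions"):
        print(cmd)
```

Printing the commands instead of running them keeps the sketch free of side effects; the user can still review each step before executing it.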
ffmpeg -i INPUT_VIDEO -vn -acodec libmp3lame -q:a 4 -ar 16000 -ac 1 OUTPUT.mp3 -y
ffprobe -v error -show_entries format=duration -of default=noprint_wrappers=1:nokey=1 INPUT_FILE
python3 -c "
import json, subprocess, sys, os, tempfile
with open('${CLAUDE_PLUGIN_DATA}/config.json') as f:
    cfg = json.load(f)
noproxy = ['--noproxy', '*'] if cfg.get('noproxy', True) else []
timeout = str(cfg.get('max_timeout', 900))
audio_file = 'AUDIO_FILE_PATH'  # replace with actual path
# mkstemp creates the file atomically; tempfile.mktemp is deprecated and racy
fd, output_json = tempfile.mkstemp(suffix='.json', prefix='asr_')
os.close(fd)
try:
    result = subprocess.run(
        ['curl', '-sS', '--max-time', timeout] + noproxy + [
            cfg['endpoint'],
            '-F', f'file=@{audio_file}',
            '-F', f'model={cfg[\"model\"]}',
            '-o', output_json
        ], capture_output=True, text=True
    )
    # Stop early if curl itself failed (timeout, connection refused)
    if result.returncode != 0:
        print(f'ERROR: curl exited with {result.returncode}: {result.stderr[:200]}', file=sys.stderr)
        sys.exit(1)
    with open(output_json) as f:
        data = json.load(f)
finally:
    os.unlink(output_json)  # clean up even when the request fails
if 'text' not in data:
    print(f'ERROR: {json.dumps(data)[:300]}', file=sys.stderr)
    sys.exit(1)
text = data['text']
duration = data.get('usage', {}).get('seconds', 0)
print(f'Transcribed: {len(text)} chars, {duration}s audio', file=sys.stderr)
print(text)
" > OUTPUT.txt
Transcription may have an issue:
- Expected: ~[N] chars for [M] minutes of audio
- Got: [actual chars] chars
- Preview: "[first 100 chars...]"
Options:
A) Save as-is: the output looks fine to me
B) Retry with fallback: split into chunks and merge (handles long audio / OOM)
C) Reconfigure: try a different model or endpoint
D) Abort: something is wrong with the service
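The "Expected: ~[N] chars for [M] minutes" figure in the prompt above implies comparing the transcript length with the duration reported by ffprobe. The source does not state the rate or the threshold, so the sketch below treats both as assumptions: chars_per_min=250 and tolerance=0.5 are placeholder values to tune per language and speaker, and the function names are hypothetical. It only flags output that is too short.

```python
def parse_duration(ffprobe_stdout):
    """Convert the single-number ffprobe output (seconds) to a float."""
    return float(ffprobe_stdout.strip())


def check_transcript(text, duration_s, chars_per_min=250.0, tolerance=0.5):
    """Compare transcript length against a rough per-minute expectation."""
    minutes = duration_s / 60.0
    expected = round(minutes * chars_per_min)
    # Flag output shorter than `tolerance` times the expected length.
    suspicious = len(text) < expected * tolerance
    return {
        "minutes": round(minutes, 1),
        "expected_chars": expected,
        "actual_chars": len(text),
        "preview": text[:100],
        "suspicious": suspicious,
    }


if __name__ == "__main__":
    report = check_transcript("hello world", parse_duration("600.000000\n"))
    if report["suspicious"]:
        print("Transcription may have an issue:")
        print(f"- Expected: ~{report['expected_chars']} chars for {report['minutes']} minutes of audio")
        print(f"- Got: {report['actual_chars']} chars")
        print(f"- Preview: \"{report['preview']}\"")
```

A check for unusually long output (for example, repeated phrases) could be added the same way, but is left out here to keep the sketch short.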
python3 "${CLAUDE_PLUGIN_ROOT}/scripts/overlap_merge_transcribe.py" \
  --config "${CLAUDE_PLUGIN_DATA}/config.json" \
  INPUT_AUDIO OUTPUT.txt
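The source of overlap_merge_transcribe.py is not shown here, so its actual merge strategy is unknown. As an illustration of the general idea its name suggests (transcribe overlapping chunks, then drop the text repeated at each seam), here is a rough sketch. The function names, the exact-match rule, and the min_overlap value are all assumptions; real ASR output at chunk boundaries often differs slightly, so a production merge would likely need fuzzy matching.

```python
def merge_overlap(left, right, min_overlap=4):
    """Join two transcripts, dropping text repeated at the seam."""
    max_len = min(len(left), len(right))
    # Try the longest candidate first so the largest shared span wins.
    for size in range(max_len, min_overlap - 1, -1):
        if left.endswith(right[:size]):
            return left + right[size:]
    # No shared span found: keep both sides, separated by a space.
    return left + " " + right


def merge_chunks(chunks, min_overlap=4):
    """Fold a list of chunk transcripts into one text, left to right."""
    merged = ""
    for chunk in chunks:
        merged = merge_overlap(merged, chunk, min_overlap) if merged else chunk
    return merged


if __name__ == "__main__":
    print(merge_chunks(["hello world", "world again", "again and done"]))
```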
rm "${CLAUDE_PLUGIN_DATA}/config.json"
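Once the config file is removed, the next run has to notice that it is missing and repeat first-time setup. A minimal sketch of such a loader follows, assuming the key names written by the setup snippet (endpoint, model, noproxy, max_timeout) and the same fallbacks the snippets pass to cfg.get(). The name load_config and the choice to return None for a missing file are hypothetical.

```python
import json
import os

# Fallbacks mirror the cfg.get() defaults used in the snippets above.
DEFAULTS = {"noproxy": True, "max_timeout": 900}
REQUIRED = ("endpoint", "model")


def load_config(path):
    """Return the config dict, or None when first-time setup must run."""
    if not os.path.exists(path):
        return None
    with open(path) as f:
        cfg = json.load(f)
    # Reject configs that lack the keys every request depends on.
    missing = [key for key in REQUIRED if not cfg.get(key)]
    if missing:
        raise ValueError(f"config is missing: {', '.join(missing)}")
    # Values from the file override the defaults.
    return {**DEFAULTS, **cfg}


if __name__ == "__main__":
    data_dir = os.environ.get("CLAUDE_PLUGIN_DATA", ".")
    cfg = load_config(os.path.join(data_dir, "config.json"))
    print("Run first-time setup." if cfg is None else f"Endpoint: {cfg['endpoint']}")
```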