audio-transcription

Audio Transcription

Follow shared release-shell rules in:

- `postplus-shared` release-shell rules

Use this skill when the input is audio and the main job is:

- transcript generation
- subtitle-ready timing
- rough speech search
- multilingual audio transcription

This skill is not for video semantic understanding.

Hosted Endpoint

First-version hosted transcription endpoints:

- hosted transcription capability
- `transcription-whisper`
- `transcription-whisper-turbo`

Use `transcription-whisper` by default when subtitle quality matters.

Use `transcription-whisper-turbo` when:

- the user wants a cheaper rough pass
- timestamps are not the primary requirement
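
The selection rule above can be captured in a small helper. This is a minimal sketch, not the skill's actual code: the function name `chooseEndpoint` and the option names `cheapRoughPass` and `timestampsPrimary` are hypothetical, and it assumes both listed conditions must hold before the turbo endpoint is chosen, which the text does not state explicitly.

```javascript
// Hypothetical helper: the option names are illustrative, not part of the skill.
function chooseEndpoint({ cheapRoughPass = false, timestampsPrimary = true } = {}) {
  // Assumption: both listed conditions must hold before picking turbo.
  if (cheapRoughPass && !timestampsPrimary) {
    return "transcription-whisper-turbo";
  }
  // Default when subtitle quality matters.
  return "transcription-whisper";
}

console.log(chooseEndpoint());
// prints "transcription-whisper"
console.log(chooseEndpoint({ cheapRoughPass: true, timestampsPrimary: false }));
// prints "transcription-whisper-turbo"
```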

Output Contract

Persist:

- `request.json`
- `response.json`
- `manifest.json`
- downloaded provider outputs under `outputs/`

Do not rely on the provider dashboard as the durable record.
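
One way to write that durable record is sketched below. It is an illustration under assumptions: the function name `persistRun`, its `runDir` argument, and the shape of the three objects are hypothetical, and the contents of `manifest.json` are left to the caller because the text does not define them.

```javascript
// Sketch: write the durable record for one transcription run.
// Assumption: runDir, request, response, and manifest come from the caller.
async function persistRun(runDir, { request, response, manifest }) {
  // Dynamic imports keep the snippet usable from both ESM and CommonJS files.
  const fs = await import("node:fs/promises");
  const path = await import("node:path");

  // outputs/ holds the files downloaded from the provider.
  await fs.mkdir(path.join(runDir, "outputs"), { recursive: true });

  const records = {
    "request.json": request,
    "response.json": response,
    "manifest.json": manifest,
  };
  for (const [name, value] of Object.entries(records)) {
    const body = JSON.stringify(value, null, 2) + "\n";
    await fs.writeFile(path.join(runDir, name), body);
  }
  // Return the paths that were written so the caller can log them.
  return Object.keys(records).map((name) => path.join(runDir, name));
}
```

Writing the files locally right after each call means the record survives even if the provider dashboard later drops or expires the job.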

Poll Behavior

Hosted transcription is asynchronous. The script polls the prediction result URL until status is `completed` or `failed`. Default poll window: 150 attempts × 2 s = 5 minutes.

Short audio clips typically complete in under 30 s. If a job exceeds 5 minutes, retry rather than increasing the timeout further.
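
The polling behavior described above could look roughly like the following. This is a sketch, not the contents of `scripts/poll_transcription.mjs`: `pollUntilDone` and the caller-supplied `fetchStatus` function are hypothetical names, and the only details taken from the text are the two terminal statuses and the 150 × 2 s defaults.

```javascript
// Sketch of the polling loop. fetchStatus is a hypothetical caller-supplied
// async function that resolves to an object with a `status` field.
async function pollUntilDone(fetchStatus, { maxAttempts = 150, intervalMs = 2000 } = {}) {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const result = await fetchStatus();
    if (result.status === "completed" || result.status === "failed") {
      return { ...result, attempts: attempt };
    }
    // Wait one interval before the next attempt.
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
  // 150 attempts x 2 s = 300 s: give up so the caller can retry the job.
  throw new Error(`Polling timed out after ${maxAttempts} attempts`);
}
```

Throwing on timeout, rather than waiting longer, matches the guidance to retry the job instead of increasing the timeout.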

Default Workflow

1. Normalize the transcription request.
2. Submit to hosted Whisper capability.
3. Save raw request and response locally.
4. Poll if the job is asynchronous.
5. Save downloaded transcript artifacts locally.
6. Hand off to `subtitle-packager` if SRT/VTT is needed.
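
The workflow steps can be wired together as in the sketch below. Everything here is illustrative: `runTranscription`, the functions on `steps`, and the `needsSubtitles` flag are hypothetical placeholders, not names from the skill's scripts. It also assumes that a non-terminal status on the submit response is what marks a job as asynchronous, and it leaves failure handling to the caller.

```javascript
// Sketch only: every function on `steps` is a hypothetical placeholder
// supplied by the caller; none of these names come from the skill's scripts.
async function runTranscription(input, steps) {
  const request = steps.normalize(input);          // 1. normalize the request
  const submitted = await steps.submit(request);   // 2. submit to hosted Whisper
  await steps.saveRaw(request, submitted);         // 3. save raw request and response

  // 4. Assumption: a non-terminal status means the job is asynchronous.
  const terminal = ["completed", "failed"].includes(submitted.status);
  const result = terminal ? submitted : await steps.poll(submitted);

  // 5. Save the downloaded transcript artifacts locally.
  const artifacts = await steps.saveArtifacts(result);

  // 6. Hand off only when SRT/VTT output is requested.
  if (input.needsSubtitles) {
    await steps.handOff(artifacts);
  }
  return { request, result, artifacts };
}
```

Passing the steps in as functions keeps the order of operations visible and lets each step be replaced by the real script or by a stub during a dry run.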

Scripts

- `scripts/transcribe_audio.mjs`
- `scripts/poll_transcription.mjs`

Read These References

- `references/tool-contracts.md`

Release-Shell Execution Contract

- keep transcription requests, provider responses, manifests, and downloaded transcript artifacts under `<work-folder>/.postplus/audio-transcription/`
- keep only final user-facing transcript exports outside `.postplus/`
- start with a bounded first pass, usually one source file before larger batches
- if hosted transcription capability is unavailable, unauthorized, or returns a stable network error, stop immediately instead of switching to ad hoc shell glue
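
The rules above can be expressed as three small helpers. This is a sketch under assumptions: the helper names and the failure kinds `"unavailable"`, `"unauthorized"`, and `"network"` are hypothetical labels for the three cases named in the text, and the path is built with `/` separators for simplicity.

```javascript
// Hypothetical failure kinds; the real scripts may classify errors differently.
const STOP_KINDS = new Set(["unavailable", "unauthorized", "network"]);

// Working directory for this skill's requests, responses, and artifacts.
function workDir(workFolder) {
  return `${workFolder}/.postplus/audio-transcription`;
}

// Bounded first pass: take one source file before attempting the whole batch.
function firstPass(sourceFiles, limit = 1) {
  return sourceFiles.slice(0, limit);
}

// Stop immediately instead of falling back to ad hoc shell glue.
function assertCanContinue(failureKind) {
  if (STOP_KINDS.has(failureKind)) {
    throw new Error(`Hosted transcription failed (${failureKind}); stopping.`);
  }
}

console.log(workDir("/tmp/job"));
// prints "/tmp/job/.postplus/audio-transcription"
console.log(firstPass(["a.mp3", "b.mp3", "c.mp3"]));
// prints [ 'a.mp3' ]
```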
