create-sound
Create Sound
Generated from `rules/*.md` by `src/build.mjs`. Do not edit by hand.
Pick a generation path with `pipeline-detect-input`, then walk the matching section.
1. Generation Pipeline
Procedural steps the agent runs end-to-end. Start here when handling any create-sound request.
1.1 Detect input mode and route the request (CRITICAL)
Decide which path to run based on what the user provided.
| Input | Path |
|---|---|
| Prompt only (no audio attachment) | Skip |
| Audio file only | Run all |
| Both prompt and audio | Run |
Detecting audio
Look for attached files matching `*.wav`, `*.mp3`, `*.flac`, `*.ogg`, or any path the user references that resolves to an audio file. A JSON manifest (`*.json` next to a sprite) is also an audio-path signal.
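The routing decision above can be sketched as a small function. This is only an illustration: the function name, the returned labels, and the choice to treat any `*.json` attachment as an audio signal are assumptions, not part of the rules.

```python
from fnmatch import fnmatch

# Patterns listed in "Detecting audio"; a JSON manifest also counts as a signal.
AUDIO_PATTERNS = ("*.wav", "*.mp3", "*.flac", "*.ogg", "*.json")


def detect_input_mode(prompt: str, attachments: list[str]) -> str:
    """Return a hypothetical mode label for the three rows of the routing table."""
    names = [name.lower() for name in attachments]
    has_audio = any(fnmatch(n, p) for n in names for p in AUDIO_PATTERNS)
    has_prompt = bool(prompt.strip())
    if has_prompt and has_audio:
        return "prompt+audio"
    if has_audio:
        return "audio-only"
    if has_prompt:
        return "prompt-only"
    return "none"  # nothing to act on; the caller decides how to respond
```

Lower-casing the attachment names keeps the match case-insensitive on every platform.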
Refinement examples (prompt + audio)
| Prompt qualifier | Refinement on measured definition |
|---|---|
| "warmer" | add |
| "shorter" / "punchier" | clamp |
| "brighter" | drop or raise any lowpass cutoff |
| "with reverb" | append |
| "lower octave" | halve |
Output of this step
Produce an internal note like:
```
Input: prompt + audio
Plan: run interpret-* on out/click.wav, then refine with mood-warm.
```
Then proceed to the next pipeline step.
1.2 Pick a base layer from the prompt's event class (CRITICAL)
Tokenize the prompt and find the strongest event-class signal. Match against the `event-*` rules.
Token map
| Tokens in prompt | Event rule |
|---|---|
| click, tap, key, press, button | |
| tick, scroll, snap, focus | |
| success, complete, win, achievement, level-up, confetti | |
| error, fail, wrong, invalid, delete, destroy | |
| modal, dialog, popup, drawer, sheet, sidebar, dropdown, menu | |
| swoosh, slide, transition, page, tab | |
| notification, alert, ding, bell, mention, badge | |
| toggle, switch, on, off | |
Direction tokens (open vs close)
- "open", "appear", "in", "show", "expand", "confirm" -> ascending pitch.
- "close", "dismiss", "out", "hide", "collapse", "cancel" -> descending pitch.
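The direction-token lists above can be applied with a simple scan. This is a minimal sketch: the function name is hypothetical, and letting the last matching token win is an assumption the rules do not state.

```python
import re

OPEN_TOKENS = {"open", "appear", "in", "show", "expand", "confirm"}
CLOSE_TOKENS = {"close", "dismiss", "out", "hide", "collapse", "cancel"}


def pitch_direction(prompt: str):
    """Return "ascending", "descending", or None when no direction token is present."""
    direction = None
    for token in re.findall(r"[a-z]+", prompt.lower()):
        if token in OPEN_TOKENS:
            direction = "ascending"
        elif token in CLOSE_TOKENS:
            direction = "descending"
    return direction
```

Exact-word matching means inflected forms such as "opens" are not recognized; a stemmer would be needed for that.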
Output
A starting `SoundDefinition` literal copied from the chosen event rule's `example`. The next step (`pipeline-apply-mood`) will mutate it.
If no event class fires confidently, default to `event-click` and let mood adjectives do the work.
1.3 Apply mood adjectives onto the base layer (HIGH)
After `pipeline-pick-base-layer` produces a starting `SoundDefinition`, scan the prompt for adjective tokens and apply each `mood-*` rule's mutation in order.
Order of application
- Source-shape adjectives (`warm`, `bright`, `glassy`, `metallic`, `lofi`, `retro`, `organic`) - mutate `source.type`, `source.fm`, or add `filter`.
- Envelope adjectives (`punchy`, `airy`) - mutate `envelope.attack` / `envelope.decay`.
- Effect adjectives (`reverby`, `delayed`, `crushed`) - append to `effects`.
Conflict resolution
- `warm` + `bright` -> the later token wins.
- `lofi` + `glassy` -> apply both, but cap `effects` at 2 entries.
- `punchy` + `airy` -> they're orthogonal (envelope vs source); both apply.
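The conflict rules above could be expressed roughly as follows. The helper names are hypothetical, and keeping the first two effects when capping is an assumption; the rules only say the list is capped at 2 entries.

```python
EXCLUSIVE_PAIR = ("warm", "bright")


def resolve_moods(tokens):
    """Keep tokens in prompt order; for warm/bright the later token replaces the earlier."""
    resolved = []
    for token in tokens:
        if token in EXCLUSIVE_PAIR:
            resolved = [t for t in resolved if t not in EXCLUSIVE_PAIR]
        if token not in resolved:
            resolved.append(token)
    return resolved


def cap_effects(effects, limit=2):
    """Truncate the effects list (assumed policy: keep the earliest entries)."""
    return effects[:limit]
```

For example, `resolve_moods(["warm", "punchy", "bright"])` keeps `punchy` and `bright` and drops `warm`.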
Refinement on existing definition (audio + prompt path)
When the input mode is `prompt + audio`, treat each adjective as a refinement on the measured definition rather than from scratch:
| Adjective | Refinement |
|---|---|
| warmer | add or lower |
| brighter | remove lowpass or raise its cutoff above 6 kHz |
| punchier | clamp |
| longer | extend |
| crisper | raise |
Output
A mutated `SoundDefinition`. Hand off to `pipeline-decide-layering`.
1.4 Decide single-layer vs multi-layer (MEDIUM-HIGH)
| Event class | Default |
|---|---|
| click, tap, tick, hover, focus, swoosh | 1 layer ( |
| toggle, copy, send, sync | 2 layers (paired pitches with |
| success, complete, level-up, confetti | 3+ layers (chord with cascading |
| error, delete | 2 layers ( |
See `layer-single`, `layer-octave-pair`, `layer-ascending-chord`, `layer-click-plus-body` for the concrete shapes.
Promoting a single Layer to MultiLayerSound
If the prompt or refinement requires more than one layer, wrap:
```ts
{
  layers: [<existing layer>, <new layer>],
  // optional global effects, e.g. sidechain compressor, master EQ
}
```
Per-layer `gain` values should sum to no more than ~0.6 (see `validate-gain-budget`).
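The promotion step and the gain budget can be checked together. This is a sketch under stated assumptions: layers are modeled as plain dicts, every layer carries an explicit `gain`, and the "~0.6" guideline is treated as a hard limit. The helper names are hypothetical.

```python
GAIN_BUDGET = 0.6  # approximate budget from the rule above, used here as a hard limit


def promote(existing_layer: dict, new_layer: dict) -> dict:
    """Wrap two layers in the multi-layer shape."""
    return {"layers": [existing_layer, new_layer]}


def within_gain_budget(sound: dict, budget: float = GAIN_BUDGET) -> bool:
    """True when the per-layer gains sum to no more than the budget."""
    total = sum(layer["gain"] for layer in sound["layers"])
    return total <= budget + 1e-9  # small epsilon to absorb float rounding
```

A pair with gains 0.2 and 0.15 passes; a pair with 0.4 and 0.3 does not and would need rescaling before emit.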
Demoting MultiLayerSound to a single Layer
If only one layer survives mood application, emit the inner `Layer` directly rather than a one-element `MultiLayerSound`. Both validate, but the single-layer form is the canonical compact shape.
1.5 Emit, optionally render, optionally round-trip (HIGH)
1. Emit
Always return a TypeScript snippet ready to paste into a `.web-kits/<patch>.ts` file:
```ts
import type { SoundDefinition } from "@web-kits/audio";

export const myClick: SoundDefinition = {
  source: { type: "sine", frequency: 1300, fm: { ratio: 0.5, depth: 60 } },
  envelope: { decay: 0.012, release: 0.004 },
  gain: 0.18,
};
```
Plus a one-line rationale that names the prompt tokens you acted on:
"click" -> base from `event-click`; "warm" -> kept default sine, no extra filter needed at 1.3 kHz.
2. Optional preview render
If the user asked for a WAV (or you want to grade your own output), use `packages/audio/src/offline.ts`:
```ts
import { renderToWav } from "@web-kits/audio";
import { writeFile } from "node:fs/promises";

const blob = await renderToWav(myClick, { duration: 0.3 });
await writeFile("preview.wav", Buffer.from(await blob.arrayBuffer()));
```
`duration`: `attack + decay + release + 0.05`
3. Optional round-trip validation
If you generated from a prompt and want to confirm the result matches intent, run the `interpret-*` rules against the rendered WAV and diff measured vs intended values:
| Field | Acceptable drift |
|---|---|
| Fundamental Hz | ±5% |
| Attack | ±2 ms |
| Decay | ±10% |
| Spectral centroid | ±20% of expected for the chosen waveform |
If drift exceeds tolerance, refine the definition (often by raising/lowering `gain`, tightening `envelope`, or adjusting `filter.frequency`) and render again.
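The tolerance table can be turned into a pass/fail report. This is a sketch only: the dictionary keys (`fundamental_hz`, `attack`, `decay`, `centroid_hz`) are hypothetical names, and times are assumed to be in seconds.

```python
def drift_report(intended: dict, measured: dict) -> dict:
    """Return {field: within_tolerance} for the four fields in the drift table."""
    def relative(field, fraction):
        return abs(measured[field] - intended[field]) <= fraction * abs(intended[field])

    return {
        "fundamental_hz": relative("fundamental_hz", 0.05),            # +/-5%
        "attack": abs(measured["attack"] - intended["attack"]) <= 0.002,  # +/-2 ms
        "decay": relative("decay", 0.10),                              # +/-10%
        "centroid_hz": relative("centroid_hz", 0.20),                  # +/-20%
    }
```

Any `False` entry marks a field to refine before rendering again.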
2. Audio Interpretation
FFT analysis sub-steps that fire when the user shares an audio file.
2.1 Acquire and split source audio (HIGH)
The user shared a single file or a sprite (one file containing many sounds). Before any FFT work, get one mono WAV per sound on disk.
Sprite from an npm package
```bash
npm pack <package-name> --pack-destination /tmp
tar -xzf /tmp/<package-name>-*.tgz -C /tmp
```
Look for the MP3/WAV plus any JSON manifest mapping sound names to time offsets.
Manifest-driven slicing
```bash
ffmpeg -i sprite.mp3 \
  -ss <start_seconds> -t <duration_seconds> \
  -acodec pcm_s16le -ar 44100 \
  output/<name>.wav
```
Silence-detection slicing (no manifest)
```bash
ffmpeg -i sprite.mp3 -af silencedetect=noise=-40dB:d=0.05 -f null -
```
Read the `silence_start` / `silence_end` lines and slice between gaps.
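Turning the `silence_start` / `silence_end` lines into slice ranges might look like the sketch below. It assumes the ffmpeg log text has already been captured as a string and that the sprite's total duration is known; the function name is hypothetical.

```python
import re

# silencedetect logs lines such as "silence_start: 0.5" and
# "silence_end: 0.8 | silence_duration: 0.3".
SILENCE_LINE = re.compile(r"silence_(start|end):\s*(-?\d+(?:\.\d+)?)")


def sound_segments(log: str, total_duration: float):
    """Return (start, end) pairs in seconds for the non-silent stretches."""
    segments = []
    cursor = 0.0        # where the current sound began
    in_silence = False
    for kind, value in SILENCE_LINE.findall(log):
        t = float(value)
        if kind == "start":
            if t > cursor:
                segments.append((cursor, t))
            in_silence = True
        else:
            cursor = t
            in_silence = False
    if not in_silence and total_duration > cursor:
        segments.append((cursor, total_duration))
    return segments
```

Each pair can then be fed to the manifest-style command as `-ss <start>` and `-t <end - start>`.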
Output convention
Per-sound WAVs go in `out/<name>.wav` (mono, 44.1 kHz, 16-bit PCM). Downstream interpret rules call `analyze.load_mono(path)` from src/analyze.py.
2.2 Extract fundamental frequency and pitch sweep (HIGH)
Sample the spectrum at multiple time slices to detect both the static pitch and any sweep.
```python
from analyze import load_mono, analyze_slice

sample_rate, data = load_mono("out/click.wav")
slices = [0, 5, 10, 20, 50]  # ms
freqs_over_time = [analyze_slice(data, sample_rate, t) for t in slices]
```
Mapping
| Observation | Output |
|---|---|
| All slices within ±5% | |
| Decreasing across slices | |
| Increasing across slices | |
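The three observations in the mapping can be told apart with a short check. This is a sketch: the returned labels are hypothetical, "within ±5%" is interpreted as relative to the median slice, and the "mixed" case is an addition for inputs the table does not cover.

```python
from statistics import median


def classify_pitch(freqs, tolerance=0.05):
    """Label a list of per-slice fundamentals as static, descending, ascending, or mixed."""
    center = median(freqs)
    if all(abs(f - center) <= tolerance * center for f in freqs):
        return "static"
    pairs = list(zip(freqs, freqs[1:]))
    if all(later <= earlier for earlier, later in pairs):
        return "descending"
    if all(later >= earlier for earlier, later in pairs):
        return "ascending"
    return "mixed"
```

For a sweep, the first and last slice values are natural candidates for the start and end of the frequency range.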
Tips
- Skip the first 1-2 ms if the onset is a click transient; it pollutes the FFT.
- For very short sounds (< 20 ms) use fewer slices and a smaller window.
- Use a Hann window before FFT (already applied in `analyze_slice`) to reduce spectral leakage.
2.3 Extract ADSR envelope from amplitude (HIGH)
Smooth the time-domain amplitude, find onset/peak/sustain/end, and derive each ADSR stage.
```python
from analyze import load_mono, extract_envelope

sample_rate, data = load_mono("out/click.wav")
env = extract_envelope(data, sample_rate)
# -> { "attack": 0.0008, "decay": 0.012, "sustain": 0.0, "release": 0.005 }
```
Output shape
The dict maps 1:1 to the `Envelope` type:
```ts
envelope: {
  attack: env.attack,   // 0 if percussive
  decay: env.decay,
  sustain: env.sustain, // 0 for transient sounds, 0-1 for sustained
  release: env.release,
}
```
Heuristics
- `sustain < 0.01` -> drop the field; the sound is percussive.
- `attack < 0.001` -> set `attack: 0`.
- `release < 0.005` -> clamp to `0.005` to avoid clicks at the end.
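The three heuristics can be applied to the measured dict in one pass. This is a minimal sketch with a hypothetical function name; it assumes the dict uses the keys shown in the example output above.

```python
def normalize_envelope(env: dict) -> dict:
    """Apply the sustain/attack/release heuristics without mutating the input."""
    out = dict(env)
    if out.get("sustain", 0.0) < 0.01:
        out.pop("sustain", None)   # percussive: drop the field
    if out.get("attack", 0.0) < 0.001:
        out["attack"] = 0          # treat sub-millisecond attacks as instantaneous
    if out.get("release", 0.0) < 0.005:
        out["release"] = 0.005     # floor the release to avoid end-of-sound clicks
    return out
```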
2.4 Classify oscillator waveform from harmonics (HIGH)
Compare the amplitude of the first 8 harmonics against the fundamental.
```python
import numpy as np
from scipy.fft import rfft, rfftfreq
from analyze import classify_waveform

segment = data[:int(sample_rate * 0.02)].astype(float)
segment *= np.hanning(len(segment))
spectrum = np.abs(rfft(segment))
freqs = rfftfreq(len(segment), 1 / sample_rate)
waveform = classify_waveform(spectrum, freqs, fundamental_freq)
# -> "sine" | "triangle" | "square" | "sawtooth" | "wavetable"
```
Mapping
| Pattern | |
|---|---|
| Fundamental only, harmonics < -40 dB | |
| Odd harmonics rolling off as 1/n | |
| Odd harmonics at roughly equal amplitude | |
| All harmonics rolling off as 1/n | |
| Custom harmonic profile (none of the above) | |
| No clear harmonic structure, broadband energy | |
When to fall back to wavetable
If the harmonic profile doesn't match a clean oscillator, extract the harmonic series instead:
```python
from analyze import extract_harmonics

harmonics = extract_harmonics(spectrum, freqs, fundamental_freq, num_harmonics=16)
# -> { source: { type: "wavetable", harmonics, frequency: fundamental_freq } }
```
Noise color
For broadband signals with no fundamental, classify by spectral slope:
```python
from analyze import classify_noise_color

color = classify_noise_color(spectrum, freqs)  # "white" | "pink" | "brown"
# -> { source: { type: "noise", color } }
```
2.5 Detect filter type, cutoff, and resonance (MEDIUM-HIGH)
Compare the measured spectrum against the expected spectrum for the identified oscillator.
Cutoff via spectral centroid
```python
from analyze import spectral_centroid

centroid = spectral_centroid(spectrum, freqs)
```
Expected centroids at a 440 Hz fundamental: sine ~440, triangle ~880, sawtooth ~2200, square ~1760. If the measured centroid is significantly lower than expected, a `lowpass` is present; estimate cutoff at the -3 dB point.
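For reference, a spectral centroid is the magnitude-weighted mean frequency. The sketch below is a plain-Python stand-in, not the `analyze.spectral_centroid` implementation, and the 0.7 ratio used for "significantly lower" is an assumed threshold.

```python
def centroid_hz(magnitudes, freqs):
    """Magnitude-weighted mean frequency of a spectrum."""
    total = sum(magnitudes)
    if total == 0:
        return 0.0
    return sum(f * m for f, m in zip(freqs, magnitudes)) / total


def lowpass_suspected(measured, expected, ratio=0.7):
    """Flag a likely lowpass when the measured centroid falls well below expectation."""
    return measured < ratio * expected
```

With the sawtooth figure above, `lowpass_suspected(900, 2200)` would flag a filter while `lowpass_suspected(2000, 2200)` would not.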
Filter type from rolloff
| Observation | |
|---|---|
| High-frequency rolloff steeper than the source would produce | |
| Low-frequency rolloff | |
| Narrow band of frequencies passes through | |
| Narrow notch removed | |
| Resonant peak near cutoff | High |
Resonance (Q)
```python
from analyze import estimate_resonance

q = estimate_resonance(spectrum, freqs, cutoff_hz)
# Returns 0.1 - 20.0
```
Filter envelope
If brightness changes over time (bright attack fading to dull), there's a filter envelope:
```python
from analyze import detect_filter_envelope

env = detect_filter_envelope(data, sample_rate)
# -> { "peak": 4000, "resting": 800, "decay_ms": 50 } or None
```
Maps to:
```ts
filter: {
  type: "lowpass",
  frequency: env.resting,
  envelope: { attack: 0, peak: env.peak, decay: env.decay_ms / 1000 },
}
```
2.6 Detect post-source effects (MEDIUM)
Each detector returns a confidence-flavored hint, not a guarantee. Effects are harder to extract than source/envelope - report low confidence when ambiguous.
Reverb
```python
from analyze import detect_reverb

result = detect_reverb(data, sample_rate, envelope_end_ms=120)
# -> { "type": "reverb", "decay": 0.6 } or None
```
Delay (autocorrelation)
```python
from analyze import detect_delay

result = detect_delay(data, sample_rate)
# -> { "type": "delay", "time": 0.25, "feedback": 0.3 } or None
```
FM synthesis
Spectral sidebands at non-integer ratios of the fundamental indicate FM:
```python
from analyze import detect_fm

fm = detect_fm(spectrum, freqs, fundamental_freq)
# -> { "fm": { "ratio": 0.5, "depth": 80 } } or None
```
Maps to `source.fm: { ratio, depth }` (not a separate effect).
Tremolo and vibrato
Periodic amplitude or frequency modulation in the 1-20 Hz band suggests tremolo/vibrato. Track amplitude or pitch over time and call `detect_lfo` (see `interpret-detect-lfo`).
Bitcrusher / distortion
| Time-domain signature | Effect |
|---|---|
| Stepped/quantized waveform with aliasing artifacts | |
| Flat-topped waveform with added harmonics | |
Chorus / flanger / phaser
A comb-filter pattern that sweeps over time produces moving notches in the spectrum. Hard to disambiguate algorithmically; flag for human review.
2.7 Detect LFO modulation (LOW-MEDIUM)
An LFO is sub-audio (0.1-20 Hz) periodic modulation of a parameter. Track the parameter over time, then run `detect_lfo`.
```python
import numpy as np
from analyze import detect_lfo

# 1. Track amplitude (or pitch, or spectral centroid) at regular intervals
window_ms = 10
samples_per_window = int(sample_rate * window_ms / 1000)
amp_over_time = [
    float(np.max(np.abs(data[i:i + samples_per_window])))
    for i in range(0, len(data) - samples_per_window, samples_per_window)
]

# 2. Detect periodicity (second argument is the tracking rate in Hz)
lfo = detect_lfo(np.array(amp_over_time), 1000 / window_ms)
# -> { "frequency": 5.0, "depth": 0.12 } or None
```
Mapping by tracked parameter
| Parameter tracked | LFO target |
|---|---|
| Amplitude | |
| Pitch | |
| Spectral centroid | |
| Pan position | |
Output
```ts
lfo: { type: "sine", frequency: lfo.frequency, depth: lfo.depth, target: "gain" }
```
Pick `type` based on the shape of the modulation: smooth sinusoid -> `sine`, sharp ramp -> `sawtooth`, hard switching -> `square`.
2.8 Detect multi-layer sounds and stereo positioning (MEDIUM)
Multiple fundamentals -> MultiLayerSound
Inspect peaks in the spectrum. If two or more strong peaks are not integer multiples of one shared fundamental, the sound is layered.
```python
import numpy as np
from scipy.signal import find_peaks

# Keep only peaks at least 20% as tall as the strongest one
peaks, props = find_peaks(spectrum, height=float(np.max(spectrum)) * 0.2)
peak_freqs = sorted(freqs[peaks])
# Check pairwise ratios. If no shared fundamental explains all peaks, treat as layered.
```
For each detected fundamental, run the full pipeline (frequency, envelope, waveform, filter, effects) and emit one `Layer` per fundamental:
```ts
{
  layers: [
    { source: { ... }, envelope: { ... }, gain: 0.2 },
    { source: { ... }, envelope: { ... }, gain: 0.15, delay: 0.04 },
  ]
}
```
The earlier layer typically gets `delay: 0` (omitted); subsequent layers offset their `delay` to match the measured onset gap.
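The "shared fundamental" test can be sketched as below. It makes a simplifying assumption that the lowest strong peak is the candidate fundamental, so missing-fundamental cases are not handled. The function name and the 3% tolerance are hypothetical.

```python
def is_layered(peak_freqs, tolerance=0.03):
    """True when some peak is not close to an integer multiple of the lowest peak."""
    if len(peak_freqs) < 2:
        return False
    base = min(peak_freqs)
    for f in peak_freqs:
        ratio = f / base
        if abs(ratio - round(ratio)) > tolerance:
            return True   # this peak needs its own fundamental
    return False
```

A harmonic stack such as 440 / 880 / 1320 Hz stays single-layer, while a C5 / E5 / G5 chord is reported as layered.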
Stereo and pan
```python
from analyze import analyze_stereo

stereo = analyze_stereo(data)
# -> { "pan": 0.3, "stereo_width": 0.7 }
```
| `pan` magnitude | Output |
| --------------- | ------------------------------- |
| `< 0.05` | omit (`pan: 0` is default) |
| `0.05 - 1` | `pan: <value>` |
`stereo_width > 0.5` with `|pan| < 0.05` suggests a stereo effect (chorus, dual-layer). Consider splitting into two layers panned `-0.5` / `+0.5`.
Fallback
If a sound is unsynthesizable (complex transients, recorded material, irreducible texture), fall back to:
```ts
{ source: { type: "sample", url: "..." } }
```
and note that the original audio file should be used directly rather than re-synthesized.
3. UI Event Recipes
Concrete SoundDefinition templates per UI event class. Used by the prompt path as the base layer.
3.1 Click - sine + low FM, very short decay (HIGH)
A short ascending sine sweep with light FM. The sweep gives the click "snap"; the FM adds harmonic body without making it metallic.
Incorrect (decay too long, sounds like a chime):
```ts
{ source: { type: "sine", frequency: 1300 }, envelope: { decay: 0.5 }, gain: 0.18 }
```
Correct:
```ts
{
  source: { type: "sine", frequency: { start: 200, end: 700 }, fm: { ratio: 0.5, depth: 80 } },
  envelope: { attack: 0, decay: 0.06, sustain: 0, release: 0.02 },
  gain: 0.25,
}
```
Reference: .web-kits/core.ts `click`.
3.2 Complete - four-note ascending arpeggio (MEDIUM-HIGH)
Same C-major triad as `success`, but with C6 added on top and tighter 15 ms `delay` increments so the notes blur into a single gesture rather than reading as discrete pitches.
Reference: .web-kits/core.ts `complete`.
3.3 Error - layered sawtooth + square with descending sweep (HIGH)
Two descending sweeps stacked an octave apart. Lowpass filters keep the result from being abrasive. Same shape works for `delete` (slightly longer decay).
Incorrect (no filter, sounds like a buzzer):
```ts
{ source: { type: "sawtooth", frequency: { start: 320, end: 140 } }, envelope: { decay: 0.25 }, gain: 0.22 }
```
Reference: .web-kits/core.ts `error`, `_delete`.
3.4 Modal-close - downward sine sweep (MEDIUM)
The inverse of `modalOpen`. Range is narrower because dismiss should feel less assertive than the entrance. Slightly lower `gain` for the same reason.
For `drawer-close` use 800 -> 350. For `dropdown-close` use 900 -> 500.
Reference: .web-kits/core.ts `modalClose`, `drawerClose`, `dropdownClose`.
3.5 Modal-open - upward sine sweep (MEDIUM)
A single sine sweeping from ~430 Hz up to ~1400 Hz over 80 ms. No FM, no filter; the cleanness signals "appearing".
For `drawer-open` use a slightly lower start (~350 Hz) and lower gain (~0.08).
For `dropdown-open` use a smaller range (500 -> 1200) and decay ~60 ms.
Reference: .web-kits/core.ts `modalOpen`, `drawerOpen`, `dropdownOpen`.
3.6 Notification - FM-rich sine with light reverb (HIGH)
Two FM bells a fifth apart with 100 ms `delay` between them. The `fm.ratio: 1.5` gives an inharmonic shimmer; the matched reverb on each layer glues them together.
For `ding`: single layer, `fm.ratio: 3.5`, reverb `decay: 0.8`.
For `mention`: lower fundamental (660 Hz), `fm.ratio: 2.5`, slightly more attack.
Reference: .web-kits/core.ts `notification`, `ding`, `mention`, `badge`.
3.7 Success - ascending three-note sine chord (HIGH)
Three sine layers at C5 / E5 / G5 with cascading 0.07 s `delay` between them. The top note has a small upward sweep (G5 -> A5) so the chord resolves "upward" instead of just stopping.
Layer gains sum to 0.45, comfortably under the 0.6 budget.
Reference: .web-kits/core.ts `success`.
3.8 Swoosh - white noise through a sweeping bandpass (MEDIUM)
White noise is shaped by a bandpass filter whose center frequency sweeps from 300 Hz up to 4 kHz. The sweep direction is the gesture: peak above resting = upward swoosh, peak below resting (e.g., resting 2500, peak 400) = downward.
For `slide-up` use a similar shape with peak 3500. For `slide-down` flip to pink noise with `envelope: { decay: 0.12, peak: 500 }` (no attack on the filter envelope).
Reference: .web-kits/core.ts `swoosh`, `slide`, `slideUp`, `slideDown`.
3.9 Tap - static high sine + FM, ultra short (HIGH)
Single high pitch (no sweep), aggressive FM, decay under 20 ms. This is the "key-press" archetype.
Incorrect (frequency too low, sounds like a thump):
```ts
{ source: { type: "sine", frequency: 200 }, envelope: { decay: 0.015 }, gain: 0.2 }
```
Correct:
```ts
{
  source: { type: "sine", frequency: 1300, fm: { ratio: 0.5, depth: 100 } },
  envelope: { attack: 0, decay: 0.015, sustain: 0, release: 0.005 },
  gain: 0.2,
}
```
Reference: .web-kits/core.ts `tap`, `keyPress`.
3.10 Tick - faintest possible sine (MEDIUM)
Highest frequency in the tap family. Decay under 15 ms. `gain` capped at ~0.15 because ticks fire often and must not dominate.
For scroll-snap reduce `gain` to 0.08; for focus/blur reduce to 0.04-0.06.
Reference: .web-kits/core.ts `tick`, `scrollSnap`, `focus`, `blur`.
3.11 Toggle - paired sines with delay (direction matters) (MEDIUM)
Two short sines: C7 (2093 Hz) and G7 (3136 Hz), 25 ms apart.
- `toggle-on`: low note first, then high (ascending = enabling).
- `toggle-off`: high note first, then low (descending = disabling).
The same architecture works for `copy` (1200 Hz then 1400 Hz, 40 ms gap) and `sync` (C5 then G5).
Reference: .web-kits/core.ts `toggleOn`, `toggleOff`, `copy`, `sync`.
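The note names used in these recipes map to frequencies through equal temperament. The sketch below assumes 12-tone equal temperament with A4 = 440 Hz and scientific pitch notation, and handles natural notes only. Both helper names are hypothetical.

```python
SEMITONES_FROM_C = {"C": 0, "D": 2, "E": 4, "F": 5, "G": 7, "A": 9, "B": 11}


def note_to_hz(name: str) -> float:
    """Convert a natural note name such as "C7" to Hz (A4 = 440 Hz)."""
    letter, octave = name[0], int(name[1:])
    midi = 12 * (octave + 1) + SEMITONES_FROM_C[letter]
    return 440.0 * 2 ** ((midi - 69) / 12)


def toggle_pair(on: bool, low="C7", high="G7", gap_s=0.025):
    """Return (frequency, delay) pairs: ascending for on, descending for off."""
    order = (low, high) if on else (high, low)
    return [(round(note_to_hz(order[0])), 0.0), (round(note_to_hz(order[1])), gap_s)]
```

Rounding reproduces the figures quoted above: C7 gives 2093 Hz and G7 gives 3136 Hz.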
3.12 Whoosh - longer, slower swoosh for full-page transitions (LOW-MEDIUM)
Same architecture as `swoosh` but everything stretches. Filter attack is 4x longer (0.04 s vs 0.01 s) so the gesture starts gently. Slightly higher `gain` because it spans a longer time window.
Reference: .web-kits/core.ts `whoosh`, `pageEnter`, `pageExit`.
4. Mood Vocabulary
Adjective-to-knob mappings layered onto the base recipe.
4.1 Airy - noise source + bandpass with high peak (LOW-MEDIUM)
Mutation:
- Replace `source` with `{ type: "noise", color: "white" }`.
- Replace `filter` with bandpass envelope reaching a high peak (4-6 kHz).
- Lengthen `envelope.attack` to 0.02-0.04 s so the result fades in rather than snapping.
- Lower `gain` to 0.08-0.12.
If the base was tonal (sine, triangle, etc.), this mood replaces the source entirely - it's a structural change.
4.2 Bright - no lowpass, optional FM sparkle (MEDIUM)
Mutation:
- Remove any `filter` of type `lowpass`, or raise its cutoff above 6 kHz.
- If the base used `triangle`, upgrade to `sine` with `fm: { ratio: 2.5, depth: 50 }` for sparkle.
- Slight `gain` bump (+0.02) is fine but stay under the gain budget.
4.3 Glassy - high FM ratio + reverb (MEDIUM)
Mutation:
- `source.type: "sine"`.
- `source.fm: { ratio: 3.5, depth: 200-300 }`.
- Append `effects: [{ type: "reverb", decay: 0.7, damping: 0.5, mix: 0.15 }]`.
- Extend `envelope.decay` to at least 0.3 s so the bell can ring.
Reference: .web-kits/core.ts `ding`, `sparkle`, `star`.
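The glassy mutation can be read as a pure function over a base layer. The sketch below is one possible reading, not the project's implementation: `applyGlassy`, the `Layer` and `Effect` interfaces, and the default `depth` of 250 (the middle of the 200-300 range) are all assumptions made for illustration.

```typescript
// Local stand-in types; the real schema may carry more fields.
interface Effect {
  type: string;
  decay?: number;
  damping?: number;
  mix?: number;
}
interface Layer {
  source: { type: string; frequency: number; fm?: { ratio: number; depth: number } };
  envelope: { attack?: number; decay: number; sustain?: number; release?: number };
  effects?: Effect[];
  gain: number;
}

// Apply the glassy mutation without modifying the input layer.
function applyGlassy(base: Layer, depth = 250): Layer {
  return {
    ...base,
    // Force a sine carrier with a high FM ratio.
    source: { ...base.source, type: "sine", fm: { ratio: 3.5, depth } },
    // Extend decay to at least 0.3 s so the bell can ring.
    envelope: { ...base.envelope, decay: Math.max(base.envelope.decay, 0.3) },
    // Append the reverb after any effects the base already had.
    effects: [
      ...(base.effects ?? []),
      { type: "reverb", decay: 0.7, damping: 0.5, mix: 0.15 },
    ],
  };
}
```

Returning a new object keeps the base recipe reusable when several moods are tried against the same starting point.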
4.4 Lo-fi - bitcrusher + lowpass (MEDIUM)
Mutation:
- Add `filter: { type: "lowpass", frequency: 1500 }`.
- Append `effects: [{ type: "bitcrusher", bits: 6-8, mix: 0.7-1 }]`.
- Optionally drop `gain` by 0.02 because bitcrushing adds perceived loudness.
Combines well with `mood-retro`.
4.5 Metallic - inharmonic FM ratio (MEDIUM)
Mutation:
- `source.type: "sine"` (or `square` for a harsher result).
- `source.fm: { ratio: 2.76, depth: 300-400 }` - 2.76 is the inharmonic ratio used by `badge` in .web-kits/core.ts and reads as bell-metal.
- Short release; metallic shouldn't sustain.
Avoid stacking with `mood-warm` - they cancel each other out.
Reference: .web-kits/core.ts `badge`.
4.6 Organic - triangle + slight detune + light reverb (LOW-MEDIUM)
Mutation:
- `source.type: "triangle"`.
- Add `source.detune: 5-10` for very slight pitch wobble.
- Bump `envelope.attack` from 0 to 0.003-0.008 s so the onset isn't a hard click.
- Append a small reverb (`mix: 0.05-0.1`).
Combines well with `mood-warm`. Avoid combining with `mood-metallic` or `mood-lofi` - they fight the natural feel.
4.7 Punchy - zero attack, very short decay (MEDIUM)
Mutation:
- `envelope.attack: 0`.
- `envelope.decay: <= 0.06`.
- `envelope.sustain: 0`.
- `envelope.release: <= 0.015`.
- `gain` bump of +0.05 is fine because the energy lives in a shorter window.
Orthogonal to source-shape moods - apply on top of warm/bright/glassy/metallic.
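Because punchy only touches the envelope and gain, it reduces to a few clamps. The following is a sketch under assumptions: `applyPunchy` and the trimmed `Layer` interface are hypothetical, and capping the bumped gain at 0.3 (the single-layer ceiling from the gain budget) is a choice made here, not something this section specifies.

```typescript
interface Envelope {
  attack?: number;
  decay: number;
  sustain?: number;
  release?: number;
}
// Only the fields this mood touches; other layer fields are omitted for brevity.
interface Layer {
  envelope: Envelope;
  gain: number;
}

function applyPunchy(base: Layer): Layer {
  return {
    ...base,
    envelope: {
      attack: 0,
      decay: Math.min(base.envelope.decay, 0.06),
      sustain: 0,
      // A missing release is treated as already short.
      release: Math.min(base.envelope.release ?? 0.015, 0.015),
    },
    // +0.05 bump, capped at 0.3 (assumed ceiling, see lead-in).
    gain: Math.min(base.gain + 0.05, 0.3),
  };
}
```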
4.8 Retro - square or sawtooth + lowpass + bitcrusher (MEDIUM)
Mutation:
- `source.type: "square"` (or `"sawtooth"`).
- Add `filter: { type: "lowpass", frequency: 3000 }` to soften aliasing.
- Append `effects: [{ type: "bitcrusher", bits: 8, sampleRateReduction: 2-4, mix: 1 }]`.
Pairs naturally with rising or stepped pitch sweeps (coins, power-ups).
4.9 Warm - lowpass + light reverb (MEDIUM)
Mutation applied on top of the base recipe:
- Add `filter: { type: "lowpass", frequency: 2500 }` (or anywhere in 2-3 kHz).
- Optionally add `effects: [{ type: "reverb", decay: 0.4, mix: 0.1 }]`.
- If the base used `sawtooth` or `square`, downgrade to `triangle` so the source itself is rounder.
If the base already had a lowpass, lower its cutoff by ~30%.
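The two filter branches (add a lowpass vs lower an existing one) are easiest to see in code. This is a rough sketch: `applyWarm` and the interfaces are illustrative stand-ins, the optional reverb is left out, and replacing a non-lowpass filter outright is a simplification this section does not address.

```typescript
interface Filter {
  type: string;
  frequency: number;
}
interface Layer {
  source: { type: string; frequency: number };
  filter?: Filter;
  gain: number;
}

function applyWarm(base: Layer): Layer {
  // Round off harsh waveforms at the source.
  const harsh = base.source.type === "sawtooth" || base.source.type === "square";
  const source = harsh ? { ...base.source, type: "triangle" } : base.source;

  // Existing lowpass: lower its cutoff by ~30%. Otherwise add a 2500 Hz lowpass.
  // Assumption: any non-lowpass filter is simply replaced.
  const existing = base.filter;
  const filter: Filter =
    existing !== undefined && existing.type === "lowpass"
      ? { ...existing, frequency: Math.round(existing.frequency * 0.7) }
      : { type: "lowpass", frequency: 2500 };

  return { ...base, source, filter };
}
```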
5. Layering Patterns
When to use one layer vs two vs a chord stack.
5.1 Ascending chord - 3-4 layers with cascading delay (MEDIUM)
3-4 sine layers spelling out a major triad (C-E-G or C-E-G-C). Each layer's `delay` increments by ~70 ms for "feels like notes" or ~15 ms for "feels like one gesture".
Top layer gets a small upward sweep so the chord resolves rather than stops.
Cap layer count at 4. Layer gains should sum to <= 0.6. If a layer has `sustain > 0`, all layers should have similar sustain values to avoid staggered ringing.
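A cascading chord can be generated from a note table and a delay step. The sketch below assumes `delay` is expressed in seconds and uses equal-temperament frequencies for C5-E5-G5-C6; `ascendingChord`, the envelope values, and the default total gain of 0.5 are illustrative, and the top-layer upward sweep is omitted because its field shape is not shown in this section.

```typescript
interface Layer {
  source: { type: "sine"; frequency: number };
  envelope: { attack: number; decay: number; sustain: number; release: number };
  gain: number;
  delay: number; // assumed to be seconds
}

// C major triad plus octave (C5, E5, G5, C6), equal temperament, in Hz.
const C_MAJOR = [523.25, 659.25, 783.99, 1046.5];

function ascendingChord(noteCount: 3 | 4, stepSeconds: number, totalGain = 0.5): Layer[] {
  return C_MAJOR.slice(0, noteCount).map((frequency, i) => ({
    source: { type: "sine" as const, frequency },
    envelope: { attack: 0, decay: 0.2, sustain: 0, release: 0.05 }, // illustrative
    gain: totalGain / noteCount, // equal split keeps the sum at totalGain
    delay: i * stepSeconds, // cascading onset
  }));
}

const asNotes = ascendingChord(4, 0.07); // "feels like notes"
const asGesture = ascendingChord(3, 0.015); // "feels like one gesture"
```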
5.2 Click + body - transient layer over a sustained tone (MEDIUM)
Two layers fired simultaneously (no `delay`):
- High-frequency transient (3-5 kHz) with sub-10 ms decay - the "stick".
- Lower-frequency body (80-300 Hz) with longer decay - the "drum".
Used for: send buttons, hard confirms, drum-like UI feedback, anything that needs perceived weight. Both layers use the same source `type` (usually `sine`) so they read as one event.
Gains should be roughly balanced (transient slightly quieter than body).
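As a concrete instance of the pattern, the definition below picks one value from each stated range. It is only a sketch: the `Layer` interface is a local stand-in, and the specific frequencies, decay times, and gains are illustrative picks rather than values from .web-kits/core.ts.

```typescript
interface Layer {
  source: { type: "sine"; frequency: number };
  envelope: { attack: number; decay: number; sustain: number; release: number };
  gain: number;
}

// Neither layer sets `delay`, so both start together.
const clickBody: { layers: Layer[] } = {
  layers: [
    {
      // "stick": high transient with a sub-10 ms decay
      source: { type: "sine", frequency: 4000 },
      envelope: { attack: 0, decay: 0.008, sustain: 0, release: 0.004 },
      gain: 0.14,
    },
    {
      // "drum": low body with a longer decay, slightly louder than the stick
      source: { type: "sine", frequency: 150 },
      envelope: { attack: 0, decay: 0.09, sustain: 0, release: 0.03 },
      gain: 0.18,
    },
  ],
};
```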
5.3 Octave pair - two layers an octave apart with delay (MEDIUM)
Two layers a fifth or octave apart, separated by a 20-50 ms `delay`. Direction (low first vs high first) encodes "on" vs "off", "open" vs "close", etc.
Layer gains should sum to less than 0.5. Both envelopes should match so the second beat doesn't sound disconnected.
If you find yourself reaching for >2 layers, jump to `layer-ascending-chord` instead.
5.4 Single layer - emit Layer directly (HIGH)
When the recipe needs only one source, emit the `Layer` shape directly (not wrapped in `{ layers: [...] }`). The engine accepts both, but the bare-Layer form is the canonical compact representation.
```ts
const sound: SoundDefinition = {
  source: { type: "sine", frequency: 1300 },
  envelope: { decay: 0.012, release: 0.004 },
  gain: 0.18,
};
```
Use this for: click, tap, tick, hover, focus, blur, scroll-snap, single-tone notifications, simple swooshes.
6. Effect Recipes
When and how to reach for each effect type.
6.1 Bandpass noise swoosh - filter envelope is the gesture (MEDIUM)
The recipe lives on the layer's `filter`, not its `effects`:
```ts
filter: {
  type: "bandpass",
  frequency: <resting Hz>,
  resonance: 1-3,
  envelope: { attack: 0.01-0.04, peak: <target Hz>, decay: 0.08-0.2 },
}
```
- Peak above resting -> upward swoosh.
- Peak below resting -> downward swoosh.
- Higher `resonance` (>2) makes it whistle-like; lower (<1.5) is broader.
Source should be `noise` (white for sharp, pink for soft). Source amplitude envelope just gates the noise window.
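To make the template concrete, here is one filled-in upward swoosh plus a helper that reads the gesture direction off the filter. This is a sketch under assumptions: `SwooshFilter` and `swooshDirection` are hypothetical names, and the 800 Hz resting and 4000 Hz peak values are arbitrary picks.

```typescript
interface SwooshFilter {
  type: "bandpass";
  frequency: number; // resting Hz
  resonance: number;
  envelope: { attack: number; peak: number; decay: number };
}

// Filter rests at 800 Hz and sweeps up to 4000 Hz.
const upwardFilter: SwooshFilter = {
  type: "bandpass",
  frequency: 800,
  resonance: 1.5,
  envelope: { attack: 0.01, peak: 4000, decay: 0.12 },
};

const upwardSwoosh = {
  source: { type: "noise", color: "pink" }, // pink for a softer swoosh
  filter: upwardFilter,
  envelope: { attack: 0.005, decay: 0.15, sustain: 0, release: 0.03 }, // gates the noise window
  gain: 0.12,
};

// The gesture direction is the sign of (peak - resting frequency).
function swooshDirection(f: SwooshFilter): "up" | "down" | "flat" {
  if (f.envelope.peak > f.frequency) return "up";
  if (f.envelope.peak < f.frequency) return "down";
  return "flat";
}
```

Swapping `frequency` and `envelope.peak` turns the same layer into a downward swoosh without touching the source or the amplitude envelope.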
6.2 Bitcrusher - retro / lofi finish (LOW-MEDIUM)
- `bits`: 4-8. Lower = more crunchy. Below 4 turns into noise.
- `sampleRateReduction`: 1 (off) to 8 (heavy aliasing). Combine with `bits: 8` for that 8-bit console sound.
- `mix`: usually 1. Mixing bitcrush with the dry signal sounds muddy.
Best paired with `square` or `sawtooth` sources and a lowpass to soften the aliasing edges.
Avoid stacking with `effect-reverb-tail` - the quantization noise gets smeared.
6.3 FM bell - high ratio, high depth (MEDIUM)
`source.fm: { ratio, depth }`
- `ratio`: 2.5-3.5 for harmonic-bell, 2.76 for the "badge" inharmonic clang.
- `depth`: 150-400. Higher depth = more strident.
- `envelope.decay`: at least 0.3 s so the bell can ring.
For a bright "ding", use `ratio: 3.5`, `depth: 250`, and add reverb (`decay: 0.7, mix: 0.15`).
For a dull "thud" with body, use `ratio: 0.5`, `depth: 200`, and a short envelope.
Pair with `mood-glassy` or `mood-metallic`.
6.4 Lowpass warmth - the safest filter to add (MEDIUM)
```ts
filter: { type: "lowpass", frequency: 2500, resonance: 0.7 }
```
- `frequency`: 1500-3000 Hz for "warm". Below 1000 Hz the sound starts to muffle.
- `resonance`: omit or set 0.7-1.5. Above 2 the cutoff itself starts to whistle.
Stacks safely with reverb, FM, and most moods. The fastest way to remove harshness from any source.
For dynamic warmth (bright attack -> warm sustain), add a filter envelope:
```ts
filter: {
  type: "lowpass",
  frequency: 2500,
  envelope: { attack: 0, peak: 6000, decay: 0.08 },
}
```
6.5 Reverb tail - small space, low mix (MEDIUM)
Default UI reverb:
- `decay`: 0.3-0.6 s.
- `damping`: 0.4-0.6 (kills high frequencies in the tail; without this the reverb sounds metallic).
- `mix`: 0.08-0.15. Anything above 0.2 starts to feel like a music production effect.
For per-layer reverb on bell-like sounds (notification, ding), put the reverb inside the layer's `effects` array so each note rings independently. For shared reverb on chords/transitions, put it on the top-level `effects` of the `MultiLayerSound`.
Avoid stacking reverb with delay - choose one.
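The difference between the two placements is purely structural, as the sketch below tries to show. The interfaces are trimmed local stand-ins, and the note frequencies and gains are illustrative; only the position of the `effects` array is the point.

```typescript
interface Effect {
  type: "reverb";
  decay: number;
  damping: number;
  mix: number;
}
interface Layer {
  source: { type: "sine"; frequency: number };
  envelope: { decay: number };
  gain: number;
  effects?: Effect[];
}
interface MultiLayerSound {
  layers: Layer[];
  effects?: Effect[];
}

// Default UI reverb, using mid-range values from the list above.
const uiReverb: Effect = { type: "reverb", decay: 0.4, damping: 0.5, mix: 0.1 };

// Per-layer: each bell note carries its own reverb and rings independently.
const perLayer: MultiLayerSound = {
  layers: [
    { source: { type: "sine", frequency: 1047 }, envelope: { decay: 0.3 }, gain: 0.15, effects: [uiReverb] },
    { source: { type: "sine", frequency: 1568 }, envelope: { decay: 0.3 }, gain: 0.12, effects: [uiReverb] },
  ],
};

// Shared: dry layers, one reverb on the mixed bus.
const shared: MultiLayerSound = {
  layers: perLayer.layers.map((l) => ({ source: l.source, envelope: l.envelope, gain: l.gain })),
  effects: [uiReverb],
};
```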
7. Output Validation
Checks every emitted SoundDefinition must pass before returning to the user.
7.1 Duration cap - 1 s for transients, 3 s absolute max (MEDIUM)
Estimated total duration:
```
estimated = (envelope.attack ?? 0)
          + envelope.decay
          + (envelope.release ?? 0)
          + max(0, longestEffectTail)  // reverb decay, delay time * 4
```
Targets:
- Click / tap / tick / hover / focus: <= 0.1 s.
- Toggle / copy / sync: <= 0.2 s.
- Modal / drawer / dropdown open/close: <= 0.3 s.
- Success / complete / notification: <= 0.8 s.
- Whoosh / page transition: <= 0.5 s.
Hard ceiling: 3 s. Anything longer should not be a UI sound.
The `validate` script computes the estimated duration and flags layers that exceed 3 s.
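The duration estimate translates almost directly into TypeScript. This is a sketch, not the `validate` script itself: the `Effect` union is an assumption, and in particular the delay effect's `time` field is a hypothetical name chosen to match the "delay time * 4" comment.

```typescript
interface Envelope {
  attack?: number;
  decay: number;
  release?: number;
}
// Assumed effect shapes; the delay effect's `time` field is hypothetical.
type Effect =
  | { type: "reverb"; decay: number }
  | { type: "delay"; time: number };

// Tail length per effect: reverb decay, or delay time * 4.
function effectTail(e: Effect): number {
  return e.type === "reverb" ? e.decay : e.time * 4;
}

function estimateDuration(envelope: Envelope, effects: Effect[] = []): number {
  const longestEffectTail = Math.max(0, ...effects.map(effectTail));
  return (envelope.attack ?? 0) + envelope.decay + (envelope.release ?? 0) + longestEffectTail;
}

const HARD_CEILING_SECONDS = 3;

function exceedsCeiling(envelope: Envelope, effects: Effect[] = []): boolean {
  return estimateDuration(envelope, effects) > HARD_CEILING_SECONDS;
}
```

Only the longest tail is added, so a layer with both a reverb and a delay is not double-counted.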
7.2 Envelope sanity - no zero decay, no infinite sustain without release (HIGH)
Required:
- `envelope.decay > 0` (always). Set to 0.005 minimum.
- If `envelope.sustain > 0`, `envelope.release` must be present and `> 0`.
Recommended:
- `envelope.attack`: 0 for percussive, 0.003-0.05 for sustained tones, up to 0.1 for ambient sounds.
- `envelope.decay + envelope.release`: <= 2 s for any UI sound. Above that, you're writing music, not interface feedback.
- `envelope.sustain`: 0 for transients, 0.03-0.15 for "rings out" tones, 0.3-0.7 only for held loops.
The `validate` script flags `decay <= 0`, `sustain > 0` without `release`, and total durations above 3 s.
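The required and recommended envelope rules can be expressed as a small checker that returns a list of problems. The function below is an illustrative sketch, not the actual `validate` script; `checkEnvelope` and its messages are invented for this example.

```typescript
interface Envelope {
  attack?: number;
  decay: number;
  sustain?: number;
  release?: number;
}

function checkEnvelope(e: Envelope): string[] {
  const problems: string[] = [];
  // Required: decay must be strictly positive.
  if (!(e.decay > 0)) {
    problems.push("decay must be > 0 (use 0.005 as a floor)");
  }
  // Required: a held sustain needs a positive release.
  if ((e.sustain ?? 0) > 0 && !((e.release ?? 0) > 0)) {
    problems.push("sustain > 0 requires release > 0");
  }
  // Recommended: keep decay + release within 2 s.
  if (e.decay + (e.release ?? 0) > 2) {
    problems.push("decay + release exceeds the recommended 2 s");
  }
  return problems;
}
```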
7.3 Frequency bounds - 20 Hz to 20 kHz, both ends meaningful (HIGH)
Hard bounds:
- `source.frequency` (or both `start`/`end` of a sweep): 20 Hz <= f <= 20000 Hz.
- `filter.frequency`: 20 Hz <= f <= 20000 Hz.
- `filter.envelope.peak`: same range as `filter.frequency`.
Recommended UI bounds:
- Tonal sources: 80 Hz <= f <= 8000 Hz.
- High transient layers (clicks, sticks): up to 5 kHz.
- Sub layers (body, drum): 60-200 Hz.
Anything above 8 kHz risks being inaudible on phone speakers; anything below 60 Hz risks being inaudible on laptop speakers.
The `validate` script flags any frequency outside the hard bounds.
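One way to keep the hard and recommended bounds apart is a three-level verdict. This is a sketch with invented names (`checkFrequency`, `checkSweep`, `Verdict`); the 60-8000 Hz warning window is taken from the speaker note above and is an interpretation, not a documented `validate` behavior.

```typescript
type Verdict = "ok" | "warn" | "error";

// Hard bounds are 20-20000 Hz; outside 60-8000 Hz is audible-risk territory.
function checkFrequency(hz: number): Verdict {
  if (!isFinite(hz) || hz < 20 || hz > 20000) return "error";
  if (hz < 60 || hz > 8000) return "warn";
  return "ok";
}

// A sweep is only as good as its worse endpoint.
function checkSweep(start: number, end: number): Verdict {
  const verdicts = [checkFrequency(start), checkFrequency(end)];
  if (verdicts.some((v) => v === "error")) return "error";
  return verdicts.some((v) => v === "warn") ? "warn" : "ok";
}
```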
7.4 Gain budget - keep total layer gain under 0.6 (HIGH)
Single layer:
- `gain` between 0.04 and 0.3 for typical UI events.
- Background ticks/scroll-snaps: 0.04-0.10.
- Mid-importance (click, tap, hover): 0.12-0.20.
- High-importance (success, notification): 0.16-0.25.
Multi-layer:
- Sum of all `layer.gain` values must be <= 0.6.
- If you exceed it, scale every layer proportionally rather than picking one to lower.
If a sound includes a heavy reverb (`mix > 0.15`) or distortion, lower the gain budget by 20%.
The `validate` script flags both individual layers above 0.4 and totals above 0.6.
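Proportional scaling is a one-factor operation: divide the budget by the current total and multiply every gain by the result. The helper below is a minimal sketch; `fitGainBudget` and its `heavyEffects` flag are hypothetical names, with the flag standing in for "heavy reverb or distortion".

```typescript
// Scale all layer gains by one factor so their sum fits the budget.
function fitGainBudget(gains: number[], heavyEffects = false): number[] {
  // Budget is 0.6, lowered by 20% when heavy reverb or distortion is present.
  const budget = heavyEffects ? 0.6 * 0.8 : 0.6;
  const total = gains.reduce((sum, g) => sum + g, 0);
  if (total <= budget) return gains.slice(); // already within budget
  const factor = budget / total;
  return gains.map((g) => g * factor);
}
```

For example, three layers at 0.3 each (total 0.9) come back at roughly 0.2 each, which preserves their relative balance while meeting the 0.6 limit.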
7.5 Schema conformance - validate against patch.schema.json (CRITICAL)
Every emitted `SoundDefinition` must validate against packages/audio/schemas/patch.schema.json (`#/$defs/SoundDefinition`).
Common mistakes:
- Missing `decay` in `envelope` (required).
- Missing `target` in `lfo` (required).
- Setting `pan` outside `[-1, 1]`.
- Using a `filter.type` that isn't one of `lowpass | highpass | bandpass | notch | allpass | peaking | lowshelf | highshelf | iir`.
- Adding a top-level field that isn't in `Layer` or `MultiLayerSound` (e.g. `name`, `description`). The schema is `additionalProperties: false`.
- Confusing `MultiLayerSound.effects` (chain on the mixed bus) with `Layer.effects` (chain on a single layer).
The `validate` script invokes the JSON Schema validator on every rule's `example` field. Any violation aborts the build.