songsee

Generate spectrograms and multi-panel audio feature visualizations from audio files.

Prerequisites

Requires Go:

    go install github.com/steipete/songsee/cmd/songsee@latest

Optional: `ffmpeg` for formats beyond WAV/MP3.
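
Before installing, it can help to confirm which of these tools are already on the PATH. The following is a minimal sketch; the function name `check_songsee_prereqs` is hypothetical and not part of songsee. It only reports and installs nothing.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of songsee): report which tools are on PATH.
# go is needed to install songsee; ffmpeg is optional (formats beyond WAV/MP3).
check_songsee_prereqs() {
  local tool
  for tool in go songsee ffmpeg; do
    if command -v "$tool" >/dev/null 2>&1; then
      echo "$tool: found"
    else
      echo "$tool: missing"
    fi
  done
}
```

`go install` places the binary in the directory named by `GOBIN`, or in `$(go env GOPATH)/bin` when `GOBIN` is unset. If `songsee` is reported as missing right after installation, that directory is probably not on the PATH.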

Quick Start

Basic spectrogram

    songsee track.mp3

Save to specific file

    songsee track.mp3 -o spectrogram.png

Multi-panel visualization grid

    songsee track.mp3 --viz spectrogram,mel,chroma,hpss,selfsim,loudness,tempogram,mfcc,flux

Time slice (start at 12.5s, 8s duration)

    songsee track.mp3 --start 12.5 --duration 8 -o slice.jpg

From stdin

    cat track.mp3 | songsee - --format png -o out.png
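
The single-file commands above extend naturally to a whole directory of tracks. The following is a sketch that uses only the `songsee <input> -o <output>` form shown above; the function name `render_dir` and the directory layout are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: render one spectrogram per MP3/WAV file in a directory.
# Only the documented `songsee <input> -o <output>` invocation is used.
render_dir() {
  local in_dir="$1" out_dir="$2" f base
  mkdir -p "$out_dir" || return 1
  for f in "$in_dir"/*.mp3 "$in_dir"/*.wav; do
    [ -e "$f" ] || continue          # skip glob patterns that matched nothing
    base="$(basename "$f")"
    # Keep the original extension in the name so a.mp3 and a.wav do not collide.
    songsee "$f" -o "$out_dir/$base.png" || echo "failed: $f" >&2
  done
}
```

Usage would look like `render_dir ./tracks ./spectrograms`. Only WAV and MP3 are globbed here because those are the formats the tool decodes without `ffmpeg`.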

Visualization Types

Use `--viz` with comma-separated values:

| Type | Description |
| --- | --- |
| `spectrogram` | Standard frequency spectrogram |
| `mel` | Mel-scaled spectrogram |
| `chroma` | Pitch class distribution |
| `hpss` | Harmonic/percussive separation |
| `selfsim` | Self-similarity matrix |
| `loudness` | Loudness over time |
| `tempogram` | Tempo estimation |
| `mfcc` | Mel-frequency cepstral coefficients |
| `flux` | Spectral flux (onset detection) |

Multiple `--viz` types render as a grid in a single image.
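
When the panel list is assembled inside a script, the comma-separated value for `--viz` can be built from separate arguments. The following is a sketch; `join_viz` is a hypothetical helper, and the type names are the ones listed above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: join all arguments with commas for use with --viz.
join_viz() {
  local IFS=,            # "$*" joins arguments with the first character of IFS
  printf '%s\n' "$*"
}

# Example invocation (three panels rendered as a grid in one image):
# songsee track.mp3 --viz "$(join_viz mel chroma loudness)" -o grid.png
```

Setting `IFS` with `local` keeps the change confined to the function, so word splitting elsewhere in the calling script is unaffected.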

Common Flags

| Flag | Description |
| --- | --- |
| `--viz` | Visualization types (comma-separated) |
| `--style` | Color palette: `classic`, `magma`, `inferno`, `viridis`, `gray` |
| `--width` / `--height` | Output image dimensions |
| `--window` / `--hop` | FFT window and hop size |
| `--min-freq` / `--max-freq` | Frequency range filter |
| `--start` / `--duration` | Time slice of the audio |
| `--format` | Output format: `jpg` or `png` |
| `-o` | Output file path |
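
To choose a palette, it can be convenient to render the same track once per `--style` value and look at the results together. The following is a sketch that uses only the flags and palette names listed above; the function name `render_styles` and the output naming are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: render one PNG per documented color palette.
render_styles() {
  local track="$1" out_dir="$2" style
  mkdir -p "$out_dir" || return 1
  for style in classic magma inferno viridis gray; do
    songsee "$track" --style "$style" --format png -o "$out_dir/$style.png" \
      || echo "failed: $style" >&2
  done
}
```

Usage would look like `render_styles track.mp3 ./palettes`, which leaves one image per palette in the output directory.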

Notes

  • WAV and MP3 are decoded natively; other formats require `ffmpeg`.
  • Output images can be inspected with `vision_analyze` for automated audio analysis.
  • Useful for comparing audio outputs, debugging synthesis, or documenting audio processing pipelines.
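
For the comparison use case mentioned in the notes, both files should be rendered with identical panels and the same time slice so that differences in the images reflect differences in the audio. The following is a sketch under that assumption; the function name `compare_slices`, the chosen panels, and the output file names are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: render two files with the same panels and time slice
# (start and duration in seconds) so the images can be compared directly.
compare_slices() {
  local a="$1" b="$2" start="$3" duration="$4" out_dir="$5"
  local viz="spectrogram,mel,loudness"
  mkdir -p "$out_dir" || return 1
  songsee "$a" --viz "$viz" --start "$start" --duration "$duration" -o "$out_dir/a.png" &&
  songsee "$b" --viz "$viz" --start "$start" --duration "$duration" -o "$out_dir/b.png"
}
```

Usage would look like `compare_slices reference.wav candidate.wav 12.5 8 ./compare`. The two resulting images can then be viewed side by side or passed to whatever image inspection step follows.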