wispr-analytics

Wispr Analytics

Extract and analyze Wispr Flow dictation history from the local SQLite database. Combine quantitative metrics with LLM-powered qualitative analysis for self-reflection, work pattern recognition, and mental health awareness.

Data Source

Wispr Flow stores all dictations in SQLite at:
`~/Library/Application Support/Wispr Flow/flow.sqlite`
Key table: `History`, with fields: `formattedText`, `timestamp`, `app`, `numWords`, `duration`, `speechDuration`, `detectedLanguage`, `isArchived`.
The user has ~8,500+ dictations since Feb 2025, bilingual (Russian/English), across apps: iTerm2, ChatGPT, Arc browser, Claude Desktop, Windsurf, Telegram, Obsidian, Perplexity.
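
As a rough illustration of reading this table without risking a write, the sketch below opens the database through a read-only SQLite URI and totals words per app. The function name `words_per_app` is hypothetical, and only the `History` columns named above are assumed to exist; this is not the extraction script itself.

```python
import sqlite3
from pathlib import Path

# Path taken from the section above.
DB_PATH = Path.home() / "Library/Application Support/Wispr Flow/flow.sqlite"


def words_per_app(db_path):
    """Return [(app, dictations, words)] sorted by words, largest first."""
    # mode=ro makes SQLite refuse any write on this connection.
    uri = f"{Path(db_path).resolve().as_uri()}?mode=ro"
    conn = sqlite3.connect(uri, uri=True)
    try:
        return conn.execute(
            "SELECT app, COUNT(*), COALESCE(SUM(numWords), 0) "
            "FROM History GROUP BY app ORDER BY 3 DESC"
        ).fetchall()
    finally:
        conn.close()

# Example call: words_per_app(DB_PATH)
```

Opening with `mode=ro` keeps the analytics side read-only even if a query is mistyped.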

Extraction Script

Run `scripts/extract_wispr.py` to pull data from the database:

```bash
# Get today's data as JSON with stats + text samples
python3 scripts/extract_wispr.py --period today --mode all --format json

# Get markdown stats for the last week
python3 scripts/extract_wispr.py --period week --format markdown

# Get text samples only for LLM analysis
python3 scripts/extract_wispr.py --period month --mode mental --texts-only

# Save to file
python3 scripts/extract_wispr.py --period week --format markdown --output /path/to/output.md
```

Period Options

  • `today` -- current day (default)
  • `yesterday` -- previous day
  • `week` -- last 7 days
  • `month` -- last 30 days
  • `YYYY-MM-DD` -- specific date
  • `YYYY-MM-DD:YYYY-MM-DD` -- date range
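
One way these period strings could map to concrete dates is sketched below. `resolve_period` is a hypothetical helper, not the script's actual parser, and it assumes that "last 7 days" and "last 30 days" include the current day.

```python
from datetime import date, timedelta


def resolve_period(period, today=None):
    """Turn a --period value into an inclusive (start, end) pair of dates."""
    today = today or date.today()
    if period == "today":
        return today, today
    if period == "yesterday":
        day = today - timedelta(days=1)
        return day, day
    if period == "week":  # assumption: 7 days ending today
        return today - timedelta(days=6), today
    if period == "month":  # assumption: 30 days ending today
        return today - timedelta(days=29), today
    if ":" in period:  # YYYY-MM-DD:YYYY-MM-DD
        start_s, end_s = period.split(":", 1)
        start, end = date.fromisoformat(start_s), date.fromisoformat(end_s)
        if start > end:
            raise ValueError(f"range start {start} is after end {end}")
        return start, end
    day = date.fromisoformat(period)  # YYYY-MM-DD; raises ValueError otherwise
    return day, day
```

Passing `today` explicitly keeps the function deterministic, which helps when checking reports for past dates.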

Mode Options

  • `all` -- full analysis (default)
  • `technical` -- filters to coding/AI tool dictations
  • `soft` -- filters to communication/writing dictations
  • `trends` -- focus on volume/frequency patterns
  • `mental` -- all text, framed for wellbeing reflection

Comparison & Graphs

  • `--compare` -- auto-compare with the equivalent previous period (week vs previous week, month vs previous month)
  • `--graphs PATH` -- generate an HTML dashboard with Chart.js graphs (implies `--compare`). Graphs include: daily words overlay, hourly activity, category breakdown, top apps, language distribution

```bash
# Compare this month vs previous month (markdown)
python3 scripts/extract_wispr.py --period month --compare --format markdown

# Generate visual dashboard for week comparison
python3 scripts/extract_wispr.py --period week --compare --graphs /tmp/wispr-week.html

# Compare and save both markdown + graphs
python3 scripts/extract_wispr.py --period month --compare --format markdown --output report.md --graphs report.html
```
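
The "equivalent previous period" can be read as the window of the same length that ends the day before the current one starts. The sketch below follows that reading; `previous_period` and `pct_change` are hypothetical helpers, and the script's real comparison logic may differ.

```python
from datetime import date, timedelta


def previous_period(start, end):
    """Return the same-length window immediately before (start, end), inclusive."""
    length = (end - start).days + 1
    prev_end = start - timedelta(days=1)
    prev_start = prev_end - timedelta(days=length - 1)
    return prev_start, prev_end


def pct_change(current, previous):
    """Percentage change from previous to current, or None if previous is zero."""
    if previous == 0:
        return None
    return (current - previous) / previous * 100
```

Returning `None` for a zero baseline avoids reporting a misleading infinite increase when the earlier period has no dictations.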

Workflow

Step 1: Extract Data

Run the extraction script with the requested period and mode. Use `--format json` for full data or `--texts-only` for LLM analysis focus.
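
If this step is driven from Python rather than the shell, it might look like the sketch below. The flags are the ones documented above; `build_command` and `run_extraction` are hypothetical names, and the assumption that the script prints its JSON to stdout when `--output` is omitted is not confirmed by this document.

```python
import json
import subprocess


def build_command(period="today", mode="all", texts_only=False):
    """Assemble the argument list for scripts/extract_wispr.py."""
    cmd = ["python3", "scripts/extract_wispr.py", "--period", period, "--mode", mode]
    cmd += ["--texts-only"] if texts_only else ["--format", "json"]
    return cmd


def run_extraction(period="today", mode="all"):
    """Run the script and parse its output (assumes JSON arrives on stdout)."""
    result = subprocess.run(
        build_command(period, mode), capture_output=True, text=True, check=True
    )
    return json.loads(result.stdout)
```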

Step 2: Present Quantitative Stats

Display the quantitative summary first:
  • Total dictations, words, speech time
  • Category breakdown (coding, ai_tools, communication, writing, other)
  • Language distribution
  • Hourly activity pattern
  • Daily trends (for multi-day periods)
  • Top apps
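
A minimal sketch of how part of this summary could be computed from already-extracted records follows. It assumes each record is a dict keyed by the `History` field names, that `timestamp` has been parsed into a `datetime`, and that `speechDuration` is in seconds; none of these are confirmed by the document, and `summarize` is a hypothetical name.

```python
from collections import Counter


def summarize(records):
    """Compute totals, language split, hourly activity, and top apps."""
    hourly = Counter(r["timestamp"].hour for r in records)
    languages = Counter(r["detectedLanguage"] for r in records)
    apps = Counter(r["app"] for r in records)
    return {
        "dictations": len(records),
        "words": sum(r["numWords"] for r in records),
        "speech_seconds": sum(r["speechDuration"] for r in records),
        "languages": dict(languages),
        "hourly": dict(sorted(hourly.items())),  # hour of day -> dictation count
        "top_apps": apps.most_common(5),
    }
```

Category breakdown and daily trends are left out here; they would follow the same counting pattern once a category and a date are attached to each record.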

Step 3: Perform Qualitative Analysis

Read `references/analysis-prompts.md` to load the appropriate analysis template for the requested mode. Then analyze the text samples using that template.
For each mode:
  • Technical: Focus on what was worked on, technical decisions, context-switching patterns, productivity assessment.
  • Soft: Focus on communication style shifts, language-switching patterns, audience adaptation, interpersonal dynamics.
  • Trends: Focus on volume changes, time-of-day shifts, app migration, behavioral change hypotheses.
  • Mental: Focus on energy proxies, sentiment signals, rumination detection, activity pattern changes. Frame all observations as invitations for self-reflection, never as diagnoses. Use language like "you might notice..." or "this pattern could suggest..."
  • All: Combine all four perspectives into a unified reflection.

Step 4: Output

Default output location: `meta/wispr-analytics/YYYYMMDD-period-mode.md` in the vault.
File format:

```markdown
---
created_date: '[[YYYYMMDD]]'
type: wispr-analytics
period: [period description]
mode: [mode]
---

# Wispr Flow Analytics: [period]

## Quantitative Summary

[stats from Step 2]

## Analysis

[qualitative analysis from Step 3]

## Reflection Prompts

[3-5 questions based on observations]
```

If the user requests console-only output, skip file creation and display directly.
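
The path and front matter described above could be assembled as in the following sketch. `report_path` and `front_matter` are hypothetical helpers, and the vault root is whatever directory the caller passes in.

```python
from datetime import date
from pathlib import Path


def report_path(vault_root, day, period, mode):
    """Build meta/wispr-analytics/YYYYMMDD-period-mode.md under the vault."""
    name = f"{day:%Y%m%d}-{period}-{mode}.md"
    return Path(vault_root) / "meta" / "wispr-analytics" / name


def front_matter(day, period_description, mode):
    """Render the YAML front matter block shown in the file format."""
    return "\n".join([
        "---",
        f"created_date: '[[{day:%Y%m%d}]]'",
        "type: wispr-analytics",
        f"period: {period_description}",
        f"mode: {mode}",
        "---",
    ])
```

A date-range period such as `YYYY-MM-DD:YYYY-MM-DD` contains a colon, which is awkward in file names on some systems, so a caller may want to substitute another separator before building the path.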

App Category Mapping

The extraction script categorizes apps:
  • coding: iTerm2, cmuxterm, VS Code, Windsurf, Zed, Cursor, Terminal
  • ai_tools: ChatGPT, Claude Desktop, Perplexity, OpenAI Atlas, Codex
  • communication: Telegram, Messages, Slack, Zoom
  • writing: Obsidian, Notes, Chrome, Arc browser
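
The mapping above can be expressed as a simple lookup, sketched below. It assumes the `app` column stores display names exactly as listed here (the real values may be bundle identifiers or differ in spelling), and `categorize` is a hypothetical name. Apps not in the list fall into `other`, the fifth category named in the Step 2 breakdown.

```python
APP_CATEGORIES = {
    "coding": ["iTerm2", "cmuxterm", "VS Code", "Windsurf", "Zed", "Cursor", "Terminal"],
    "ai_tools": ["ChatGPT", "Claude Desktop", "Perplexity", "OpenAI Atlas", "Codex"],
    "communication": ["Telegram", "Messages", "Slack", "Zoom"],
    "writing": ["Obsidian", "Notes", "Chrome", "Arc browser"],
}

# Invert to a case-insensitive name -> category lookup.
_LOOKUP = {
    name.lower(): category
    for category, names in APP_CATEGORIES.items()
    for name in names
}


def categorize(app):
    """Return the category for an app name, or "other" if it is not listed."""
    return _LOOKUP.get((app or "").strip().lower(), "other")
```

Under this sketch, the `technical` mode would keep records whose category is `coding` or `ai_tools`, and `soft` would keep `communication` or `writing`.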

Dictionary Management

Manage Wispr Flow's dictionary for better recognition accuracy. The dictionary JSON is version-controlled in `~/ai_projects/claude-skills/wispr-analytics/data/dictionary.json`.

Dictionary Script

Run `scripts/wispr_dictionary.py` for all dictionary operations:

```bash
# Check database health and dictionary stats
python3 scripts/wispr_dictionary.py check

# List all entries (safe while Wispr is running)
python3 scripts/wispr_dictionary.py list
python3 scripts/wispr_dictionary.py list --filter "claude"

# Export dictionary to JSON (safe while running)
python3 scripts/wispr_dictionary.py export

# Suggest new entries by analyzing ASR vs formatted text differences
python3 scripts/wispr_dictionary.py suggest --days 30 --min-freq 3

# Add a single term (requires Wispr Flow to be QUIT)
python3 scripts/wispr_dictionary.py add "Gastown"
python3 scripts/wispr_dictionary.py add "cloud code" "Claude Code"

# Remove an entry (requires Wispr Flow to be QUIT)
python3 scripts/wispr_dictionary.py remove "old term"

# Import from JSON (requires Wispr Flow to be QUIT)
python3 scripts/wispr_dictionary.py import --dry-run
python3 scripts/wispr_dictionary.py import
```

Dictionary Safety Rules

CRITICAL: Wispr Flow must be quit before any write operations (add, remove, import). The script enforces this automatically. Read operations (export, list, suggest, check) are safe while Wispr is running.
Writing to the SQLite database while Wispr Flow has it open causes index corruption. Always:
  1. Check if Wispr is running: `pgrep -f "Wispr Flow"`
  2. If running, ask user to quit first (Cmd+Q)
  3. After writes, run `check` to verify integrity
  4. Restart Wispr Flow
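
The guard described in these steps might be structured as in the sketch below. The `pgrep` call is the one given in step 1; using `PRAGMA integrity_check` as the post-write verification is an assumption about what `check` does, and `wispr_is_running`, `integrity_ok`, and `guarded_write` are hypothetical names rather than the script's actual functions.

```python
import sqlite3
import subprocess


def wispr_is_running(run=subprocess.run):
    """True if pgrep finds a process matching "Wispr Flow" (exit code 0)."""
    result = run(["pgrep", "-f", "Wispr Flow"], capture_output=True, text=True)
    return result.returncode == 0


def integrity_ok(db_path):
    """Run SQLite's built-in integrity check; a healthy database reports "ok"."""
    conn = sqlite3.connect(db_path)
    try:
        rows = conn.execute("PRAGMA integrity_check").fetchall()
    finally:
        conn.close()
    return rows == [("ok",)]


def guarded_write(db_path, write_fn, is_running=wispr_is_running):
    """Refuse to write while Wispr Flow is open, then verify the result."""
    if is_running():
        raise RuntimeError("Wispr Flow is running; quit it (Cmd+Q) before writing.")
    write_fn(db_path)
    if not integrity_ok(db_path):
        raise RuntimeError("Integrity check failed after write.")
```

Passing the process check in as a parameter makes the guard easy to exercise without actually starting or quitting the app.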

Dictionary Entry Types

  • Recognition terms (phrase only): teaches Wispr to hear the word correctly (e.g., "Gastown", "LLM", "subagent")
  • Replacement rules (phrase → replacement): auto-corrects mishears (e.g., "cloud code" → "Claude Code", "клод дизайн" → "Claude Design")
  • Snippets (isSnippet=true): text expansion shortcuts (e.g., "my email" → "glebis@gmail.com")
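
To make the three types concrete, the sketch below classifies an entry represented as a dict. The `isSnippet` flag is named above; the `phrase` and `replacement` key names are assumptions about the exported JSON, and `entry_type` is a hypothetical helper.

```python
def entry_type(entry):
    """Classify a dictionary entry as snippet, replacement, or recognition."""
    if entry.get("isSnippet"):
        return "snippet"  # text expansion shortcut
    if entry.get("replacement"):
        return "replacement"  # phrase -> replacement auto-correction
    return "recognition"  # phrase only


examples = [
    {"phrase": "Gastown"},
    {"phrase": "cloud code", "replacement": "Claude Code"},
    {"phrase": "my email", "replacement": "glebis@gmail.com", "isSnippet": True},
]
kinds = [entry_type(e) for e in examples]
```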

Proactive Dictionary Improvement Workflow

When running analytics, also check for dictionary improvement opportunities:
  1. Run `suggest` to find recurring ASR corrections
  2. Compare `asrText` vs `formattedText` for patterns
  3. Look for Russian/English code-switching mishears
  4. Check for new technical terms the user started using
  5. Export updated dictionary and commit to git
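
Step 2 can be approximated with a word-level diff, as in the sketch below. It uses `difflib.SequenceMatcher` to find replaced word runs and keeps those that recur at least `min_freq` times. This is one possible technique, not necessarily what `suggest` does, and `recurring_corrections` is a hypothetical name.

```python
from collections import Counter
from difflib import SequenceMatcher

_PUNCT = ".,!?;:\"'()"


def _tokens(text):
    """Split on whitespace and strip surrounding punctuation from each word."""
    return [w.strip(_PUNCT) for w in text.split() if w.strip(_PUNCT)]


def recurring_corrections(pairs, min_freq=3):
    """Count (heard, fixed) word replacements across (asrText, formattedText) pairs."""
    counts = Counter()
    for asr_text, formatted_text in pairs:
        heard, fixed = _tokens(asr_text), _tokens(formatted_text)
        matcher = SequenceMatcher(
            a=[w.lower() for w in heard],  # compare case-insensitively
            b=[w.lower() for w in fixed],
            autojunk=False,
        )
        for tag, i1, i2, j1, j2 in matcher.get_opcodes():
            if tag == "replace":
                counts[(" ".join(heard[i1:i2]), " ".join(fixed[j1:j2]))] += 1
    return [(h, f, n) for (h, f), n in counts.most_common() if n >= min_freq]
```

Because matching words anchor the diff, a mishear such as "cloud code" surfaces only as the changed word ("cloud" to "Claude"); the full replacement phrase still has to be chosen by hand.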

Notes

  • For analytics: the database is read-only; analytics never modifies Wispr data
  • For dictionary: writes require Wispr Flow to be quit first
  • Text samples are capped at 100 per extraction to manage context window
  • For multi-day periods, daily trend tables help visualize changes
  • Bilingual dictations are common; analysis should honor both Russian and English
  • The `asrText` field contains raw speech recognition before formatting -- useful for detecting speech patterns vs formatted output
  • Dictionary JSON is stored at `~/ai_projects/claude-skills/wispr-analytics/data/dictionary.json` for version control