Found 124 Skills
Local speech-to-text using OpenAI Whisper. Runs fully offline after model download. High quality transcription with multiple model sizes.
Transcribe audio to text using Sarvam AI's Saaras model. Handles speech recognition, transcription, and voice interfaces for 23 Indian languages. Supports 5 output modes, auto language detection, WebSocket streaming, and batch diarization. Use when converting speech to text or building voice-enabled apps.
Understand video content locally using ffmpeg frame extraction and Whisper transcription. No API keys needed. Use when: (1) Understanding what a video contains, (2) Transcribing video audio locally, (3) Extracting key frames for visual analysis, (4) Getting video content without API keys.
Refine speech transcripts (interviews, speeches, podcasts, meetings) into readable article paragraphs. Trigger this skill when users mention terms like "subtitle refinement", "transcript polish", "subtitle polishing", "organize video subtitles into articles", or "interview text organization"; when they ask to process interview records, optimize transcription text, or organize speech-to-text output; or when they need long dialogue or speech text organized into a readable article. It suits transcripts of solo speeches or multi-person conversations, retains the original sentences and wording, and avoids high-level summarization. Also trigger this skill when users only say "help me organize this text" and attach obviously colloquial text.
Shadow platform help — bot-free AI meeting assistant capturing audio + screen on macOS, on-device transcription, autopilot meeting detection, AI summaries/action items/follow-up emails, Skills system for custom post-meeting tasks. Use when setting up Shadow for the first time, Shadow not detecting meetings automatically, Shadow using too much CPU or memory on Mac, Shadow speaker attribution is wrong, Shadow screen capture not working, Shadow free tier ran out of AI meetings, choosing between Shadow and Granola or Jamie or Bluedot for bot-free recording, or exporting Shadow notes to Markdown or Zapier. Do NOT use for choosing between all AI note-takers (use /sales-note-taker) or reviewing a call for coaching (use /sales-call-review).
Transcribe spoken-word videos and recognize slips of the tongue. Generates review drafts and deletion task checklists. Trigger phrases: edit spoken video, process video, recognize slip-of-the-tongue.
Z.ai API integration for building applications with GLM models. Use when working with Z.ai/ZhipuAI APIs for: (1) Chat completions with GLM-4.7/4.6/4.5 models, (2) Vision/multimodal tasks with GLM-4.6V, (3) Image generation with GLM-Image or CogView-4, (4) Video generation with CogVideoX-3 or Vidu models, (5) Audio transcription with GLM-ASR-2512, (6) Function calling and tool use, (7) Web search integration, (8) Translation, slide/poster generation agents. Triggers: Z.ai, ZhipuAI, GLM, BigModel, Zhipu, CogVideoX, CogView, Vidu.
Convert documents (PDF, Word, Excel, PowerPoint, images, HTML) to Markdown using microsoft/markitdown. Use for document analysis, content extraction, preprocessing for LLMs, or batch document conversion. Supports images with OCR/LLM descriptions, audio transcription, and ZIP archives.
ML-powered Karaoke app in Rust using Bevy, WhisperX, and Demucs for stem separation, lyrics transcription, and pitch scoring.
Scribbl platform help — bot-free AI meeting notes Chrome extension for Google Meet with instant summaries, action items, and AI Copilot chat. Use when setting up Scribbl for automatic meeting recording and transcription on Google Meet, troubleshooting Scribbl Chrome extension not recording or transcription missing, configuring team sharing and meeting library organization with collections, deciding between Scribbl free and Pro plans or evaluating whether Team plan CRM integrations are worth it, comparing Scribbl to other bot-free note-takers like Tactiq or Granola, or wondering why Scribbl only works on Google Meet and when Zoom and Teams support is coming. Do NOT use for comparing AI note-takers across all platforms (use /sales-note-taker) or reviewing a sales call for coaching (use /sales-call-review).
Analyzes genetic variant effects on gene expression (RNA-seq), chromatin accessibility (DNASE), histone marks (ChIP), and transcription factors using the AlphaGenome API. Use when the user asks about non-coding variant effects, pathogenicity, clinical significance, disease associations, functional effects, gene expression changes, splicing disruption, or regulatory effects in promoters and enhancers. Also use for resolving biological terms to tissue/cell-type ontologies (UBERON/CL) or analyzing variants in chr:pos:ref>alt format.
Build real-time conversational AI voice engines using async worker pipelines, streaming transcription, LLM agents, and TTS synthesis, with interrupt handling and multi-provider support.