Search Results: ocr

Found 169 Skills

AI & Machine Learning | skinnyandbald/fish-skills

interview-me

Socratic thinking partner that refines half-baked ideas into clear product or technical specifications through iterative questioning. Use when you have a vague concept, feature idea, or problem statement and need structured clarification before building.

Document Processing | fuzhiyu/researchprojectte...

mistral-pdf-to-markdown

Convert PDFs to Markdown using Mistral OCR API with image extraction. Use when you need to extract structured text and images from PDFs, especially for scanned documents or documents with complex formatting. Outputs Markdown with embedded images.

1 script/Checked

AI & Machine Learning | wesley1600/claudecodefram...

vision

Analyzes and processes images using Claude's vision capabilities. Supports OCR, image classification, diagram comparison, chart analysis, visual Q&A, and more. Use when users need to understand, extract, or analyze visual content.

1 script/Checked

AI & Machine Learning | sanyuan0704/code-review-e...

sigma

Personalized 1-on-1 AI tutor using Bloom's 2-Sigma mastery learning. Guides users through any topic with Socratic questioning, adaptive pacing, and rich visual output (HTML dashboards, Excalidraw concept maps, generated images). Use when user wants to learn something, study a topic, understand a concept, requests tutoring, says 'teach me', 'I want to learn', 'explain X to me step by step', 'help me understand', or invokes /sigma. Triggers on: learn, study, teach, tutor, understand, master, explain step by step.

AI & Machine Learning | yeachan-heo/oh-my-claudec...

deep-interview

Socratic deep interview with mathematical ambiguity gating before autonomous execution

AI & Machine Learning | countbot-ai/countbot

image-analysis

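Image analysis and recognition. Can analyze local images, web images, videos, and files. Suitable for OCR, object recognition, scene understanding, and similar tasks. This skill must be used when the user sends an image or asks for an image to be analyzed.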

2 scripts/Checked

AI & Machine Learning | qwencloud/qwencloud-ai

qwencloud-vision

[QwenCloud] Understand images and videos with Qwen vision models. TRIGGER when: user wants to analyze, describe, or extract information from images or videos, OCR text extraction, chart/table reading, visual reasoning, multi-image comparison, screenshot understanding, video comprehension, or explicitly invokes this skill by name (e.g. use qwencloud-vision). DO NOT TRIGGER when: user wants to generate/create images (use qwencloud-image-generation), generate videos (use qwencloud-video-generation), text-only tasks without visual input, or non-Qwen vision tasks.

6 scripts/Checked

Tools & Utilities | malue-ai/dazee-small

screenpipe

AI screen memory — search everything you've seen or heard on your computer. Integrates with Screenpipe's local MCP server for OCR text, audio transcripts, and app usage history.

AI & Machine Learning | xixiaofinland/agent-skill...

book-tutor

Socratic book-learning tutor for any book or course. Teaches chapter-by-chapter using guided questioning, ~200-word explanations, and comprehension checks. Tracks progress and writes durable concept notes to a vault. Reads book config from project CLAUDE.md. Use when the user says "chapter N", "let's study", "teach me X", or when a project CLAUDE.md declares a learning context.

Tools & Utilities | letta-ai/skills

code-from-image

Guide for extracting code or pseudocode from images using OCR and implementing it correctly. This skill should be used when tasks involve reading code, pseudocode, or algorithms from images (PNG, JPG, screenshots) and executing or implementing the extracted logic.

Tools & Utilities | famaoai-creator/gemini-sk...

doc-to-text

Extract text content from various file formats. Supports PDF, Excel, Word, Images (OCR), Email, and ZIP Archives. Use for summarizing or analyzing binary files.

1 script/Checked

Document Processing | microck/ordinary-claude-s...

markitdown

Convert various file formats (PDF, Office documents, images, audio, web content, structured data) to Markdown optimized for LLM processing. Use when converting documents to markdown, extracting text from PDFs/Office files, transcribing audio, performing OCR on images, extracting YouTube transcripts, or processing batches of files. Supports 20+ formats including DOCX, XLSX, PPTX, PDF, HTML, EPUB, CSV, JSON, images with OCR, and audio with transcription.
