Search Results: document-extraction

Found 13 Skills

AI & Machine Learningmicrosoft/agent-skills

azure-ai-contentunderstanding-py

Azure AI Content Understanding SDK for Python. Use for multimodal content extraction from documents, images, audio, and video. Triggers: "azure-ai-contentunderstanding", "ContentUnderstandingClient", "multimodal analysis", "document extraction", "video analysis", "audio transcription".

🇺🇸|EnglishTranslated

AI & Machine Learningmicrosoft/agent-skills

azure-ai-document-intelligence-dotnet

Azure AI Document Intelligence SDK for .NET. Extract text, tables, and structured data from documents using prebuilt and custom models. Use for invoice processing, receipt extraction, ID document analysis, and custom document models. Triggers: "Document Intelligence", "DocumentIntelligenceClient", "form recognizer", "invoice extraction", "receipt OCR", "document analysis .NET".

🇺🇸|EnglishTranslated

Document Processingvasilyu1983/ai-agents-pub...

document-pdf

Extract text/tables from PDFs, create formatted PDFs, merge/split/rotate, handle forms and metadata. Supports pdf-lib/pdfkit (Node.js) and pypdf/pdfplumber/ReportLab (Python).

🇺🇸|EnglishTranslated

AI & Machine Learningletta-ai/skills

extracting-pdf-text

Extract text from PDFs for LLM consumption. Use when processing PDFs for RAG, document analysis, or text extraction. Supports API services (Mistral OCR) and local tools (PyMuPDF, pdfplumber). Handles text-based PDFs, tables, and scanned documents with OCR.

🇺🇸|EnglishTranslated

4 scripts/Checked

AI & Machine Learningtensorlakeai/tensorlake-s...

tensorlake

TensorLake SDK for building agentic workflows, sandboxed code execution, and document parsing/extraction. Use when the user mentions tensorlake, or asks about TensorLake APIs/docs/capabilities. Also use when the user is building AI agents or agentic applications that need serverless workflow orchestration (parallel map/reduce DAGs), sandboxed execution of LLM-generated code, or document parsing, structured extraction, and OCR from PDFs/images. Works with any LLM provider (OpenAI, Anthropic), agent framework (LangChain, CrewAI, LlamaIndex), database, or API as the infrastructure layer.

🇺🇸|EnglishTranslated

Document Processinganthropics/financial-serv...

deal-screening

Quickly screen inbound deal flow — CIMs, teasers, and broker materials — against the fund's investment criteria. Extracts key deal metrics, runs a pass/fail framework, and outputs a one-page screening memo. Use when reviewing new deal flow, triaging inbound materials, or deciding whether to take a first call. Triggers on "screen this deal", "review this CIM", "should we look at this", "triage this teaser", or "deal screening".

🇺🇸|EnglishTranslated

Document Processingliyecom/liye-ai

pdf

Comprehensive PDF Operation Tool: Extraction, Merging, Annotation, Form Processing

🇨🇳|ChineseTranslated

Document Processingpspdfkit-labs/nutrient-sk...

pdf-to-markdown

Extract text from PDFs as structured, semantic Markdown. Use when converting a PDF to Markdown, extracting text from a PDF, processing one or more PDFs into Markdown output, reading PDF contents for analysis, ingesting documents for RAG pipelines, preparing PDFs for LLM context, or any task where PDF text needs to be in a machine-readable format. ALWAYS use this skill when the user has a PDF and needs its content as text or Markdown — even if they don't explicitly say "convert to markdown".

🇺🇸|EnglishTranslated

1 scripts/Checked

AI & Machine Learningmrgoonie/claudekit-skills

ai-multimodal

Process and generate multimedia content using Google Gemini API. Capabilities include analyze audio files (transcription with timestamps, summarization, speech understanding, music/sound analysis up to 9.5 hours), understand images (captioning, object detection, OCR, visual Q&A, segmentation), process videos (scene detection, Q&A, temporal analysis, YouTube URLs, up to 6 hours), extract from documents (PDF tables, forms, charts, diagrams, multi-page), generate images (text-to-image, editing, composition, refinement). Use when working with audio/video files, analyzing images or screenshots, processing PDF documents, extracting structured data from media, creating images from text prompts, or implementing multimodal AI features. Supports multiple models (Gemini 2.5/2.0) with context windows up to 2M tokens.

🇺🇸|EnglishTranslated

6 scripts/Attention

Document Processingokwinds/miscellany

pdf-offline

PDF 文档离线读写与表单处理：提取文本/表格、合并拆分、生成 PDF、填写表单。适用于“本地处理/读取/生成 PDF 文件”（依赖安装可能需要网络）。

🇺🇸|EnglishTranslated

10 scripts/Attention

Document Processingmervinpraison/praisonai

pdf-processing

Process and extract information from PDF documents. Use this skill when the user asks to read, analyze, or extract data from PDF files.

🇺🇸|EnglishTranslated

AI & Machine Learningpaddlepaddle/paddleocr

paddleocr-text-recognition

Extracts text (with locations) from images and PDF documents using PaddleOCR.

🇺🇸|EnglishTranslated

3 scripts/Checked