Found 47 Skills
Manipulate PDF documents programmatically. Merge, split, rotate, and watermark PDFs. Extract text and metadata. Handle form filling and encryption/decryption.
Process PDF files for text extraction, form filling, and document analysis. Use when you need to extract content from PDFs, fill forms, or analyze document structure.
Extract text from images using OCR. Use when the user needs to read text from screenshots, photos, or image files.
Extract text from PDF files, translate it to a target language, and save the result as a Markdown file. Use this skill when the user wants to translate a PDF document.
Extract text from source documents (PDF, DOCX, PPTX, HTML, Markdown) for spreadsheet workflows. Use to understand source material before populating workbooks.
Integrate with HyperAPI for financial document processing: OCR text extraction, document classification, PDF splitting, and structured data extraction from invoices, receipts, and other financial documents. Use when the user needs to parse PDFs, extract text from documents, classify document types, split multi-document PDFs, or extract structured entities such as invoice numbers, vendor names, and line items. Keywords: hyperapi, hyperbots, document parsing, OCR, PDF processing, invoice extraction, receipt processing, document classification, VLM, vision language model.
Convert documents to Markdown using markitdown. Use when you need to extract text and convert PDF, Word, PowerPoint, Excel, HTML, CSV, JSON, XML, images (with EXIF/OCR), audio, ZIP archives, YouTube URLs, or EPUBs to Markdown format for LLM processing or text analysis.
Extract text and metadata from PDF files using pdf-parse. Use when: user uploads a PDF or asks to read/analyze PDF content. NOT for: creating PDFs, editing PDFs, or OCR on scanned documents.
Download YouTube video transcripts (subtitles/captions) using yt-dlp. Use this skill whenever the user provides a YouTube URL and wants the transcript, or asks to "download transcript", "get captions/subtitles", or "transcribe a YouTube video". Also triggers when the user needs text content extracted from any YouTube video, even if they don't explicitly say "transcript" (e.g., "what does this video say", "get me the text from this video", "I need the content of this YouTube link").
High-precision optical character recognition (OCR) service. Detects and extracts text from images in multiple languages and file formats, and returns text-region coordinates and confidence scores. Suitable for document digitization and image content analysis.
Comprehensive PowerPoint presentation creation, editing, and analysis using OOXML manipulation including slides, layouts, speaker notes, comments, and formatting. Use when asked to "create a presentation", "edit this PowerPoint", "add slides to .pptx", "extract presentation text", or "analyze slide structure". Provides raw OOXML access for advanced formatting, python-pptx for programmatic slide generation, and markitdown for text extraction. Works with .pptx files through ZIP archive extraction and XML manipulation for professional presentation workflows.
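
The PowerPoint entry above describes working with .pptx files through ZIP archive extraction and XML manipulation. A minimal sketch of that idea, using only the Python standard library, might look like the following. It is not the skill's actual implementation: the function names `extract_slide_text` and `make_demo_pptx` are hypothetical, and the demo archive contains only a single slide part, so it is not a valid presentation that PowerPoint would open.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# DrawingML namespace; text runs in slide XML are stored in <a:t> elements.
A_NS = "http://schemas.openxmlformats.org/drawingml/2006/main"


def extract_slide_text(pptx_bytes: bytes) -> dict[str, list[str]]:
    """Map each slide part name to the list of text runs found in it."""
    result: dict[str, list[str]] = {}
    with zipfile.ZipFile(io.BytesIO(pptx_bytes)) as zf:
        # Slide parts live under ppt/slides/. Sorting is lexicographic,
        # so slide10.xml would come before slide2.xml.
        names = sorted(
            n for n in zf.namelist()
            if n.startswith("ppt/slides/slide") and n.endswith(".xml")
        )
        for name in names:
            root = ET.fromstring(zf.read(name))
            result[name] = [t.text or "" for t in root.iter(f"{{{A_NS}}}t")]
    return result


def make_demo_pptx() -> bytes:
    """Build a tiny ZIP that mimics only the slide part of a .pptx.

    Hypothetical helper for illustration; the result is not a complete
    presentation package.
    """
    slide = (
        '<p:sld xmlns:p="http://schemas.openxmlformats.org/presentationml/2006/main" '
        f'xmlns:a="{A_NS}"><p:cSld><p:spTree><p:sp><p:txBody>'
        "<a:p><a:r><a:t>Hello</a:t></a:r><a:r><a:t>world</a:t></a:r></a:p>"
        "</p:txBody></p:sp></p:spTree></p:cSld></p:sld>"
    )
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("ppt/slides/slide1.xml", slide)
    return buf.getvalue()
```

Calling `extract_slide_text(make_demo_pptx())` returns a dictionary with one key, `ppt/slides/slide1.xml`, whose value is the two text runs `Hello` and `world`. For a real file, the bytes could instead be read with `open(path, "rb").read()`. Libraries such as python-pptx or markitdown, both named in the entry, handle far more of the format than this sketch does.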