30,622 skills in total; the Document Processing category has 447 skills.
Comprehensive PDF processing API for conversion, merge, split, compress, OCR, and more
Remove repeated boilerplate across sections (methodology disclaimers, generic transitions, repeated summaries) while preserving citations and meaning. **Trigger**: redundancy, repetition, boilerplate removal, 去重复 (deduplicate), 去套话 (remove stock phrases), 合并重复段落 (merge repeated paragraphs). **Use when**: the draft feels rigid because the same paragraph shape and disclaimer repeat across many subsections. **Skip if**: you are still drafting major missing sections (finish drafting first). **Network**: none. **Guardrail**: do not add/remove citation keys; do not move citations across subsections; do not delete subsection-specific content.
Initialize and manage specification directories with auto-incrementing IDs. Use when creating new specs, checking spec status, tracking user decisions, or managing the docs/specs/ directory structure. Maintains README.md in each spec to record decisions (e.g., PRD skipped), context, and progress. Orchestrates the specification workflow across PRD, SDD, and PLAN phases.
Lint and fix Markdown files using markdownlint-cli2. Use when asked to: lint, fix, format, or clean up Markdown (.md) files; enforce Markdown best practices or style standards; resolve markdownlint errors or warnings. Triggers on requests mentioning markdownlint, markdown linting, or formatting Markdown to best standards.
Clean and reconstruct raw auto-generated captions (Zoom, YouTube, Teams, Google Meet, Otter.ai, etc.) into readable, coherent transcripts. Use when the user provides raw caption files (.txt, .vtt, .srt), meeting transcripts with timestamps and speaker tags, or asks to clean up/refine a transcript. Handles: timestamp removal, speaker tag normalization, filler word removal, broken sentence reconstruction, transcription error correction, paragraph formation. Preserves every piece of substantive content while removing noise. Trigger phrases: 'clean this transcript', 'refine captions', 'fix this transcript', 'process Zoom captions', 'clean up meeting notes'.
Convert markdown to beautifully styled Word documents using custom templates. Supports branded fonts, colors, and table styling. Extract styles from existing docs or generate fresh templates.
Deterministically merge per-section files under `sections/` into `output/DRAFT.md`, preserving outline order and weaving in transitions from `outline/transitions.md`. **Trigger**: merge sections, merge draft, combine section files, sections/ -> output/DRAFT.md, 合并小节 (merge subsections), 拼接草稿 (stitch draft together). **Use when**: you have per-unit prose files under `sections/` and want a single `output/DRAFT.md` for polishing/review/LaTeX. **Skip if**: section files are missing or still contain scaffolding markers (fix `subsection-writer` first). **Network**: none. **Guardrail**: deterministic merge only (no new facts/citations); preserve section order from `outline/outline.yml`.
Convert PDFs to Markdown using Mistral OCR API with image extraction. Use when you need to extract structured text and images from PDFs, especially for scanned documents or documents with complex formatting. Outputs Markdown with embedded images.
Rewrite `outline/claim_evidence_matrix.md` as a projection/index of evidence packs (NO PROSE), so claims/axes are driven by `outline/evidence_drafts.jsonl` rather than outline placeholders. **Trigger**: claim matrix rewriter, rewrite claim-evidence matrix, evidence-first claim matrix, matrix index, 证据矩阵重写 (rewrite evidence matrix), 从证据包生成矩阵 (generate matrix from evidence packs). **Use when**: `outline/subsection_briefs.jsonl` + `outline/evidence_drafts.jsonl` are ready and you want a clean claim→evidence index for QA/writing. **Skip if**: `outline/claim_evidence_matrix.md` is already refined and consistent with evidence packs. **Network**: none. **Guardrail**: NO PROSE; do not invent facts; only cite keys present in `citations/ref.bib`; if evidence is abstract/title-only, claims must be provisional.
Comprehensive patterns and techniques for removing AI-generated verbosity and slop
Expert in creating, editing, and automating Word documents (.docx) using python-docx and docx.js. Use when generating Word documents, modifying existing docx files, or automating document workflows.
Toolkit for comprehensive document creation, editing, and analysis with support for tracked changes, comments, formatting preservation, and text extraction. Use to work with professional documents (.docx files) for: (1) creating new documents, (2) modifying or editing content, (3) working with tracked changes, (4) adding comments, or any other document tasks.