document-hub

Document Hub

A unified hub for processing, generating, sending, converting, and editing Word, Excel, PDF, Markdown, and other documents.

When to Use

Use This Skill When

  • You need to create Word, Excel, or PDF documents
  • You need to convert between document formats (Word ↔ PDF, etc.)
  • You need to batch-process multiple documents
  • You want to generate standardized documents from templates
  • You need to edit existing document content
  • You need to extract structured data from documents
  • You need to convert media file formats

Do NOT Use This Skill If

  • You need complex layout design (use a professional design tool instead)
  • You need fine-grained editing of large PDFs
  • The target format does not support the content type you have
  • You need real-time collaborative editing (use an online document editor instead)

Typical Trigger Phrases

Chinese:
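  • "生成Word文档" (generate a Word document)
  • "Excel表格处理" (process an Excel spreadsheet)
  • "PDF转换" (convert to PDF)
  • "批量处理文档" (batch-process documents)
  • "文档格式转换" (document format conversion)
  • "创建报告" (create a report)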
English:
  • "Create Word document"
  • "Process Excel file"
  • "Convert to PDF"
  • "Batch document processing"
  • "Document format conversion"
  • "Generate report"

Workflow

Step 1: Define Document Requirements

  • Target format (.docx / .xlsx / .pdf / .md)
  • Content structure (headings, paragraphs, tables, images)
  • Whether a template is needed

Step 2: Prepare Content Data

python
content = {
    "title": "Document title",
    "paragraphs": ["Paragraph 1", "Paragraph 2"],
    "tables": [...],
}
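Before passing the payload to write, it can help to check its shape. The sketch below uses a hypothetical helper, validate_content, which is not part of document_hub. It assumes the payload has only the fields shown in the example above (title, paragraphs, tables), since the full schema is not documented here.

```python
def validate_content(content: dict) -> list:
    """Return a list of problems found in a content payload (empty if none).

    Hypothetical helper: the field names are taken from the example payload,
    not from a documented document_hub schema.
    """
    problems = []

    title = content.get("title")
    if not isinstance(title, str) or not title.strip():
        problems.append("title must be a non-empty string")

    paragraphs = content.get("paragraphs", [])
    if not isinstance(paragraphs, list) or not all(
        isinstance(p, str) for p in paragraphs
    ):
        problems.append("paragraphs must be a list of strings")

    tables = content.get("tables", [])
    if not isinstance(tables, list):
        problems.append("tables must be a list")

    return problems
```

An empty result means the payload matches the assumed shape. Anything else can be reported to the user before a file is written.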

Step 3: Generate/Convert Documents

python
from skills.document_hub.document_hub import write, convert

# Create a document
write("output.docx", content)

# Convert between formats
convert("input.docx", "output.pdf")

Step 4: Verify Output

  • Check that the document format is correct
  • Confirm the content is complete
  • Verify special elements (tables, images)
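Part of the format check can be automated with the standard library alone. The sketch below uses a hypothetical helper, check_output, which is not part of document_hub. It relies on two general facts about the file formats: PDF files begin with the bytes %PDF-, and .docx and .xlsx files are ZIP containers. It does not inspect content, tables, or images, so manual review is still needed for those.

```python
import zipfile
from pathlib import Path


def check_output(path: str) -> str:
    """Return "ok" or a short description of what looks wrong with the file."""
    p = Path(path)
    if not p.is_file():
        return "missing"
    if p.stat().st_size == 0:
        return "empty"

    suffix = p.suffix.lower()
    if suffix == ".pdf":
        # Every PDF file starts with the "%PDF-" header.
        with p.open("rb") as fh:
            if fh.read(5) != b"%PDF-":
                return "not a PDF"
    elif suffix in {".docx", ".xlsx"}:
        # Office Open XML files are ZIP archives.
        if not zipfile.is_zipfile(str(p)):
            return "not a ZIP container"
    return "ok"
```

For example, check_output("output.pdf") could be called right after a conversion, treating any result other than "ok" as a failed step.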

Guardrails

Anti-Patterns

  • ❌ Forcing a conversion between unsupported formats
  • ❌ Skipping verification of the converted document's content
  • ❌ Batch processing without exception handling
  • ❌ Ignoring the feature limitations of each format

Limitations

  • Limited support for complex layouts
  • Some fonts may be incompatible
  • Large files are slow to process
  • Some advanced PDF features are not supported

Best Practices

  1. Format adaptation: adjust the content structure to suit each format
  2. Exception handling: catch errors per file during batch operations
  3. Content verification: spot-check key documents after generation
  4. Template reuse: build standardized document templates
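One way to apply the exception-handling practice is to wrap each file in its own try/except so that a single bad file does not abort the batch. The sketch below uses a hypothetical helper, convert_all, which is not part of document_hub. The conversion function is passed in as a parameter, on the assumption that it takes a source path and a destination path like the convert function in this document.

```python
from pathlib import Path
from typing import Callable


def convert_all(files, target_ext: str, convert_fn: Callable[[str, str], object]):
    """Convert each file, recording failures instead of aborting the batch.

    Returns (succeeded, failed): a list of output paths and a dict mapping
    each failed source path to a description of its error.
    """
    succeeded, failed = [], {}
    for src in files:
        dst = str(Path(src).with_suffix(target_ext))
        try:
            convert_fn(src, dst)
        except Exception as exc:  # one bad file should not stop the rest
            failed[src] = repr(exc)
        else:
            succeeded.append(dst)
    return succeeded, failed


# Possible usage, assuming convert is imported from document_hub:
# succeeded, failed = convert_all(["doc1.docx", "doc2.docx"], ".pdf", convert)
```

Returning the failures as data lets the caller decide whether to retry, report, or skip them.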

Core Functions

Document Creation

python
from skills.document_hub.document_hub import write

# Word document
write("document.docx", {
    "title": "Title",
    "paragraphs": ["Content paragraph 1", "Content paragraph 2"],
})

# Excel spreadsheet
write("spreadsheet.xlsx", {
    "sheets": {
        "Sheet1": {
            "data": [
                {"Column A": "Value 1", "Column B": "Value 2"},
                {"Column A": "Value 3", "Column B": "Value 4"},
            ]
        }
    }
})

Format Conversion

python
from skills.document_hub.document_hub import convert

# Word to PDF
convert("input.docx", "output.pdf")

# PDF to images (via hub)
from skills.document_hub.document_hub import get_hub
hub = get_hub()

Batch Processing

python
from skills.document_hub.document_hub import batch_process

files = ["doc1.docx", "doc2.docx", "doc3.docx"]
batch_process(files, operation="convert", target_format="pdf")

Media Conversion

python
from skills.document_hub.document_hub import get_hub

hub = get_hub()

# Video to audio
hub.convert_media("video.mp4", "audio.mp3")

Supported Formats

| Operation | Supported Formats | Description |
|---|---|---|
| Creation | .docx, .xlsx | Word, Excel |
| Conversion | .docx ↔ .pdf | Conversion between Word and PDF in both directions |
| Media | .mp4 ↔ .mp3 | Video/audio conversion |
| Reading | .docx, .xlsx | Extract content and data |
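To avoid forcing an unsupported conversion, the pairs in the table above can be encoded as a lookup and checked before calling the hub. This is a minimal sketch. SUPPORTED_CONVERSIONS and is_supported are hypothetical names, not part of document_hub, and the set lists only the pairs that appear in the table.

```python
from pathlib import Path

# Conversion pairs copied from the Supported Formats table (both directions).
# Hypothetical constant: extend it only if the hub is known to support more.
SUPPORTED_CONVERSIONS = {
    (".docx", ".pdf"),
    (".pdf", ".docx"),
    (".mp4", ".mp3"),
    (".mp3", ".mp4"),
}


def is_supported(src: str, dst: str) -> bool:
    """Return True if converting src to dst is a pair listed in the table."""
    pair = (Path(src).suffix.lower(), Path(dst).suffix.lower())
    return pair in SUPPORTED_CONVERSIONS
```

A caller might check is_supported("input.docx", "output.pdf") first and report a clear error when it returns False, rather than letting the conversion fail partway through.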

Integration Examples

Workflow 1: Content Extraction → Generate Word

python
from skills.content_extractor.content_extractor import extract
from skills.document_hub.document_hub import write

# Extract an article from a web page
result = extract("https://mp.weixin.qq.com/s/xxx")

# Build the document payload from the extracted fields
doc_content = {
    "title": result.title,
    "paragraphs": [
        f"Author: {result.author}",
        f"Published: {result.publish_time}",
        "",
        result.content
    ]
}
write("article.docx", doc_content)

Workflow 2: Data Aggregation → Excel

python
from skills.document_hub.document_hub import write

# One sheet named "Summary"; each dict is a row, keyed by column name
data = {
    "sheets": {
        "Summary": {
            "data": [
                {"Platform": "Xiaoyuzhou", "Title": "Podcast 1"},
                {"Platform": "Bilibili", "Title": "Video 1"}
            ]
        }
    }
}
write("content_summary.xlsx", data)
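When the rows come from several sources, the payload can be built programmatically rather than written by hand. The sketch below uses a hypothetical helper, build_summary, which is not part of document_hub. It assumes the sheets/data layout shown above, and the sheet and column names are illustrative.

```python
def build_summary(titles_by_platform: dict, sheet_name: str = "Summary") -> dict:
    """Flatten {platform: [titles]} into a single-sheet payload.

    Hypothetical helper: the output mirrors the {"sheets": {...: {"data": [...]}}}
    layout used in the example above.
    """
    rows = []
    for platform, titles in titles_by_platform.items():
        for title in titles:
            rows.append({"Platform": platform, "Title": title})
    return {"sheets": {sheet_name: {"data": rows}}}


payload = build_summary({
    "Xiaoyuzhou": ["Podcast 1"],
    "Bilibili": ["Video 1", "Video 2"],
})
# payload can then be passed to write(), e.g. write("content_summary.xlsx", payload)
```

Giving every row the same keys produces consistent column headers in the sheet.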

Workflow 3: Word → PDF

python
from skills.document_hub.document_hub import convert

convert("report.docx", "report.pdf")

Related Skills

| Skill | Relationship | Use Case |
|---|---|---|
| pdf | Specialized complement | Complex PDF reading and processing |
| content-extractor | Upstream input | Extract web content to generate documents |
| email-sender | Downstream distribution | Send documents as email attachments |
| long-form-writer | Content generation | Export a document after generating long-form content |
| md-to-wechat | Format conversion | Convert Markdown to WeChat Official Account HTML |
| image-ocr | Auxiliary recognition | Extract text from images into documents |

About UniqueClub

Part of the UniqueClub toolkit - a collection of skills for AI-powered content creation and automation.