pandoc-converter

Pandoc Converter

Convert documents between common formats using Pandoc.

Quick Start

Basic conversion (format auto-detected from extensions)

python scripts/convert.py input.md output.docx

Specify output format only

python scripts/convert.py document.md --to html

Check if Pandoc is installed

python scripts/convert.py --check

List supported formats

python scripts/convert.py --formats
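
The extension-based detection mentioned for the basic conversion could be sketched roughly as follows. This is an illustration only: the source of scripts/convert.py is not shown here, so the `EXTENSION_FORMATS` table, the function names `detect_format` and `resolve_output_format`, and the rule that an explicit `--to` value takes precedence over the output extension are all assumptions.

```python
from pathlib import Path
from typing import Optional

# Hypothetical extension table; the mapping inside scripts/convert.py is not shown.
EXTENSION_FORMATS = {
    ".md": "markdown",
    ".markdown": "markdown",
    ".html": "html",
    ".htm": "html",
    ".docx": "docx",
    ".tex": "latex",
    ".epub": "epub",
    ".rtf": "rtf",
    ".pptx": "pptx",
    ".pdf": "pdf",
    ".csv": "csv",
    ".tsv": "tsv",
    ".xlsx": "xlsx",
}


def detect_format(path: str) -> str:
    """Infer a format name from a file extension (case-insensitive)."""
    suffix = Path(path).suffix.lower()
    if suffix not in EXTENSION_FORMATS:
        raise ValueError(f"cannot infer a format from {path!r}; pass --from or --to")
    return EXTENSION_FORMATS[suffix]


def resolve_output_format(output_path: Optional[str], to_format: Optional[str]) -> str:
    """Assumed rule: an explicit --to value wins, else use the output extension."""
    if to_format:
        return to_format
    if output_path:
        return detect_format(output_path)
    raise ValueError("need either an output file or --to")


print(detect_format("input.md"))                    # markdown
print(resolve_output_format("output.docx", None))   # docx
print(resolve_output_format(None, "html"))          # html
```

An unknown extension raises `ValueError` in this sketch, which is the point where a real script would presumably ask for `--from` or `--to`.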

Supported Formats

Document formats (read/write): markdown, html, docx, latex, epub, rtf, pptx; pdf is output only (Pandoc cannot read PDF)
Data formats (read only): csv, tsv, xlsx
Markdown variants: gfm (GitHub), commonmark
For detailed compatibility, see references/format-compatibility.md.

Common Conversions

| From | To | Command |
| --- | --- | --- |
| Markdown | Word | `python scripts/convert.py doc.md doc.docx` |
| Markdown | PDF | `python scripts/convert.py doc.md doc.pdf` |
| Markdown | HTML | `python scripts/convert.py doc.md doc.html` |
| Word | Markdown | `python scripts/convert.py doc.docx doc.md` |
| CSV | HTML table | `python scripts/convert.py data.csv data.html` |
| LaTeX | PDF | `python scripts/convert.py paper.tex paper.pdf` |

Options

| Option | Description |
| --- | --- |
| `--from <fmt>` | Override input format detection |
| `--to <fmt>` | Specify output format (if no output file) |
| `--standalone` | Produce a complete document with header and footer, not a fragment |
| `--toc` | Add table of contents |
| `--pdf-engine <eng>` | PDF engine: pdflatex, xelatex, lualatex |

Additional Pandoc options pass through directly.
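
To make the option handling concrete, here is a minimal sketch of how these options might be turned into a Pandoc argument list, with unrecognized options appended unchanged. The function name `build_pandoc_command` and its parameters are hypothetical and do not come from scripts/convert.py; the Pandoc flags themselves (`-f`, `-t`, `-o`, `--standalone`, `--toc`, `--pdf-engine`) are standard.

```python
from typing import List, Optional, Sequence


def build_pandoc_command(
    input_path: str,
    output_path: Optional[str] = None,
    from_format: Optional[str] = None,
    to_format: Optional[str] = None,
    standalone: bool = False,
    toc: bool = False,
    pdf_engine: Optional[str] = None,
    extra_args: Sequence[str] = (),
) -> List[str]:
    """Assemble a pandoc argument list (hypothetical helper, not the script's API)."""
    cmd = ["pandoc", input_path]
    if from_format:
        cmd += ["-f", from_format]      # override input format detection
    if to_format:
        cmd += ["-t", to_format]        # explicit output format
    if standalone:
        cmd.append("--standalone")      # full document instead of a fragment
    if toc:
        cmd.append("--toc")             # add a table of contents
    if pdf_engine:
        cmd.append(f"--pdf-engine={pdf_engine}")
    if output_path:
        cmd += ["-o", output_path]
    cmd.extend(extra_args)              # pass any other Pandoc options through as-is
    return cmd


print(build_pandoc_command(
    "doc.md",
    "doc.pdf",
    toc=True,
    pdf_engine="xelatex",
    extra_args=["--number-sections"],
))
```

Returning a list rather than a single string keeps each argument separate, so the result can be handed to `subprocess.run` without shell quoting.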

Workflow

  1. Check installation: Run python scripts/convert.py --check
  2. If not installed: Follow the installation instructions provided
  3. Convert: Run the conversion with input and output files
  4. Present result: Provide the converted file to the user
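
The four steps above can be sketched as a small driver. This is a sketch under stated assumptions: it uses `shutil.which("pandoc")` as a stand-in for the script's own `--check` mode, the function name `convert_if_ready` is hypothetical, and the `find` and `run` parameters exist only so the logic can be exercised without Pandoc installed.

```python
import shutil
import subprocess
import sys
from typing import Callable, Optional

SCRIPT = "scripts/convert.py"  # path as used in the commands in this document


def convert_if_ready(
    input_path: str,
    output_path: str,
    find: Callable[[str], Optional[str]] = shutil.which,
    run: Callable[..., object] = subprocess.run,
) -> Optional[str]:
    """Return the output path on success, or None if Pandoc is missing."""
    # Steps 1-2: check installation and stop early if Pandoc is not on PATH.
    if find("pandoc") is None:
        print("Pandoc not found; install it before converting.", file=sys.stderr)
        return None
    # Step 3: run the conversion; with subprocess.run, check=True raises
    # CalledProcessError when the script exits with a non-zero status.
    run([sys.executable, SCRIPT, input_path, output_path], check=True)
    # Step 4: hand the converted file's path back to the caller.
    return output_path
```

Calling `convert_if_ready("doc.md", "doc.docx")` with the defaults would invoke the script through the current Python interpreter, so it only makes sense from the directory that contains scripts/convert.py.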

Installation (if needed)

The script provides installation guidance, but here's a summary:

macOS

brew install pandoc

Ubuntu/Debian

sudo apt-get install pandoc

For PDF output, also install LaTeX:

macOS: brew install --cask mactex-no-gui

Ubuntu: sudo apt-get install texlive-xetex

Limitations

  • CSV/TSV/XLSX: Input only (converts to tables in other formats)
  • PDF output: Requires a LaTeX installation
  • PPTX: Text extraction works; complex layouts may be simplified
  • Complex formatting: Some features may not transfer between formats
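
The first two limitations lend themselves to a pre-flight check before any conversion is attempted. The sketch below is one possible shape for such a check, not the script's actual behavior: `target_problems` is a hypothetical helper, and looking for the three LaTeX engines on PATH is an assumed way of testing for a LaTeX installation.

```python
import shutil
from typing import Callable, List, Optional

INPUT_ONLY_FORMATS = {"csv", "tsv", "xlsx"}          # from the first limitation
PDF_ENGINES = ("pdflatex", "xelatex", "lualatex")    # engines listed under Options


def target_problems(
    output_format: str,
    find: Callable[[str], Optional[str]] = shutil.which,
) -> List[str]:
    """Return human-readable reasons the requested output format cannot work."""
    fmt = output_format.lower()
    problems = []
    if fmt in INPUT_ONLY_FORMATS:
        problems.append(f"{fmt} is input-only and cannot be a conversion target")
    # Assumption: having any one of the listed engines on PATH counts as
    # "LaTeX is installed" for the purposes of PDF output.
    if fmt == "pdf" and not any(find(engine) for engine in PDF_ENGINES):
        problems.append("PDF output needs a LaTeX engine (pdflatex, xelatex, or lualatex)")
    return problems


for fmt in ("html", "csv", "pdf"):
    print(fmt, target_problems(fmt))
```

An empty list means no known obstacle; the PDF result depends on what is installed on the machine running the check.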