docx-to-markdown
Convert Microsoft Word (.docx) documents to Markdown format.
Installation Required
```bash
cd .claude/skills/docx-to-markdown
npm install
```
Dependencies: `mammoth`, `turndown`, `@truto/turndown-plugin-gfm`
Quick Start
```bash
# Basic conversion
node .claude/skills/docx-to-markdown/scripts/convert.cjs \
  --file ./document.docx

# Custom output path
node .claude/skills/docx-to-markdown/scripts/convert.cjs \
  --file ./doc.docx \
  --output ./output/doc.md

# Extract images to directory
node .claude/skills/docx-to-markdown/scripts/convert.cjs \
  --file ./doc.docx \
  --output ./output/doc.md \
  --images ./output/images/
```
CLI Options
| Option | Required | Description |
|---|---|---|
| `--file` | Yes | Input DOCX file |
| `--output` | No | Output Markdown path (default: input name + `.md`) |
| `--images` | No | Directory for extracted images (default: inline base64) |
Output Format (JSON)
```json
{
  "success": true,
  "input": "/path/to/input.docx",
  "output": "/path/to/output.md",
  "wordCount": 1523,
  "images": 5,
  "warnings": ["Some formatting may be simplified"]
}
```
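A caller will usually want to check `success` and surface `warnings` rather than read the raw JSON. The following is a minimal sketch of that, run against the sample result above. It assumes the script emits this object as a single JSON document and that a failed run sets `success` to `false`; neither is confirmed here. `summarizeResult` is a hypothetical helper, not part of the skill.

```javascript
// Sample result copied from the documented output format.
const sampleOutput = `{
  "success": true,
  "input": "/path/to/input.docx",
  "output": "/path/to/output.md",
  "wordCount": 1523,
  "images": 5,
  "warnings": ["Some formatting may be simplified"]
}`;

// Hypothetical helper: turns the JSON text into a short status summary.
// Assumption: a failed conversion reports "success": false.
function summarizeResult(jsonText) {
  const result = JSON.parse(jsonText);
  const warnings = result.warnings ?? [];
  if (!result.success) {
    return { ok: false, message: 'Conversion failed', warnings };
  }
  return {
    ok: true,
    message: `Wrote ${result.output} (${result.wordCount} words, ${result.images} images)`,
    warnings,
  };
}

const summary = summarizeResult(sampleOutput);
console.log(summary.message);
for (const warning of summary.warnings) {
  console.warn(`warning: ${warning}`);
}
```

Treating `warnings` as optional keeps the helper usable if the field is ever omitted.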
Supported Elements
- Headings (H1-H6)
- Paragraphs and emphasis (bold, italic, strikethrough)
- Ordered and unordered lists
- Tables (GFM format)
- Links
- Images (extracted or base64)
- Code blocks (requires Word "Code" style)
- Blockquotes
Known Limitations
- Nested lists: Numbering may reset in deeply nested lists
- Nested tables: Inner tables are flattened
- Code blocks: Require explicit Word style mapping ("Code" or "Code Block")
- Complex formatting: Some advanced formatting may be simplified
- Footnotes: Converted but may lose some formatting
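
The code block limitation comes down to a style mapping: paragraphs are only treated as code if their Word style is mapped to a `<pre>` element. As a rough illustration, this is what such a mapping could look like using mammoth's `styleMap` option. Whether `convert.cjs` configures mammoth this way is an assumption; the names `codeStyleMap` and `mammothOptions` are hypothetical.

```javascript
// Hypothetical style map: sends paragraphs styled "Code" or "Code Block"
// to <pre>, joining consecutive paragraphs with a newline.
// The doubled backslash passes a literal \n escape through to mammoth.
const codeStyleMap = [
  "p[style-name='Code'] => pre:separator('\\n')",
  "p[style-name='Code Block'] => pre:separator('\\n')",
];

// Options object in the shape accepted as the second argument of
// mammoth.convertToHtml.
const mammothOptions = { styleMap: codeStyleMap };

// Usage, assuming mammoth is installed:
// const mammoth = require('mammoth');
// mammoth.convertToHtml({ path: './doc.docx' }, mammothOptions)
//   .then((result) => console.log(result.value));

console.log(mammothOptions.styleMap.join('\n'));
```

Paragraphs with any other style fall through to mammoth's defaults, so code typed in an ordinary paragraph style would still come out as plain text.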
Google Docs Support
Export your Google Doc as DOCX first, then convert:
- In Google Docs: File → Download → Microsoft Word (.docx)
- Run this converter on the downloaded file
Troubleshooting
- Dependencies not found: Run `npm install` in the skill directory
- Empty output: Ensure the DOCX contains actual text (not just images)
- Code blocks not detected: Use Word's built-in "Code" style
IMPORTANT Task Planning Notes
- Always plan the work and break it into many small todo tasks
- Always add a final review todo task that checks the completed work for any fixes or enhancements still needed