markitdown

MarkItDown

Convert a document, image, audio file, or YouTube URL to Markdown using Microsoft's `markitdown` CLI. The skill validates the input, composes the right flags, optionally saves the result under `~/.claude/output/<project>/markitdown/<slug>/`, and reports a one-line summary with the fully expanded absolute path (no tilde, no magic).

The deterministic work (install check, validation, slug derivation, save path, command composition) happens in `scripts/markitdown.sh`. The skill parses `$ARGUMENTS`, hands them to the script, and turns the script's `RESULT:` lines into a human report.

Install

bash
pip install 'markitdown[all]'
For a smaller install, pick only what you need:

| Group | Adds |
| --- | --- |
| `[pdf]` | PDF parsing |
| `[docx]` | Word documents |
| `[pptx]` | PowerPoint |
| `[xlsx]`, `[xls]` | Excel |
| `[outlook]` | Outlook `.msg` |
| `[audio-transcription]` | MP3/WAV via local Whisper |
| `[youtube-transcription]` | YouTube transcripts |
| `[az-doc-intel]` | Azure Document Intelligence backend |

For Azure Document Intelligence, also export `MARKITDOWN_DOCINTEL_ENDPOINT=https://<resource>.cognitiveservices.azure.com/` before invoking with `-d`.
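
The install and endpoint requirements above can be checked before any conversion runs. The following is a minimal sketch, not the contents of `scripts/markitdown.sh`: the function name `preflight`, its argument order, and the wording of the endpoint error are assumptions made for illustration. Only the exit status 127 and the "not installed" message come from this document.

```shell
# Hypothetical preflight helper (illustrative; not the real script).
# Usage: preflight BIN [-d]
#   BIN  command that must be on PATH (normally "markitdown")
#   -d   also require MARKITDOWN_DOCINTEL_ENDPOINT to be set
preflight() {
  local bin flag
  bin=$1
  flag=${2:-}

  # A missing binary is reported with exit status 127.
  if ! command -v "$bin" >/dev/null 2>&1; then
    echo "ERR: $bin not installed" >&2
    return 127
  fi

  # The Azure backend needs its endpoint exported first.
  # The error text here is an assumption, not the script's actual message.
  if [ "$flag" = "-d" ] && [ -z "${MARKITDOWN_DOCINTEL_ENDPOINT:-}" ]; then
    echo "ERR: MARKITDOWN_DOCINTEL_ENDPOINT is not set" >&2
    return 1
  fi

  return 0
}
```

A caller would run `preflight markitdown -d` and stop on any non-zero status instead of attempting an install.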

Parameters

| Flag | Default | Effect |
| --- | --- | --- |
| `-s` | off | Save Markdown to `~/.claude/output/<project>/markitdown/<slug>/<stem>.md` |
| `-S` | off | Force no-save (override an ambient save mode) |
| `-d` | off | Use Azure Document Intelligence (needs `MARKITDOWN_DOCINTEL_ENDPOINT`) |
| `-p` | off | Enable installed third-party `markitdown` plugins |
| `-k` | off | Keep data URIs (base64 images) inline in the output |
| `-l` | | List installed plugins and exit |

Output is saved under `~/.claude/output/{project}/markitdown/{slug}/`, where `{project}` is the kebab-cased basename of the git toplevel (else cwd) and `{slug}` is the kebab-cased input basename (≤5 words). The saved file is pipeline-friendly. Typical downstream use: `/forge -s -f <path>` decomposes the extracted content into workstreams; `/apex -f <path>` implements from it; any skill accepting `-f` can consume it.
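
The path rule above can be sketched in a few shell functions. This is an illustration under assumptions, not the logic in `scripts/markitdown.sh`: the helper names (`kebab`, `slug_for`, `project_name`, `save_path`) are hypothetical, and the exact normalization (lowercase, runs of non-alphanumerics collapsed to one hyphen, extension dropped before slugging) is a guess at what "kebab" means here.

```shell
# Hypothetical helpers illustrating the {project}/{slug} rule.

# kebab: lowercase, collapse runs of non-alphanumerics to "-", trim the ends.
kebab() {
  printf '%s\n' "$1" \
    | LC_ALL=C tr '[:upper:]' '[:lower:]' \
    | LC_ALL=C sed -e 's/[^a-z0-9]\{1,\}/-/g' -e 's/^-//' -e 's/-$//'
}

# slug_for: kebab of the input basename (extension dropped), at most 5 words.
slug_for() {
  local base stem
  base=${1##*/}      # strip directories (or the URL up to its last segment)
  stem=${base%.*}    # strip the last extension, if any
  kebab "$stem" | cut -d- -f1-5
}

# project_name: kebab of the git toplevel basename, else of the cwd.
project_name() {
  local top
  top=$(git rev-parse --show-toplevel 2>/dev/null) || top=$PWD
  kebab "${top##*/}"
}

# save_path: fully expanded target path for an input (uses $HOME, no tilde).
save_path() {
  local base stem
  base=${1##*/}
  stem=${base%.*}
  printf '%s/.claude/output/%s/markitdown/%s/%s.md\n' \
    "$HOME" "$(project_name)" "$(slug_for "$1")" "$stem"
}
```

With these definitions, `slug_for "$HOME/Downloads/report.pdf"` prints `report`, and `slug_for "Q3 Financial Report - Final Draft v2.pdf"` prints `q3-financial-report-final-draft`, showing the five-word cap.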

Workflow

  1. Empty `$ARGUMENTS` → propose the most recent non-Markdown target from session context (file or URL) and confirm. Ask only when none is detectable.
  2. Run the helper:
    bash
    bash ${CLAUDE_SKILL_DIR}/scripts/markitdown.sh $ARGUMENTS
  3. The script emits `RESULT: key=value` lines with the keys `bytes`, `slug`, `saved`, plus `path` when saving (order is not guaranteed; parse by key). These are followed either by the converted Markdown (no-save mode, after a `---` separator) or by nothing (save mode, since the file is already on disk).
  4. Parse the `RESULT:` lines and produce the report below.
  5. If the script exits with `ERR: markitdown not installed` (exit 127) → print the install command from `## Install` and stop. Never auto-install on the user's behalf.
  6. If the script exits with another `ERR:` (file not found, missing endpoint, unknown flag) → relay the message verbatim and stop.
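
Steps 3 and 4 amount to reading a small key=value header by key. The sketch below shows one way to do that, assuming the separator is a line containing exactly `---` and that header lines start with the literal `RESULT: ` prefix. The function names `result_value` and `markdown_body` are hypothetical, and the sample values in the usage note are made up.

```shell
# Hypothetical parsers for the script's output (illustrative only).

# result_value KEY: print the value of "RESULT: KEY=..." read from stdin.
# Matching is by key, so the order of the RESULT lines does not matter.
# Scanning stops at the "---" separator so Markdown content is never
# mistaken for a header line.
result_value() {
  awk -v key="$1" '
    BEGIN { prefix = "RESULT: " key "=" }
    $0 == "---" { exit }
    index($0, prefix) == 1 { print substr($0, length(prefix) + 1); exit }
  '
}

# markdown_body: print everything after the first "---" separator.
markdown_body() {
  awk 'seen { print } $0 == "---" { seen = 1 }'
}
```

Given captured output in `$out`, `printf '%s\n' "$out" | result_value bytes` yields the byte count regardless of where that line appears in the header, and `printf '%s\n' "$out" | markdown_body` yields the converted Markdown in no-save mode.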

Output

markitdown: <input> → <bytes> bytes of Markdown
saved: <path>      # only when -s
When saving, just report. When not saving, also stream the converted Markdown back to the user; if it exceeds ~80 lines, show the first 80 and tell the user to re-run with `-s` to capture the full output.
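
The reporting rule can be sketched as two small functions. This is an illustration only: the names `report` and `preview` and the wording of the truncation notice are assumptions. The two-line report format and the 80-line threshold come from the text above.

```shell
# Hypothetical reporting helpers (illustrative only).

# report INPUT BYTES [PATH]: one summary line, plus "saved:" when PATH is given.
report() {
  printf 'markitdown: %s → %s bytes of Markdown\n' "$1" "$2"
  if [ -n "${3:-}" ]; then
    printf 'saved: %s\n' "$3"
  fi
}

# preview [LIMIT]: copy at most LIMIT lines (default 80) from stdin and,
# if the input was longer, append a notice pointing the user at -s.
preview() {
  local limit
  limit=${1:-80}
  awk -v limit="$limit" '
    NR <= limit { print }
    END {
      if (NR > limit)
        printf "[showing first %d of %d lines; re-run with -s to capture the full output]\n", limit, NR
    }
  '
}
```

In no-save mode the converted Markdown would be piped through `preview` after the `report` line is printed; in save mode only `report` runs, with the absolute path as its third argument.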

Examples

bash
/markitdown ~/Downloads/report.pdf            # convert, print to terminal
/markitdown -s ~/Downloads/report.pdf         # convert + save under ~/.claude/output/<project>/markitdown/report/
/markitdown -s -p deck.pptx                   # use third-party plugins (e.g. markitdown-ocr)
/markitdown -d invoice.pdf                    # Azure Document Intelligence
/markitdown -k brand.html                     # keep base64 images inline
/markitdown https://youtu.be/dQw4w9WgXcQ      # YouTube transcript
/markitdown -l                                # list installed plugins, then exit

Notes

  • YouTube URLs are detected by the `https?://` prefix and passed straight to `markitdown`. The slug is derived from the URL's last path segment, so saved paths look like `~/.claude/output/<project>/markitdown/dqw4w9wgxcq/dQw4w9WgXcQ.md`.
  • Audio transcription uses local Whisper via the `[audio-transcription]` extra. It's CPU-bound, so warn the user before kicking off a long podcast.
  • Image OCR without the `markitdown-ocr` plugin only reads embedded EXIF text. For pixel-level OCR, `pip install markitdown-ocr` and pass `-p`.
  • No silent overwrites: `markitdown` itself overwrites with `-o`, but the slug-namespaced save path makes collisions predictable, not surprising.
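
The URL detection in the first note is a plain prefix test. A minimal sketch follows, assuming inputs are either `http(s)` URLs or local files. The names `is_url` and `classify_input` and the error wording are hypothetical and are not taken from `scripts/markitdown.sh`.

```shell
# Hypothetical input classification (illustrative only).

# is_url: succeed when the argument starts with http:// or https://.
is_url() {
  case $1 in
    http://*|https://*) return 0 ;;
    *) return 1 ;;
  esac
}

# classify_input: print "url" or "file"; fail for anything else.
# URLs are passed through untouched; local paths must exist.
classify_input() {
  if is_url "$1"; then
    echo url
  elif [ -f "$1" ]; then
    echo file
  else
    # Error wording is an assumption for this sketch.
    echo "ERR: file not found: $1" >&2
    return 1
  fi
}
```

Because the test is prefix-only, any `http(s)` URL takes the same pass-through route as a YouTube link, which matches the note's description.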

Why the wrapper

`markitdown` is already a great CLI; this skill exists to (a) follow the repo's `-s/-S/-f` convention so other skills can chain on the output, (b) translate "extract this pdf" into the right invocation without forcing the user to remember `-x`, `-m`, `-d`, `-e`, and (c) emit a uniform one-line report so terminals don't render multi-MB Markdown by accident.