mistral-ocr
Mistral OCR
Extract text from images and PDFs using Mistral's dedicated OCR API. No external dependencies required.
Requirements
This skill requires a Mistral API key. If you don't have one, follow the guide in reference/getting-started.md.
API Key
The user must provide their Mistral API key. Ask for it if not available.
Option 1 (Recommended for AI agents): User provides key directly in message:
"Use this Mistral key: aBc123XyZ..."
"Convert this PDF to markdown, my API key is aBc123XyZ..."
Option 2: Environment variable `$MISTRAL_API_KEY`
Option 3: Claude Code settings (`~/.claude/settings.json`)
If no key is available, guide the user to get one at console.mistral.ai.
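The lookup order above can be sketched as a small shell helper. This is only an illustration: the function name `resolve_mistral_key` is hypothetical, it covers Option 1 (a key passed explicitly) and Option 2 (the environment variable), and it does not attempt to read `~/.claude/settings.json`.

```shell
# resolve_mistral_key [KEY]: print the key to use, or fail with a hint.
# Hypothetical helper: an explicit argument wins over $MISTRAL_API_KEY.
resolve_mistral_key() {
  local key="${1:-${MISTRAL_API_KEY:-}}"
  if [ -z "$key" ]; then
    echo "No Mistral API key found; get one at console.mistral.ai" >&2
    return 1
  fi
  printf '%s\n' "$key"
}
```

A caller could then write `KEY=$(resolve_mistral_key "$USER_SUPPLIED_KEY") || exit 1`, where `USER_SUPPLIED_KEY` stands for whatever the user pasted into the message.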
API Endpoint
Use the dedicated OCR endpoint for all document processing:
POST https://api.mistral.ai/v1/ocr
Model: mistral-ocr-latest
Features
1. PDF → Markdown (Direct, no conversion needed!)
bash
curl -s "https://api.mistral.ai/v1/ocr" \
  -H "Authorization: Bearer $MISTRAL_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "mistral-ocr-latest",
    "document": {
      "type": "document_url",
      "document_url": "https://example.com/document.pdf"
    }
  }'
2. Image → Text
Works with JPG, PNG, WEBP, GIF:
bash
curl -s "https://api.mistral.ai/v1/ocr" \
  -H "Authorization: Bearer $MISTRAL_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "mistral-ocr-latest",
    "document": {
      "type": "image_url",
      "image_url": "https://example.com/image.jpg"
    }
  }'
3. Local Files (Base64 Data URL)
For local PDFs or images, encode as base64 and use a data URL.
ALWAYS use curl (works on all platforms including Windows via Git Bash):
bash
# For local PDF
# (-w0 disables line wrapping in GNU base64, so the output is a single line)
BASE64=$(base64 -w0 document.pdf)
curl -s "https://api.mistral.ai/v1/ocr" \
  -H "Authorization: Bearer $MISTRAL_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{ "model": "mistral-ocr-latest", "document": { "type": "document_url", "document_url": "data:application/pdf;base64,'"$BASE64"'" } }'

# For local images (PNG, JPG, etc.)
BASE64=$(base64 -w0 image.png)
curl -s "https://api.mistral.ai/v1/ocr" \
  -H "Authorization: Bearer $MISTRAL_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{ "model": "mistral-ocr-latest", "document": { "type": "image_url", "image_url": "data:image/png;base64,'"$BASE64"'" } }'

**MIME types:**
- PDF: `data:application/pdf;base64,...`
- PNG: `data:image/png;base64,...`
- JPG: `data:image/jpeg;base64,...`
- WEBP: `data:image/webp;base64,...`
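Choosing the document type and MIME type by hand is easy to get wrong, so the mapping above can be sketched as a lookup on the file extension. The function name `ocr_document_fields` is hypothetical, and GIF is left out because the list above does not give a data URL prefix for it.

```shell
# ocr_document_fields FILE: print "<document type> <MIME type>" for a local file.
# Hypothetical helper; the decision is based on the extension only (case-insensitive).
ocr_document_fields() {
  case "$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')" in
    *.pdf)        echo "document_url application/pdf" ;;
    *.png)        echo "image_url image/png" ;;
    *.jpg|*.jpeg) echo "image_url image/jpeg" ;;
    *.webp)       echo "image_url image/webp" ;;
    *)            echo "unsupported file type: $1" >&2; return 1 ;;
  esac
}

ocr_document_fields "scan.JPG"   # prints: image_url image/jpeg
```

The first word is the value for `"type"` (and the name of the URL field), the second goes after `data:` in the data URL.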
4. Structured JSON Output
For invoices, forms, tables - ask for JSON in a follow-up or use Document AI annotations.
Response Format
The API returns JSON; each page's extracted text is in its markdown field:
json
{
  "pages": [
    {
      "index": 0,
      "markdown": "# Document Title\n\nExtracted content here...",
      "images": [],
      "tables": [],
      "dimensions": {"dpi": 200, "height": 842, "width": 595}
    }
  ],
  "model": "mistral-ocr-latest",
  "usage_info": {"pages_processed": 1, "doc_size_bytes": 12345}
}
Workflow
User requests OCR from image or PDF
1. Get API key - Ask user if not in environment
2. Determine input type (URL or local file)
3. For local files, ALWAYS use temp file approach (avoids "Argument list too long" error):
bash
# Cross-platform temp directory
TMPDIR="${TMPDIR:-${TEMP:-/tmp}}"

# Step 1: Encode file to base64
base64 -w0 "document.pdf" > "$TMPDIR/b64.txt"

# Step 2: Create JSON request file
echo '{"model":"mistral-ocr-latest","document":{"type":"document_url","document_url":"data:application/pdf;base64,'"$(cat "$TMPDIR/b64.txt")"'"}}' > "$TMPDIR/request.json"

# Step 3: Call API with -d @file (use actual key, not variable)
curl -s "https://api.mistral.ai/v1/ocr" \
  -H "Authorization: Bearer YOUR_API_KEY_HERE" \
  -H "Content-Type: application/json" \
  -d @"$TMPDIR/request.json" > "$TMPDIR/response.json"

# Step 4: Extract markdown with node (NOT jq - not available on all systems)
node -e "const fs=require('fs'); const r=JSON.parse(fs.readFileSync('$TMPDIR/response.json')); console.log(r.pages.map(p=>p.markdown).join('\n\n---\n\n'))"

4. **Save to .md file** using Write tool
5. Confirm file location to user
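Steps 1 and 2 of the temp file approach can also be folded into one function that streams the base64 payload straight into the request file, so it never passes through a command argument or a shell variable. This is a sketch under assumptions: `build_ocr_request` is a hypothetical name, and it swaps `base64 -w0` for `base64 < file | tr -d '\r\n'`, since `-w` is a GNU option and may be missing from other `base64` implementations.

```shell
# build_ocr_request FILE TYPE MIME OUT
# Hypothetical helper: writes the JSON request body for FILE to OUT.
#   TYPE is "document_url" or "image_url"; MIME is e.g. "application/pdf".
build_ocr_request() {
  local file="$1" doc_type="$2" mime="$3" out="$4"
  {
    # JSON up to the start of the base64 payload
    printf '{"model":"mistral-ocr-latest","document":{"type":"%s","%s":"data:%s;base64,' \
      "$doc_type" "$doc_type" "$mime"
    # Payload, with any line wrapping removed
    base64 < "$file" | tr -d '\r\n'
    # Close the string and both objects
    printf '"}}\n'
  } > "$out"
}
```

For a PDF this would be called as `build_ocr_request document.pdf document_url application/pdf "$TMPDIR/request.json"`, after which Step 3 sends the file unchanged with `-d @"$TMPDIR/request.json"`.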
IMPORTANT: Cross-Platform Compatibility
- ALWAYS use curl (works on Windows via Git Bash)
- ALWAYS use `-d @file` for request body (handles large files)
- NEVER use jq - use node instead to parse JSON
- Use `${TMPDIR:-${TEMP:-/tmp}}` for temp files (works on all systems)
- Copy response.json to user directory before parsing with node on Windows
Usage Examples
When the user says:
| User Request | Action |
|---|---|
| "Convert this PDF to markdown" | OCR the PDF, save as .md file |
| "Extract text from this image" | OCR the image, return text |
| "Give me a .md of this document" | OCR and save as .md file |
| "What does this PDF say?" | OCR and summarize content |
| "OCR this receipt" | Extract text, optionally structure as JSON |
Error Handling
| Error | Cause | Solution |
|---|---|---|
| 401 Unauthorized | Invalid API key | Verify key, guide to getting-started.md |
| 400 Bad Request | Invalid document | Check format and URL accessibility |
| 3310 File fetch error | URL not accessible | Use base64 for local files |
| Rate limit | Too many requests | Wait and retry |
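The "wait and retry" advice for rate limits can be sketched as a retry loop with exponential backoff. Everything here is an assumption layered on the table above: the names `call_with_retry`, `ocr_status`, and `RETRY_DELAY` are hypothetical, the rate limit is assumed to surface as HTTP 429, and four attempts is an arbitrary cap.

```shell
# call_with_retry CMD...: run CMD, which must print an HTTP status code.
# Retries on 429 with a doubling delay; succeeds only on 200.
call_with_retry() {
  local max_attempts=4 attempt=1 status
  local delay="${RETRY_DELAY:-1}"   # initial wait in seconds
  while :; do
    status=$("$@")
    if [ "$status" != "429" ] || [ "$attempt" -ge "$max_attempts" ]; then
      printf '%s\n' "$status"
      if [ "$status" = "200" ]; then return 0; else return 1; fi
    fi
    sleep "$delay"
    delay=$((delay * 2))
    attempt=$((attempt + 1))
  done
}

# Example command: send the prepared request, save the body, print only the status.
ocr_status() {
  curl -s -o "$TMPDIR/response.json" -w '%{http_code}' "https://api.mistral.ai/v1/ocr" \
    -H "Authorization: Bearer $MISTRAL_API_KEY" \
    -H "Content-Type: application/json" \
    -d @"$TMPDIR/request.json"
}
```

Running `call_with_retry ocr_status` prints the final status code; a 401 or 400 is returned immediately, since retrying will not fix an invalid key or document.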
Supported Formats
| Format | Support |
|---|---|
| PDF | ✅ Direct (no conversion) |
| PNG | ✅ Direct |
| JPG/JPEG | ✅ Direct |
| WEBP | ✅ Direct |
| GIF | ✅ Direct |
No external dependencies required! Unlike other OCR solutions, Mistral OCR handles PDFs directly without needing pdftoppm, ImageMagick, or any other tools.
Pricing
As of 2025, Mistral OCR pricing:
- $2 per 1,000 pages
- 50% discount with Batch API
Check current rates at mistral.ai/pricing
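For a rough budget check, the rates quoted above can be turned into a quick estimate. The function name `estimate_ocr_cost` is hypothetical and the figures are the 2025 ones listed here, so treat the result as an approximation and confirm against the pricing page.

```shell
# estimate_ocr_cost PAGES [standard|batch]: print the estimated cost in USD.
# Uses $2 per 1,000 pages, halved for the Batch API (rates as quoted above).
estimate_ocr_cost() {
  awk -v pages="$1" -v mode="${2:-standard}" 'BEGIN {
    cost = pages * 2 / 1000
    if (mode == "batch") cost = cost / 2
    printf "%.2f\n", cost
  }'
}

estimate_ocr_cost 2500         # prints: 5.00
estimate_ocr_cost 2500 batch   # prints: 2.50
```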
References
- Getting Started - How to get your API key
- PDF to Markdown - PDF conversion examples
- Output Formats - JSON, Markdown, plain text
- Step-by-Step Guide - Complete tutorial with examples
Skill by Parlamento AI