firecrawl-parse
firecrawl parse
Turn a local document into clean markdown on disk. Supports PDF, DOCX, DOC, ODT, RTF, XLSX, XLS, HTML/HTM/XHTML.
When to use
- You have a file on disk (not a URL) and want its text as markdown
- User drops a PDF/DOCX and asks what it says, or to summarize it
- Use `scrape` instead when the source is a URL
Quick start
Always save to `.firecrawl/` with `-o`: parsed docs can be hundreds of KB and will blow up context if streamed to stdout. Add `.firecrawl/` to `.gitignore`.

```bash
mkdir -p .firecrawl

# File → markdown
firecrawl parse ./paper.pdf -o .firecrawl/paper.md

# AI summary
firecrawl parse ./paper.pdf -S -o .firecrawl/paper-summary.md

# Ask a question about the doc
firecrawl parse ./paper.pdf -Q "What are the main conclusions?" -o .firecrawl/paper-qa.md
```

Then use `head`, `grep`, `rg`, etc., or read the file incrementally. Don't load the whole thing at once.
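One way to follow the "read incrementally" advice is a small shell helper that prints the file size, a heading outline, and only the first few lines. This is a minimal sketch: `peek_md` is a hypothetical name, not part of the firecrawl CLI, and it assumes the parsed output uses `#`-style markdown headings.

```bash
# Hypothetical helper (not part of firecrawl): preview a parsed markdown file
# without dumping the whole thing.
peek_md() {
  local file="$1" lines="${2:-40}"
  [ -f "$file" ] || { echo "no such file: $file" >&2; return 1; }

  # Size first, so you know how much you would be loading.
  echo "size: $(wc -c < "$file" | tr -d ' ') bytes"

  # Heading outline with line numbers (assumes '#'-style headings).
  echo "--- headings ---"
  grep -n '^#' "$file" || true

  # Only the first N lines of the body.
  echo "--- first $lines lines ---"
  head -n "$lines" "$file"
}
```

For example, `peek_md .firecrawl/paper.md 20` shows the outline and the first 20 lines. The line numbers from the outline can then be passed to something like `sed -n '120,180p' .firecrawl/paper.md` to read a single section.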
Options
| Option | Description |
|---|---|
| `-S` | AI-generated summary |
| `-Q` | Ask a question about the parsed content |
| `-o` | Output file path (always use this) |
| | Timeout for the parse job |
| | Show request duration |
Tips
- Quote paths with spaces: `firecrawl parse "./My Doc.pdf" -o .firecrawl/mydoc.md`.
- Max upload size: 50 MB per file.
- Credits: ~1 per PDF page; HTML is a flat 1 credit.
- Check `.firecrawl/` before re-parsing the same file.
- To check your credit balance (recommended for batch processing and similar workflows), use the `firecrawl credit-usage` command.
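The "check `.firecrawl/` before re-parsing" tip can be automated with a small wrapper that skips the parse when an up-to-date output already exists, so credits are not spent twice. This is a sketch under assumptions: `parse_cached` is a hypothetical name, and naming the output after the source file's base name is an illustrative choice, not a firecrawl convention. The only firecrawl call it uses is the `firecrawl parse <file> -o <path>` form shown in the quick start.

```bash
# Hypothetical wrapper (not part of firecrawl): parse only when needed.
parse_cached() {
  local src="$1"
  local name out
  name="$(basename "$src")"
  out=".firecrawl/${name%.*}.md"   # assumed naming: paper.pdf -> .firecrawl/paper.md
  mkdir -p .firecrawl

  # Reuse the output if it is non-empty and newer than the source file.
  if [ -s "$out" ] && [ "$out" -nt "$src" ]; then
    echo "cached: $out"
    return 0
  fi

  # Quoting "$src" and "$out" keeps paths with spaces intact.
  firecrawl parse "$src" -o "$out" && echo "parsed: $out"
}
```

Calling `parse_cached "./My Doc.pdf"` twice should hit the service only once. Note that two sources sharing a base name (say `paper.pdf` and `paper.docx`) would map to the same output under this naming, so a real workflow may want to keep the extension in the output name.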
See also
- firecrawl-scrape — same idea for URLs