firecrawl-build-scrape

Firecrawl Build Scrape

Use this when the application already has the URL and needs content from one page.

Use This When

- the feature starts from a known URL
- you need page content for retrieval, summarization, enrichment, or monitoring
- you want the default extraction primitive before considering `/interact`

Default Recommendations

- Start with `/scrape`, not `/crawl`.
- Return `markdown` unless the feature truly needs another format.
- Use `onlyMainContent` for article-like pages where nav and chrome add noise.
- Add waits or other rendering options only when the page needs them.
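
As a minimal sketch of these defaults, the helper below builds a JSON body for a `/scrape` request. The function name `build_scrape_payload` is hypothetical, and the field names `url`, `formats`, and `waitFor` are assumptions about the request schema (only `onlyMainContent` and `markdown` are named in this document). Confirm them against the docs before relying on them.

```python
from typing import Any, Optional


def build_scrape_payload(
    url: str,
    article_like: bool = False,
    wait_ms: Optional[int] = None,
) -> dict[str, Any]:
    """Build a /scrape request body that follows the defaults listed above.

    Field names are assumed for illustration, not confirmed by this document.
    """
    # Default to markdown; other formats are opt-in elsewhere.
    payload: dict[str, Any] = {"url": url, "formats": ["markdown"]}

    # Strip nav and page chrome only for article-like pages.
    if article_like:
        payload["onlyMainContent"] = True

    # Add a wait only when the caller says the page needs it.
    if wait_ms is not None:
        payload["waitFor"] = wait_ms

    return payload
```

Leaving `wait_ms` unset keeps the request as small as possible, which matches the advice to add rendering options only when a page needs them.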

Common Product Patterns

- knowledge ingestion from known URLs
- enrichment from a company, product, or docs page
- pricing, changelog, and documentation extraction
- page-level quality checks or monitoring
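
The monitoring pattern can be sketched as fingerprinting the scraped markdown and comparing it with the previous run. This is one possible approach, not something the source prescribes; `fingerprint` and `has_changed` are hypothetical helper names, and the markdown is assumed to come from an earlier `/scrape` call.

```python
import hashlib


def fingerprint(markdown: str) -> str:
    """Return a SHA-256 hex digest of the page's markdown."""
    # Collapse whitespace so reflowed text does not count as a change.
    normalized = " ".join(markdown.split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def has_changed(previous_fingerprint: str, markdown: str) -> bool:
    """Report whether the page content differs from the stored fingerprint."""
    return fingerprint(markdown) != previous_fingerprint
```

Storing only the digest keeps the monitoring state small; the full markdown needs to be kept only if the feature must show what changed.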

Escalation Rules

- If you do not have the URL yet, start with firecrawl-build-search.
- If content requires clicks, typing, or multi-step navigation, escalate to firecrawl-build-interact.
- If you need many pages from the same site, consider firecrawl-build-crawl or firecrawl-build-map.
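
The rules above can be read as a small decision function. The sketch below is illustrative only: `choose_skills` is a hypothetical name, and the order in which the conditions are checked is an assumption, since the source does not say which rule wins when several apply.

```python
def choose_skills(
    has_url: bool,
    needs_interaction: bool = False,
    many_pages: bool = False,
) -> list[str]:
    """Return the skill(s) to consider, following the escalation rules above."""
    # No URL yet: discovery has to come first.
    if not has_url:
        return ["firecrawl-build-search"]

    # Clicks, typing, or multi-step navigation are beyond a plain scrape.
    if needs_interaction:
        return ["firecrawl-build-interact"]

    # Many pages from one site: the source names two candidates.
    if many_pages:
        return ["firecrawl-build-crawl", "firecrawl-build-map"]

    # Known URL, single page, no interaction: stay with scrape.
    return ["firecrawl-build-scrape"]
```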

Implementation Notes

- Keep the integration narrow: one feature, one URL, one extraction contract.
- Treat `/scrape` as the default primitive for downstream LLM or indexing pipelines.
- Request richer formats only when the consumer needs them, such as links, screenshots, or branding data.
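
One way to make "one feature, one URL, one extraction contract" concrete is a small immutable type that each feature instantiates once. The sketch below is an assumption about how that might look: `ScrapeContract` is a hypothetical type, and format names such as `links` are placeholders to check against the docs.

```python
from dataclasses import dataclass
from urllib.parse import urlparse


@dataclass(frozen=True)
class ScrapeContract:
    """One feature, one URL, one agreed output shape (hypothetical type)."""

    feature: str
    url: str
    extra_formats: tuple[str, ...] = ()  # e.g. ("links",); names are assumed

    def __post_init__(self) -> None:
        # Reject anything that is not a single absolute http(s) URL.
        parsed = urlparse(self.url)
        if parsed.scheme not in ("http", "https") or not parsed.netloc:
            raise ValueError(f"expected one absolute http(s) URL, got {self.url!r}")

    @property
    def formats(self) -> list[str]:
        """Markdown first, then any extras the consumer asked for, de-duplicated."""
        out = ["markdown"]
        for fmt in self.extra_formats:
            if fmt not in out:
                out.append(fmt)
        return out
```

For example, `ScrapeContract("pricing-watch", "https://example.com/pricing").formats` yields only `["markdown"]`; a feature has to opt in explicitly to anything richer.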

Docs (Source of Truth)

Read the docs for request/response schemas, parameters, and SDK examples before writing integration code:

docs.firecrawl.dev/features/scrape
