pp-firecrawl
Firecrawl — Printing Press CLI
Prerequisites: Install the CLI
This skill drives the `firecrawl-pp-cli` binary. You must verify the CLI is installed before invoking any command from this skill. If it is missing, install it first:

- Install via the Printing Press installer:
  ```bash
  npx -y @mvanhorn/printing-press install firecrawl --cli-only
  ```
- Verify: `firecrawl-pp-cli --version`
- Ensure `$GOPATH/bin` (or `$HOME/go/bin`) is on `$PATH`.

If the `npx` install fails (no Node, offline, etc.), fall back to a direct Go install (requires Go 1.23+):
```bash
go install github.com/mvanhorn/printing-press-library/library/developer-tools/firecrawl/cmd/firecrawl-pp-cli@latest
```
If `--version` reports "command not found" after install, the install step did not put the binary on `$PATH`. Do not proceed with skill commands until verification succeeds.
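The verification steps above can be sketched as a small shell check. This is only an illustration: the helper names `cli_installed`, `go_bin_dir`, and `path_contains` are made up for this example and are not part of the CLI, and the sketch assumes `GOBIN` is unset and `GOPATH` holds a single directory.

```bash
#!/usr/bin/env bash
# Sketch of the install check; helper names are illustrative, not part of the CLI.

# Succeeds when the named binary resolves on PATH.
cli_installed() {
  command -v "$1" >/dev/null 2>&1
}

# Prints the directory where `go install` places binaries,
# assuming GOBIN is unset and GOPATH holds a single directory.
go_bin_dir() {
  if [ -n "${GOPATH:-}" ]; then
    printf '%s/bin\n' "$GOPATH"
  else
    printf '%s/go/bin\n' "$HOME"
  fi
}

# Succeeds when directory $1 is an entry in the colon-separated list $2 (default: $PATH).
path_contains() {
  case ":${2:-$PATH}:" in
    *":$1:"*) return 0 ;;
    *) return 1 ;;
  esac
}

if cli_installed firecrawl-pp-cli; then
  firecrawl-pp-cli --version
elif ! path_contains "$(go_bin_dir)"; then
  echo "$(go_bin_dir) is not on PATH" >&2
else
  echo "firecrawl-pp-cli is not installed" >&2
fi
```

Separating the "binary missing" case from the "directory not on PATH" case makes it easier to tell which of the two fixes from the steps above applies.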
Command Reference
batch: Manage batch
- `firecrawl-pp-cli batch cancel-scrape`: Cancel a batch scrape job
- `firecrawl-pp-cli batch get-scrape-errors`: Get the errors of a batch scrape job
- `firecrawl-pp-cli batch get-scrape-status`: Get the status of a batch scrape job
- `firecrawl-pp-cli batch scrape-and-extract-from-urls`: Scrape multiple URLs and optionally extract information using an LLM

crawl: Manage crawl
- `firecrawl-pp-cli crawl cancel`: Cancel a crawl job
- `firecrawl-pp-cli crawl get-active`: Get all active crawls for the authenticated team
- `firecrawl-pp-cli crawl get-status`: Get the status of a crawl job
- `firecrawl-pp-cli crawl urls`: Crawl multiple URLs based on options

deep-research: Manage deep research
- `firecrawl-pp-cli deep-research get-status`: Get the status and results of a deep research operation
- `firecrawl-pp-cli deep-research start`: Start a deep research operation on a query

extract: Manage extract
- `firecrawl-pp-cli extract data`: Extract structured data from pages using LLMs
- `firecrawl-pp-cli extract get-status`: Get the status of an extract job

firecrawl-search: Manage firecrawl search
- `firecrawl-pp-cli firecrawl-search`: Search and optionally scrape search results

llmstxt: Manage llmstxt
- `firecrawl-pp-cli llmstxt generate-llms-txt`: Generate LLMs.txt for a website
- `firecrawl-pp-cli llmstxt get-llms-txt-status`: Get the status and results of an LLMs.txt generation job

map: Manage map
- `firecrawl-pp-cli map`: Map multiple URLs based on options

scrape: Manage scrape
- `firecrawl-pp-cli scrape`: Scrape a single URL and optionally extract information using an LLM

team: Manage team
- `firecrawl-pp-cli team get-credit-usage`: Get remaining credits for the authenticated team
- `firecrawl-pp-cli team get-token-usage`: Get remaining tokens for the authenticated team (Extract only)
Finding the right command
When you know what you want to do but not which command does it, ask the CLI directly:
```bash
firecrawl-pp-cli which "<capability in your own words>"
```
Auth Setup
Store your access token:
```bash
firecrawl-pp-cli auth set-token YOUR_TOKEN_HERE
```
Or set `FIRECRAWL_BEARER_AUTH` as an environment variable.

Run `firecrawl-pp-cli doctor` to verify setup.
Agent Mode
Add `--agent` to any command. Expands to: `--json --compact --no-input --no-color --yes`.

- Pipeable: JSON on stdout, errors on stderr
- Filterable: `--select` keeps a subset of fields. Dotted paths descend into nested structures; arrays traverse element-wise. Critical for keeping context small on verbose APIs:
  ```bash
  firecrawl-pp-cli batch cancel-scrape mock-value --agent --select id,name,status
  ```
- Previewable: `--dry-run` shows the request without sending
- Offline-friendly: sync/search commands can use the local SQLite store when available
- Non-interactive: never prompts, every input is a flag
Response envelope
Commands that read from the local store or the API wrap output in a provenance envelope:
```json
{
  "meta": {"source": "live" | "local", "synced_at": "...", "reason": "..."},
  "results": <data>
}
```
Parse `.results` for data and `.meta.source` to know whether it's live or local. A human-readable `N results (live)` summary is printed to stderr only when stdout is a terminal; piped/agent consumers get pure JSON on stdout.
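One way to pull the two parts of the envelope apart in a pipeline is sketched below. It assumes `jq` is installed, which this skill does not require, and the function names `envelope_source` and `envelope_results` are illustrative only.

```bash
# Assumes jq is installed; function names are illustrative, not part of the CLI.

# Reads an envelope on stdin and prints its provenance: "live" or "local".
envelope_source() {
  jq -r '.meta.source'
}

# Reads an envelope on stdin and prints the payload as compact JSON.
envelope_results() {
  jq -c '.results'
}
```

An agent could pipe the stdout of any envelope-producing command into `envelope_source` first and decide whether a `local` answer is fresh enough before reading `.results`.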
Agent Feedback
When you (or the agent) notice something off about this CLI, record it:
```
firecrawl-pp-cli feedback "the --since flag is inclusive but docs say exclusive"
firecrawl-pp-cli feedback --stdin < notes.txt
firecrawl-pp-cli feedback list --json --limit 10
```
Entries are stored locally at `~/.firecrawl-pp-cli/feedback.jsonl`. They are never POSTed unless `FIRECRAWL_FEEDBACK_ENDPOINT` is set AND either `--send` is passed or `FIRECRAWL_FEEDBACK_AUTO_SEND=true`. Default behavior is local-only.

Write what surprised you, not a bug report. Short, specific, one line: that is the part that compounds.
Output Delivery
Every command accepts `--deliver <sink>`. The output goes to the named sink in addition to (or instead of) stdout, so agents can route command results without hand-piping. Three sinks are supported:

| Sink | Effect |
|---|---|
| Default; write to stdout only |
| Atomically write output to |
| POST the output body to the URL ( |
Unknown schemes are refused with a structured error naming the supported set. Webhook failures return non-zero and log the URL + HTTP status on stderr.
Named Profiles
A profile is a saved set of flag values, reused across invocations. Use it when a scheduled agent calls the same command every run with the same configuration - HeyGen's "Beacon" pattern.
```
firecrawl-pp-cli profile save briefing --json
firecrawl-pp-cli --profile briefing batch cancel-scrape mock-value
firecrawl-pp-cli profile list --json
firecrawl-pp-cli profile show briefing
firecrawl-pp-cli profile delete briefing --yes
```
Explicit flags always win over profile values; profile values win over defaults. `agent-context` lists all available profiles under `available_profiles` so introspecting agents discover them at runtime.
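The precedence rule (explicit flag, then profile value, then default) can be illustrated with a tiny resolver. This is a sketch of the rule only, not the CLI's implementation. `resolve_setting` is a hypothetical helper, and it treats an empty string as "not set", which may differ from how the CLI handles explicitly empty flags.

```bash
# Illustrative only: prints the first non-empty value among
# explicit flag ($1), profile value ($2), and default ($3).
resolve_setting() {
  local flag_value="$1" profile_value="$2" default_value="$3"
  if [ -n "$flag_value" ]; then
    printf '%s\n' "$flag_value"
  elif [ -n "$profile_value" ]; then
    printf '%s\n' "$profile_value"
  else
    printf '%s\n' "$default_value"
  fi
}
```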
Exit Codes
| Code | Meaning |
|---|---|
| 0 | Success |
| 2 | Usage error (wrong arguments) |
| 3 | Resource not found |
| 4 | Authentication required |
| 5 | API error (upstream issue) |
| 7 | Rate limited (wait and retry) |
| 10 | Config error |
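A caller can branch on these codes. The sketch below maps each code from the table to a short label and retries a command only while it exits with 7. The helper names, the attempt limit, and the fixed delay are assumptions made for illustration; the CLI does not document a retry policy.

```bash
# Illustrative helpers; names and retry policy are assumptions, not part of the CLI.

# Prints a short label for an exit code from the table above.
exit_meaning() {
  case "$1" in
    0) echo "success" ;;
    2) echo "usage error" ;;
    3) echo "not found" ;;
    4) echo "auth required" ;;
    5) echo "api error" ;;
    7) echo "rate limited" ;;
    10) echo "config error" ;;
    *) echo "unknown" ;;
  esac
}

# Usage: retry_on_rate_limit MAX_ATTEMPTS DELAY_SECONDS command [args...]
# Re-runs the command only while it exits with 7; returns its final exit code.
retry_on_rate_limit() {
  local max="$1" delay="$2" attempt=1 status
  shift 2
  while :; do
    if "$@"; then
      status=0
    else
      status=$?
    fi
    if [ "$status" -ne 7 ] || [ "$attempt" -ge "$max" ]; then
      return "$status"
    fi
    sleep "$delay"
    attempt=$((attempt + 1))
  done
}
```

For example, `retry_on_rate_limit 3 5 firecrawl-pp-cli team get-credit-usage --agent` would make at most three attempts, five seconds apart, and pass through any exit code other than 7 immediately.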
Argument Parsing
Parse `$ARGUMENTS`:
- Empty, `help`, or `--help` → show `firecrawl-pp-cli --help` output
- Starts with `install` → ends with `mcp` → MCP installation; otherwise → see Prerequisites above
- Anything else → Direct Use (execute as CLI command with `--agent`)
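The routing rules above can be sketched as a single `case` statement. `route_arguments` and the labels it prints are hypothetical names chosen for this example; the skill itself only describes the rules in prose.

```bash
# Illustrative router for the rules above; the printed labels are made up.
# Takes the whole argument string as $1.
route_arguments() {
  local args="$1"
  case "$args" in
    ""|help|--help)
      echo "help" ;;
    install*)
      # "install ... mcp" selects the MCP server; any other install goes to Prerequisites.
      case "$args" in
        *mcp) echo "mcp-install" ;;
        *) echo "prerequisites" ;;
      esac ;;
    *)
      echo "direct-use" ;;
  esac
}
```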
MCP Server Installation
- Install the MCP server:
  ```bash
  go install github.com/mvanhorn/printing-press-library/library/other/firecrawl-pp-cli/cmd/firecrawl-pp-mcp@latest
  ```
- Register with Claude Code:
  ```bash
  claude mcp add firecrawl-pp-mcp -- firecrawl-pp-mcp
  ```
- Verify: `claude mcp list`
Direct Use
- Check if installed: `which firecrawl-pp-cli`. If not found, offer to install (see Prerequisites at the top of this skill).
- Match the user query to the best command from the Unique Capabilities and Command Reference above.
- Execute with the `--agent` flag:
  ```bash
  firecrawl-pp-cli <command> [subcommand] [args] --agent
  ```
- If ambiguous, drill into subcommand help: `firecrawl-pp-cli <command> --help`.
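The check-then-execute steps above could be wrapped in one helper, sketched below. `run_agent` is a made-up name, and the `FIRECRAWL_PP_BIN` override is a hypothetical variable added here only so the wrapper can be exercised against a stand-in binary; the CLI does not read it.

```bash
# Illustrative wrapper: confirms the binary is on PATH, then appends --agent.
# FIRECRAWL_PP_BIN is a hypothetical override for testing, not a CLI setting.
run_agent() {
  local bin="${FIRECRAWL_PP_BIN:-firecrawl-pp-cli}"
  if ! command -v "$bin" >/dev/null 2>&1; then
    echo "$bin not found; see Prerequisites" >&2
    return 127
  fi
  "$bin" "$@" --agent
}
```

With the CLI installed, `run_agent team get-credit-usage` would run `firecrawl-pp-cli team get-credit-usage --agent`.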