pp-firecrawl

Firecrawl — Printing Press CLI

Prerequisites: Install the CLI

This skill drives the `firecrawl-pp-cli` binary. You must verify the CLI is installed before invoking any command from this skill. If it is missing, install it first:

Install via the Printing Press installer:

```bash
npx -y @mvanhorn/printing-press install firecrawl --cli-only
```

Verify: `firecrawl-pp-cli --version`

Ensure `$GOPATH/bin` (or `$HOME/go/bin`) is on `$PATH`.

If the `npx` install fails (no Node, offline, etc.), fall back to a direct Go install (requires Go 1.23+):

```bash
go install github.com/mvanhorn/printing-press-library/library/developer-tools/firecrawl/cmd/firecrawl-pp-cli@latest
```

If `--version` reports "command not found" after install, the install step did not put the binary on `$PATH`. Do not proceed with skill commands until verification succeeds.
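The verification step above can be scripted as a small pre-flight check. This is a minimal sketch, not part of the CLI: `cli_ready` is a hypothetical helper name, and it only tests whether the binary resolves on `$PATH` using the standard `command -v` builtin.

```bash
# Sketch: confirm a binary is reachable before running any skill command.
# cli_ready is a hypothetical helper; it is not shipped with firecrawl-pp-cli.
cli_ready() {
  bin="${1:-firecrawl-pp-cli}"
  if command -v "$bin" >/dev/null 2>&1; then
    return 0
  fi
  # Not found: the usual cause is the Go bin directory missing from PATH.
  echo "$bin not found; check that \$GOPATH/bin or \$HOME/go/bin is on \$PATH" >&2
  return 1
}
```

Calling `cli_ready || exit 1` at the top of a script stops it before any skill command runs against a missing binary.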

Command Reference

batch - Manage batch

`firecrawl-pp-cli batch cancel-scrape` - Cancel a batch scrape job
`firecrawl-pp-cli batch get-scrape-errors` - Get the errors of a batch scrape job
`firecrawl-pp-cli batch get-scrape-status` - Get the status of a batch scrape job
`firecrawl-pp-cli batch scrape-and-extract-from-urls` - Scrape multiple URLs and optionally extract information using an LLM

crawl - Manage crawl

`firecrawl-pp-cli crawl cancel` - Cancel a crawl job
`firecrawl-pp-cli crawl get-active` - Get all active crawls for the authenticated team
`firecrawl-pp-cli crawl get-status` - Get the status of a crawl job
`firecrawl-pp-cli crawl urls` - Crawl multiple URLs based on options

deep-research - Manage deep research

`firecrawl-pp-cli deep-research get-status` - Get the status and results of a deep research operation
`firecrawl-pp-cli deep-research start` - Start a deep research operation on a query

extract - Manage extract

`firecrawl-pp-cli extract data` - Extract structured data from pages using LLMs
`firecrawl-pp-cli extract get-status` - Get the status of an extract job

firecrawl-search - Manage firecrawl search

`firecrawl-pp-cli firecrawl-search` - Search and optionally scrape search results

llmstxt - Manage llmstxt

`firecrawl-pp-cli llmstxt generate-llms-txt` - Generate LLMs.txt for a website
`firecrawl-pp-cli llmstxt get-llms-txt-status` - Get the status and results of an LLMs.txt generation job

map - Manage map

`firecrawl-pp-cli map` - Map multiple URLs based on options

scrape - Manage scrape

`firecrawl-pp-cli scrape` - Scrape a single URL and optionally extract information using an LLM

team - Manage team

`firecrawl-pp-cli team get-credit-usage` - Get remaining credits for the authenticated team
`firecrawl-pp-cli team get-token-usage` - Get remaining tokens for the authenticated team (Extract only)

Finding the right command

When you know what you want to do but not which command does it, ask the CLI directly:

```bash
firecrawl-pp-cli which "<capability in your own words>"
```

`which` resolves a natural-language capability query to the best matching command from this CLI's curated feature index. Exit code 0 means at least one match; a non-zero exit code means no confident match, in which case fall back to `--help` or use a narrower query.
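The lookup-then-fallback flow described above can be wrapped in a few lines of shell. This is a sketch under assumptions: `find_command` is a hypothetical wrapper name, the `CLI` variable exists only so the binary can be swapped for a stub, and the wrapper treats any non-zero exit from `which` as "no confident match".

```bash
# Sketch: try the natural-language lookup, fall back to --help on no match.
# find_command and the CLI override variable are hypothetical, for illustration.
CLI="${CLI:-firecrawl-pp-cli}"

find_command() {
  query="$1"
  if "$CLI" which "$query"; then
    return 0    # the CLI printed at least one matching command
  fi
  # No confident match: show the general help text instead.
  "$CLI" --help
}
```

For example, `find_command "scrape a single page"` prints the matching command when one exists and the help text otherwise.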

Auth Setup

Store your access token:

```bash
firecrawl-pp-cli auth set-token YOUR_TOKEN_HERE
```

Or set `FIRECRAWL_BEARER_AUTH` as an environment variable.

Run `firecrawl-pp-cli doctor` to verify setup.

Agent Mode

Add `--agent` to any command. It expands to: `--json --compact --no-input --no-color --yes`

Pipeable: JSON on stdout, errors on stderr
Filterable: `--select` keeps a subset of fields. Dotted paths descend into nested structures; arrays traverse element-wise. Critical for keeping context small on verbose APIs:
```bash
firecrawl-pp-cli batch cancel-scrape mock-value --agent --select id,name,status
```
Previewable: `--dry-run` shows the request without sending it
Offline-friendly: sync/search commands can use the local SQLite store when available
Non-interactive: never prompts, every input is a flag

Response envelope

Commands that read from the local store or the API wrap output in a provenance envelope:

```json
{
  "meta": {"source": "live" | "local", "synced_at": "...", "reason": "..."},
  "results": <data>
}
```

Parse `.results` for data and `.meta.source` to know whether it's live or local. A human-readable `N results (live)` summary is printed to stderr only when stdout is a terminal; piped/agent consumers get pure JSON on stdout.
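As a dependency-free illustration of reading `.meta.source` from the envelope, the sketch below pulls the value out with `sed`. `envelope_source` is a hypothetical helper, the sample JSON is made up to match the shape above, and the approach assumes only one `"source"` key is present. A real JSON tool such as `jq` is the more robust choice when one is available.

```bash
# Sketch: read the "source" field from an envelope on stdin without a JSON parser.
# envelope_source is a hypothetical helper; it assumes a single "source" key.
envelope_source() {
  sed -n 's/.*"source"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' | head -n 1
}

# Made-up sample in the documented envelope shape.
sample='{"meta":{"source":"local","synced_at":"...","reason":"..."},"results":[]}'
printf '%s\n' "$sample" | envelope_source   # prints: local
```

A caller can branch on the result, for instance treating `local` data as possibly stale.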

Agent Feedback

When you (or the agent) notice something off about this CLI, record it:

```bash
firecrawl-pp-cli feedback "the --since flag is inclusive but docs say exclusive"
firecrawl-pp-cli feedback --stdin < notes.txt
firecrawl-pp-cli feedback list --json --limit 10
```

Entries are stored locally at `~/.firecrawl-pp-cli/feedback.jsonl`. They are never POSTed unless `FIRECRAWL_FEEDBACK_ENDPOINT` is set AND either `--send` is passed or `FIRECRAWL_FEEDBACK_AUTO_SEND=true`. Default behavior is local-only.

Write what surprised you, not a bug report. Short, specific, one line: that is the part that compounds.
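The send rule above (endpoint set AND either `--send` or auto-send) is easy to misread, so here it is spelled out as a predicate. This is an illustration of the documented rule, not the CLI's own code: `would_send` is a hypothetical function, and its first argument stands in for whether `--send` was passed.

```bash
# Sketch: would a feedback entry be POSTed under the documented rule?
# would_send is hypothetical; $1 is "send" when --send was passed.
would_send() {
  # Without an endpoint nothing is ever sent.
  [ -n "${FIRECRAWL_FEEDBACK_ENDPOINT:-}" ] || return 1
  # An explicit --send is enough once the endpoint exists.
  if [ "${1:-}" = "send" ]; then return 0; fi
  # Otherwise auto-send must be switched on.
  [ "${FIRECRAWL_FEEDBACK_AUTO_SEND:-}" = "true" ]
}
```

With neither environment variable set, `would_send send` fails, which matches the local-only default.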

Output Delivery

Every command accepts `--deliver <sink>`. The output goes to the named sink in addition to (or instead of) stdout, so agents can route command results without hand-piping. Three sinks are supported:

| Sink | Effect |
| --- | --- |
| `stdout` | Default; write to stdout only |
| `file:<path>` | Atomically write output to `<path>` (tmp + rename) |
| `webhook:<url>` | POST the output body to the URL (`application/json`, or `application/x-ndjson` when `--compact` is set) |

Unknown schemes are refused with a structured error naming the supported set. Webhook failures return non-zero and log the URL + HTTP status on stderr.
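A caller that builds sink strings dynamically may want to validate them before invoking the CLI. The sketch below mirrors the three documented schemes. `classify_sink` is a hypothetical helper, and its plain-text error message is an assumption that does not reproduce the CLI's structured error.

```bash
# Sketch: validate a --deliver value against the three documented schemes.
# classify_sink is a hypothetical helper, not part of the CLI.
classify_sink() {
  case "${1:-}" in
    stdout)     echo "stdout" ;;
    file:?*)    echo "file ${1#file:}" ;;        # strip the scheme, keep the path
    webhook:?*) echo "webhook ${1#webhook:}" ;;  # strip the scheme, keep the URL
    *)
      echo "unsupported sink: ${1:-} (supported: stdout, file:<path>, webhook:<url>)" >&2
      return 1 ;;
  esac
}
```

For instance, `classify_sink file:/tmp/out.json` prints `file /tmp/out.json`, while an unknown scheme such as `s3:bucket` returns non-zero.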

Named Profiles

A profile is a saved set of flag values, reused across invocations. Use it when a scheduled agent calls the same command every run with the same configuration - HeyGen's "Beacon" pattern.

```bash
firecrawl-pp-cli profile save briefing --json
firecrawl-pp-cli --profile briefing batch cancel-scrape mock-value
firecrawl-pp-cli profile list --json
firecrawl-pp-cli profile show briefing
firecrawl-pp-cli profile delete briefing --yes
```

Explicit flags always win over profile values; profile values win over defaults.

`agent-context` lists all available profiles under `available_profiles` so introspecting agents discover them at runtime.
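The precedence rule above (explicit flag, then profile value, then default) can be modelled in a few lines. This is a sketch of the rule only. `resolve_flag` is a hypothetical function, and an empty string stands in for "not provided".

```bash
# Sketch: pick the effective value for one flag.
# resolve_flag is hypothetical; arguments are explicit, profile, default.
resolve_flag() {
  if [ -n "${1:-}" ]; then
    echo "$1"          # explicit flag wins
  elif [ -n "${2:-}" ]; then
    echo "$2"          # then the saved profile value
  else
    echo "${3:-}"      # otherwise the built-in default
  fi
}
```

So `resolve_flag "" "json" "table"` yields `json`, and passing a non-empty first argument overrides both of the others.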

Exit Codes

| Code | Meaning |
| --- | --- |
| 0 | Success |
| 2 | Usage error (wrong arguments) |
| 3 | Resource not found |
| 4 | Authentication required |
| 5 | API error (upstream issue) |
| 7 | Rate limited (wait and retry) |
| 10 | Config error |
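One way a script might act on these codes is sketched below: a lookup for logging, plus a wrapper that retries only on code 7. Both function names and the `MAX_RETRIES` and `RETRY_DELAY` variables are hypothetical. The retry count and delay are arbitrary defaults, not values recommended by the CLI.

```bash
# Sketch: translate exit codes and retry on rate limiting (code 7).
# describe_exit, run_with_retry, MAX_RETRIES and RETRY_DELAY are hypothetical.
describe_exit() {
  case "$1" in
    0)  echo "success" ;;
    2)  echo "usage error" ;;
    3)  echo "not found" ;;
    4)  echo "auth required" ;;
    5)  echo "api error" ;;
    7)  echo "rate limited" ;;
    10) echo "config error" ;;
    *)  echo "unknown" ;;
  esac
}

run_with_retry() {
  attempts=0
  while :; do
    rc=0
    "$@" || rc=$?
    # Anything other than "rate limited" is returned to the caller as-is.
    [ "$rc" -eq 7 ] || return "$rc"
    attempts=$((attempts + 1))
    [ "$attempts" -lt "${MAX_RETRIES:-3}" ] || return "$rc"
    sleep "${RETRY_DELAY:-5}"
  done
}
```

A call such as `run_with_retry firecrawl-pp-cli crawl get-active --agent` would then wait and retry while the CLI reports rate limiting, and surface every other code unchanged.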

Argument Parsing

Parse `$ARGUMENTS`:

Empty, `help`, or `--help` → show `firecrawl-pp-cli --help` output
Starts with `install` → if it ends with `mcp` → MCP installation; otherwise → see Prerequisites above
Anything else → Direct Use (execute as CLI command with `--agent`)
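The routing rules above map naturally onto a `case` statement. This is a minimal sketch: `route_arguments` and the route labels it prints are hypothetical names, and the whole `$ARGUMENTS` string is assumed to arrive as a single argument.

```bash
# Sketch: classify the raw $ARGUMENTS string into one of the documented routes.
# route_arguments and its output labels are hypothetical.
route_arguments() {
  case "${1:-}" in
    ""|help|--help)
      echo "help" ;;
    install*)
      case "$1" in
        *mcp) echo "mcp-install" ;;   # starts with install, ends with mcp
        *)    echo "cli-install" ;;   # any other install request
      esac ;;
    *)
      echo "direct" ;;                # run as a CLI command with --agent
  esac
}
```

For example, `route_arguments "install mcp"` prints `mcp-install`, and `route_arguments "scrape https://example.com"` prints `direct`.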

MCP Server Installation

Install the MCP server:

```bash
go install github.com/mvanhorn/printing-press-library/library/other/firecrawl-pp-cli/cmd/firecrawl-pp-mcp@latest
```

Register it with Claude Code:

```bash
claude mcp add firecrawl-pp-mcp -- firecrawl-pp-mcp
```

Verify: `claude mcp list`

Direct Use

Check if installed: `which firecrawl-pp-cli`. If not found, offer to install (see Prerequisites at the top of this skill).
Match the user query to the best command from the Unique Capabilities and Command Reference above.

Execute with the `--agent` flag:

```bash
firecrawl-pp-cli <command> [subcommand] [args] --agent
```

If ambiguous, drill into subcommand help: `firecrawl-pp-cli <command> --help`.
