firecrawl-monitor
firecrawl monitor
Detect when content on a website changes and get notified by webhook or email. Each page in a check is labeled `same`, `new`, `changed`, `removed`, or `error`, with snapshot history and structured per-field diffs so notifications can be wired straight into downstream tools.
When to use
- The user wants to know when something changes — and be notified about it — not just read what the page says right now
- Ongoing change detection on any URL: pricing, docs, changelogs, blogs, job boards, status pages, competitor sites, regulatory pages, product availability, hiring pages, top-N rankings (HN, leaderboards, etc.)
- "Alert me when...", "notify me when...", "email me if...", "send a webhook when...", "ping me if X changes", "track this page"
- Anywhere the user would otherwise wire up cron + a scraper + a diff library + SMTP themselves
- Step 5 in the workflow escalation pattern: search → scrape → map → crawl → monitor → interact
Bias toward `monitor` whenever the request implies notifications or recurrence. A single page read once = `scrape`. A single page where the user wants to be told when it changes = `monitor --page <url> --goal "..." --email|--webhook-url ...`.
Why use a monitor
- Change-detection-as-a-service. Firecrawl handles fetching, diffing, judging, and notifying — all server-side. No cron, no diff library, no SMTP setup, no snapshot DB to manage.
- Notifications first. Webhooks (`monitor.page` as each page finishes, `monitor.check.completed` after the check is reconciled) and email summaries that only fire when something actually changed or errored. External recipients confirm via per-recipient opt-in.
- AI noise filter via `--goal`. Set a plain-language goal and the change judge ignores formatting, whitespace, casing, punctuation, encoding, request/session IDs, cache busters, tracking params, generic metadata, and unrelated page chrome, so notifications are about content the user actually cares about, not page churn.
- Structured per-field diffs. JSON-mode change tracking returns keyed diffs like `plans[0].price: "$19/mo" → "$24/mo"` instead of a wall of unified diff. Drops straight into a Slack message, CI step, or internal tool.
- Simple page-status model. Each page in a check returns `same`, `new`, `changed`, `removed`, or `error`. Easy to filter, easy to act on.
- Snapshot history without infra. Point-in-time snapshots are kept for diffing via `--retention-days`; no storage to provision.
- Watch many things at once. One monitor can watch many pages or diff every page discovered by a recurring site crawl.
- No scheduling glue. Cron normalization and `nextRunAt` are computed for you, with natural-language schedules supported (`"every 30 minutes"`, `"hourly"`, `"daily at 9:00"`).
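The page-status model above makes downstream filtering simple. Here is a minimal Python sketch, assuming a check result has already been loaded as a list of page dicts with `url` and `status` keys. The list shape, the sample data, and the helper names `actionable_pages` and `status_counts` are assumptions for illustration, not the documented response schema.

```python
from collections import Counter

# The five page statuses named in this document.
KNOWN_STATUSES = {"same", "new", "changed", "removed", "error"}


def actionable_pages(pages):
    """Return the pages whose status is anything other than "same"."""
    for page in pages:
        if page["status"] not in KNOWN_STATUSES:
            raise ValueError(f"unexpected page status: {page['status']!r}")
    return [page for page in pages if page["status"] != "same"]


def status_counts(pages):
    """Count pages per status, e.g. for a one-line summary."""
    return Counter(page["status"] for page in pages)


# Hypothetical sample data; the URLs are placeholders.
sample_pages = [
    {"url": "https://example.com/pricing", "status": "changed"},
    {"url": "https://example.com/docs", "status": "same"},
    {"url": "https://example.com/old", "status": "removed"},
]

for page in actionable_pages(sample_pages):
    print(page["status"], page["url"])
```

Rejecting unknown statuses early keeps a typo or a schema change from being silently treated as "nothing happened".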
Quick start
```bash
# Single page, natural-language schedule, email alert
firecrawl monitor create --name "Blog" --schedule "every 30 minutes" \
  --goal "Alert when a new blog post is published." \
  --page https://example.com/blog \
  --email alerts@example.com

# Multiple pages, one monitor
firecrawl monitor create --name "Product pages" --schedule "every 30 minutes" \
  --goal "Alert when pricing, docs, or changelog content changes." \
  --scrape-urls https://example.com/pricing,https://example.com/docs,https://example.com/changelog

# Whole-site crawl per check (every discovered page is diffed)
firecrawl monitor create --name "Docs site" --schedule "hourly" \
  --goal "Alert when any docs page is added, removed, or substantively changed." \
  --crawl-url https://docs.example.com

# Webhook notifications
firecrawl monitor create --name "Docs webhook" --schedule "every 30 minutes" \
  --goal "Alert when docs content changes." \
  --page https://example.com/docs \
  --webhook-url https://example.com/hook \
  --webhook-events monitor.page,monitor.check.completed

# Manage and inspect
firecrawl monitor list --limit 20
firecrawl monitor get <monitorId>
firecrawl monitor run <monitorId>       # trigger a check now
firecrawl monitor checks <monitorId>    # list all checks
firecrawl monitor check <monitorId> <checkId> --page-status changed
firecrawl monitor update <monitorId> --state paused
firecrawl monitor delete <monitorId>
```

Subcommands: `create | list | get | update | delete | run | checks | check`.
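When monitors are created from a script rather than typed by hand, it can help to assemble the argument list programmatically. The sketch below uses only the flags shown in the quick start; the helper name `build_create_args` and its keyword names are hypothetical, and it does no validation of its own.

```python
import shlex


def build_create_args(name, schedule, goal, page=None, scrape_urls=None,
                      crawl_url=None, emails=None, webhook_url=None,
                      webhook_events=None):
    """Build an argv list for `firecrawl monitor create` from quick-start flags."""
    args = ["firecrawl", "monitor", "create",
            "--name", name, "--schedule", schedule, "--goal", goal]
    if page:
        args += ["--page", page]
    if scrape_urls:
        # Multiple URLs are passed as one comma-separated value.
        args += ["--scrape-urls", ",".join(scrape_urls)]
    if crawl_url:
        args += ["--crawl-url", crawl_url]
    if emails:
        args += ["--email", ",".join(emails)]
    if webhook_url:
        args += ["--webhook-url", webhook_url]
    if webhook_events:
        args += ["--webhook-events", ",".join(webhook_events)]
    return args


args = build_create_args(
    name="Blog",
    schedule="every 30 minutes",
    goal="Alert when a new blog post is published.",
    page="https://example.com/blog",
    emails=["alerts@example.com"],
)
# Print a shell-quoted form for review before running anything.
print(shlex.join(args))
```

Keeping the arguments as a list (for example for `subprocess.run(args, check=True)`) avoids shell-quoting problems with goals that contain spaces or punctuation.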
Options
| Option | Description |
|---|---|
| `--name` | Monitor name (required on create) |
| `--goal` | Plain-language change goal (auto-enables the AI change judge) |
| `--schedule` | Natural-language schedule (`"every 30 minutes"`, `"hourly"`, `"daily at 9:00"`) |
| | Cron schedule |
| | Schedule timezone |
| `--page` | Single page URL to scrape on each check |
| `--scrape-urls` | Comma-separated URLs to scrape on each check |
| `--crawl-url` | Root URL for a crawl target (every discovered page gets diffed) |
| `--webhook-url` | Webhook destination |
| `--webhook-events` | Webhook events: `monitor.page`, `monitor.check.completed` |
| `--email` | Comma-separated email recipients |
| `--retention-days` | Snapshot retention window |
| `--state` | Monitor state (e.g. `paused`) |
| `--page-status` | Filter check pages by status (e.g. `changed`) |
| | Output file path |
| | Pretty-print JSON output |
Minimum schedule interval is 15 minutes. Monitoring is not available for zero-data-retention teams.
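The 15-minute minimum can be checked client-side before a monitor is created. This is a rough sketch only: the real schedule parsing happens server-side and accepts more phrasings than the few handled here, and the helper names `interval_minutes` and `is_allowed` are hypothetical.

```python
import re

MIN_INTERVAL_MINUTES = 15  # minimum schedule interval stated in this document


def interval_minutes(schedule):
    """Best-effort interval in minutes for a few simple phrasings; None if unrecognised."""
    text = schedule.strip().lower()
    if text == "hourly":
        return 60
    if text.startswith("daily"):
        return 24 * 60
    match = re.fullmatch(r"every (\d+) (minute|hour)s?", text)
    if match:
        count = int(match.group(1))
        return count if match.group(2) == "minute" else count * 60
    return None


def is_allowed(schedule):
    """Reject only schedules we can parse and that are below the minimum."""
    minutes = interval_minutes(schedule)
    return minutes is None or minutes >= MIN_INTERVAL_MINUTES


for candidate in ["every 30 minutes", "hourly", "daily at 9:00", "every 5 minutes"]:
    print(candidate, is_allowed(candidate))
```

Unrecognised phrasings are passed through rather than rejected, leaving the final decision to the service.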
Writing a good --goal
The goal is what the AI change judge uses to decide whether a page is `changed` vs `same`. Convert the user's intent into a concise 2-3 sentence goal:
- Start with `Alert when ...` and state the trigger using the user's wording.
- Restate any scope they mentioned: top N, price, role type, region, company, topic, status, or a specific entity.
- Add an `Ignore ...` sentence only for intent-specific exclusions (e.g. points/comments for rankings, marketing copy for pricing, general company-page updates for job listings).
- Do not repeat generic noise exclusions; the judge already handles whitespace, casing, punctuation, encoding, formatting-only changes, request/session IDs, cache busters, tracking params, generic metadata noise, and unrelated page chrome.
- Don't invent page-specific sections, entities, thresholds, exclusions, or business rules unless the user mentioned them.
- If the user is vague or asks for "any change", keep the goal broad and don't add exclusions.
| User says | Good goal |
|---|---|
| |
| |
| |
| |
| |
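The rules above can be sketched as a small goal builder: one `Alert when ...` sentence, plus an `Ignore ...` sentence only when the user named an intent-specific exclusion. The helper name `build_goal` is hypothetical, and the second example's wording is illustrative rather than taken from the original examples table.

```python
from typing import Optional


def _sentence(prefix: str, body: str) -> str:
    """Join a fixed prefix and the user's wording into one sentence ending in a period."""
    return f"{prefix} {body.strip().rstrip('.')}."


def build_goal(trigger: str, ignore: Optional[str] = None) -> str:
    """Compose 'Alert when <trigger>.' with an optional 'Ignore <ignore>.' sentence."""
    parts = [_sentence("Alert when", trigger)]
    if ignore:
        # Only intent-specific exclusions belong here; generic noise is already handled.
        parts.append(_sentence("Ignore", ignore))
    return " ".join(parts)


print(build_goal("a new blog post is published"))
print(build_goal("the top 10 stories change", ignore="points and comment counts"))
```

Leaving `ignore` empty by default mirrors the guidance to keep vague or "any change" requests broad.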
JSON-mode change tracking (structured per-field diffs)
By default monitors diff each page's markdown and return a unified text diff. When the user cares about specific structured fields (price, headline, in-stock flag, items in a list), use JSON-mode change tracking. The CLI flags don't cover this — pass a JSON body via positional file or piped stdin:
```bash
cat > pricing-monitor.json <<'EOF'
{
  "name": "Pricing watch",
  "goal": "Alert when plan prices or headline features change.",
  "schedule": { "text": "hourly", "timezone": "UTC" },
  "targets": [{
    "type": "scrape",
    "urls": ["https://example.com/pricing"],
    "scrapeOptions": {
      "formats": [{
        "type": "changeTracking",
        "modes": ["json"],
        "prompt": "Extract pricing tiers and headline features for each plan.",
        "schema": {
          "type": "object",
          "properties": {
            "plans": {
              "type": "array",
              "items": {
                "type": "object",
                "properties": {
                  "name": { "type": "string" },
                  "price": { "type": "string" },
                  "features": { "type": "array", "items": { "type": "string" } }
                }
              }
            }
          }
        }
      }]
    }
  }]
}
EOF
firecrawl monitor create pricing-monitor.json
# or: cat pricing-monitor.json | firecrawl monitor create
```
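When several monitors share the same shape, the payload can be generated instead of hand-written. The sketch below mirrors the structure of the JSON body shown above; the function name `json_mode_payload` and its parameters are hypothetical, and it does not check the result against the service.

```python
import json


def json_mode_payload(name, goal, schedule_text, urls, prompt, schema, timezone="UTC"):
    """Build a monitor body with one scrape target using JSON-mode change tracking."""
    return {
        "name": name,
        "goal": goal,
        "schedule": {"text": schedule_text, "timezone": timezone},
        "targets": [{
            "type": "scrape",
            "urls": list(urls),
            "scrapeOptions": {
                "formats": [{
                    "type": "changeTracking",
                    "modes": ["json"],
                    "prompt": prompt,
                    "schema": schema,
                }],
            },
        }],
    }


plan_schema = {
    "type": "object",
    "properties": {
        "plans": {
            "type": "array",
            "items": {
                "type": "object",
                "properties": {
                    "name": {"type": "string"},
                    "price": {"type": "string"},
                    "features": {"type": "array", "items": {"type": "string"}},
                },
            },
        },
    },
}

payload = json_mode_payload(
    name="Pricing watch",
    goal="Alert when plan prices or headline features change.",
    schedule_text="hourly",
    urls=["https://example.com/pricing"],
    prompt="Extract pricing tiers and headline features for each plan.",
    schema=plan_schema,
)
print(json.dumps(payload, indent=2))
```

The printed JSON can be written to a file or piped to `firecrawl monitor create` over stdin, as described above.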
Each changed page in the check response then carries a per-field diff plus a snapshot of the current full extraction:
```json
{
  "url": "https://example.com/pricing",
  "status": "changed",
  "diff": {
    "json": {
      "plans[0].price": { "previous": "$19/mo", "current": "$24/mo" },
      "plans[1].features[2]": {
        "previous": "10 GB storage",
        "current": "25 GB storage"
      }
    }
  },
  "snapshot": {
    "json": {
      "plans": [
        /* current full extraction */
      ]
    }
  }
}
```

Use `modes: ["json", "git-diff"]` for mixed mode: you get both `diff.json` (per-field) and `diff.text` (markdown sidecar), and the page is marked `changed` whenever either surface changed.
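The keyed diff shown above is straightforward to turn into notification text. This minimal sketch assumes a page object shaped like that example (`diff.json` mapping a field path to `previous` and `current` values); the helper name `format_field_diffs` is hypothetical.

```python
import json


def format_field_diffs(page):
    """Render each keyed diff under page["diff"]["json"] as 'path: "old" → "new"'."""
    changes = page.get("diff", {}).get("json", {})
    lines = []
    for path, change in changes.items():
        previous = json.dumps(change["previous"], ensure_ascii=False)
        current = json.dumps(change["current"], ensure_ascii=False)
        lines.append(f"{path}: {previous} → {current}")
    return lines


# Page object copied from the example response in this section (snapshot omitted).
page = {
    "url": "https://example.com/pricing",
    "status": "changed",
    "diff": {
        "json": {
            "plans[0].price": {"previous": "$19/mo", "current": "$24/mo"},
            "plans[1].features[2]": {
                "previous": "10 GB storage",
                "current": "25 GB storage",
            },
        }
    },
}

for line in format_field_diffs(page):
    print(line)
```

Serialising values with `json.dumps` keeps strings quoted and also copes with numbers, booleans, or nested values if the schema extracts them.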
Tips
- Prefer one monitor over repeated one-off scrapes whenever the user wants the same URL checked more than once.
- Use `--state paused` (via `update`), not `delete`, when temporarily silencing a monitor.
- `--retention-days` controls how long snapshots are kept for diffing. Lower it for high-frequency monitors to save storage.
- External email recipients must opt in. The first time they're added, Firecrawl sends a confirmation email, and they only receive alerts after they confirm. Team-owned email addresses are auto-confirmed. Once a recipient unsubscribes, they must be re-added by the owner to get a fresh confirmation email.
- `firecrawl monitor run <id>` triggers a check immediately, which is useful for smoke-testing a monitor right after creating it without waiting for the next scheduled run.
- Filter check pages with `--page-status changed` (or `new`, `removed`, `error`) to skip the noise from `same` pages.
- Use `--page-status` (not `--status`) when filtering check pages; `--status` is reserved for the global CLI status flag.
- Monitor-triggered scrapes default to a `maxAge` of `0`, so every check performs a fresh scrape unless `scrapeOptions.maxAge` is set explicitly in a JSON payload.
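For the last tip, `scrapeOptions.maxAge` can be set on an existing JSON payload before it is sent. The sketch assumes `scrapeOptions` sits inside each entry of `targets`, as in the JSON-mode example earlier; the helper name `with_max_age` is hypothetical, and the value's unit is not stated in this document, so the number below is a placeholder.

```python
import copy


def with_max_age(payload, max_age):
    """Return a copy of a monitor payload with scrapeOptions.maxAge set on every target."""
    updated = copy.deepcopy(payload)  # leave the caller's payload untouched
    for target in updated.get("targets", []):
        target.setdefault("scrapeOptions", {})["maxAge"] = max_age
    return updated


# Hypothetical minimal payload; only the fields needed for the illustration.
base_payload = {
    "name": "Pricing watch",
    "targets": [{"type": "scrape", "urls": ["https://example.com/pricing"]}],
}

cached_payload = with_max_age(base_payload, 60000)  # placeholder value
print(cached_payload["targets"][0]["scrapeOptions"])
```

Working on a deep copy keeps the default fresh-scrape payload available alongside the cached variant.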
See also
- firecrawl-scrape: one-off scrape; escalate to `monitor` when checks become recurring
- firecrawl-crawl: one-off crawl; pair with `--crawl-url` here for recurring crawl diffs
- firecrawl-cli: top-level workflow guide