xcrawl-map

XCrawl Map

Overview

This skill uses the XCrawl Map API to discover URLs for a site. The default behavior is raw passthrough: return upstream API response bodies as-is.

Required Local Config

Before using this skill, the user must create a local config file containing `XCRAWL_API_KEY`.

Path: `~/.xcrawl/config.json`

```json
{
  "XCRAWL_API_KEY": "<your_api_key>"
}
```

Read the API key from the local config file only. Do not require global environment variables.
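One way to bootstrap the file from a POSIX shell (a sketch; replace the placeholder key before running real requests):

```shell
# Create the config directory and write the key file.
mkdir -p "$HOME/.xcrawl"
cat > "$HOME/.xcrawl/config.json" <<'EOF'
{
  "XCRAWL_API_KEY": "<your_api_key>"
}
EOF
# Restrict permissions, since the file holds a secret.
chmod 600 "$HOME/.xcrawl/config.json"
```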

Credits and Account Setup

Using XCrawl APIs consumes credits. If the user does not have an account or available credits, guide them to register at https://dash.xcrawl.com/. After registration, they can activate the free 1000-credit plan before running requests.

Tool Permission Policy

Request runtime permissions for `curl` and `node` only. Do not request Python, shell helper scripts, or other runtime permissions.

API Surface

  • Start map task: `POST /v1/map`
  • Base URL: `https://run.xcrawl.com`
  • Required header: `Authorization: Bearer <XCRAWL_API_KEY>`

Usage Examples

cURL

```bash
API_KEY="$(node -e "const fs=require('fs');const p=process.env.HOME+'/.xcrawl/config.json';const k=JSON.parse(fs.readFileSync(p,'utf8')).XCRAWL_API_KEY||'';process.stdout.write(k)")"

curl -sS -X POST "https://run.xcrawl.com/v1/map" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer ${API_KEY}" \
  -d '{"url":"https://example.com","filter":"/docs/.*","limit":2000,"include_subdomains":true,"ignore_query_parameters":false}'
```

Node

```bash
node -e '
const fs=require("fs");
const apiKey=JSON.parse(fs.readFileSync(process.env.HOME+"/.xcrawl/config.json","utf8")).XCRAWL_API_KEY;
const body={url:"https://example.com",filter:"/docs/.*",limit:3000,include_subdomains:true,ignore_query_parameters:false};
fetch("https://run.xcrawl.com/v1/map",{
  method:"POST",
  headers:{"Content-Type":"application/json",Authorization:`Bearer ${apiKey}`},
  body:JSON.stringify(body)
}).then(async r=>{console.log(await r.text());});
'
```

Request Parameters

Request endpoint and headers

  • Endpoint: `POST https://run.xcrawl.com/v1/map`
  • Headers:
    • Content-Type: application/json
    • Authorization: Bearer <api_key>

Request body: top-level fields

| Field | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| url | string | Yes | - | Site entry URL |
| filter | string | No | - | Regex filter for URLs |
| limit | integer | No | 5000 | Max URLs (up to 100000) |
| include_subdomains | boolean | No | true | Include subdomains |
| ignore_query_parameters | boolean | No | true | Ignore URLs with query parameters |
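Since only `url` is required, a minimal request body relies on the defaults above (`limit` 5000, `include_subdomains` true, `ignore_query_parameters` true):

```json
{
  "url": "https://example.com"
}
```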

Response Parameters

| Field | Type | Description |
| --- | --- | --- |
| map_id | string | Task ID |
| endpoint | string | Always `map` |
| version | string | Version |
| status | string | `completed` when the task has finished |
| url | string | Entry URL |
| data | object | URL list data |
| started_at | string | Start time (ISO 8601) |
| ended_at | string | End time (ISO 8601) |
| total_credits_used | integer | Total credits used |

`data` fields:
  • links: URL list
  • total_links: URL count
  • credits_used: credits used
  • credits_detail: credit breakdown
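A sketch of consuming these fields with the permitted `node` runtime. The response object here is a fabricated sample shaped like the table above, not real API output:

```shell
# Write a sample response shaped like the documented schema (illustration only).
cat > /tmp/xcrawl_map_sample.json <<'EOF'
{
  "map_id": "sample",
  "endpoint": "map",
  "version": "v1",
  "status": "completed",
  "url": "https://example.com",
  "data": {
    "links": ["https://example.com/docs/a", "https://example.com/docs/b"],
    "total_links": 2,
    "credits_used": 1,
    "credits_detail": {}
  },
  "started_at": "2024-01-01T00:00:00Z",
  "ended_at": "2024-01-01T00:00:05Z",
  "total_credits_used": 1
}
EOF

# Pull out the URL list and counts using only the documented fields.
node -e '
const res=JSON.parse(require("fs").readFileSync("/tmp/xcrawl_map_sample.json","utf8"));
console.log("status:", res.status);
console.log("total_links:", res.data.total_links);
res.data.links.forEach(l=>console.log(l));
console.log("credits:", res.total_credits_used);
'
```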

Workflow

  1. Restate the mapping objective: discovery only, selective crawl planning, or structure analysis.
  2. Build and execute `POST /v1/map`; keep filters explicit and reproducible.
  3. Return the raw API response directly; do not synthesize URL-family summaries unless requested.

Output Contract

Return:
  • Endpoint used (`POST /v1/map`)
  • The `request_payload` used for the request
  • Raw response body from the map call
  • Error details when the request fails
Do not generate summaries unless the user explicitly requests one.

Guardrails

  • Do not claim full site coverage if `limit` is reached.
  • Do not mix inferred URLs with returned URLs.
  • Do not hardcode provider-specific tool schemas in core logic.
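The first guardrail can be made mechanical. A sketch with hard-coded illustrative values (in practice `TOTAL` would be parsed from `data.total_links` and `LIMIT` taken from the request payload):

```shell
# Hypothetical truncation check: if the returned count reaches the
# requested limit, the map may be incomplete.
LIMIT=2000
TOTAL=2000  # would come from data.total_links in a real response
if [ "$TOTAL" -ge "$LIMIT" ]; then
  echo "limit reached: results may be truncated; do not claim full coverage"
fi
```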