fetch-url
Fetch URL
Fetch a single URL and convert its content to clean Markdown. This does not
add anything to the documentation index; it is a one-shot read operation.
When to use
- You need to read a single web page and get its content as Markdown.
- You want to inspect a documentation page before deciding whether to index it.
- You need the raw text of a URL for summarisation or analysis.
If you need to index the content for repeated searching, use the docs-manage
skill (`scrape` command) instead.
Command
```bash
npx @arabold/docs-mcp-server@latest fetch-url <url> [options]
```

| Flag | Default | Description |
|---|---|---|
| `--follow-redirects` | `true` | Follow HTTP redirects |
| `--no-follow-redirects` | | Disable following redirects |
| `--scrape-mode <mode>` | `auto` | HTML processing strategy |
| `--header <name:value>` | | Custom HTTP header (repeatable) |
| `--quiet` | | Suppress non-error diagnostics |
| `--verbose` | | Enable debug logging |
Scrape modes
| Mode | When to use |
|---|---|
| `auto` | Default. Tries a simple HTTP fetch first, falls back to Playwright for JS-rendered pages. |
| `fetch` | Force a plain HTTP fetch. Fastest, but misses content rendered by JavaScript. |
| `playwright` | Force a headless browser. Use for SPAs or pages that require JavaScript to render. |
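If you want explicit control over the fallback instead of relying on the default mode, the same idea can be approximated by hand. The sketch below is only an illustration, not how the tool implements its default behaviour: `fetch_with_fallback` is a hypothetical helper name, the plain HTTP mode is assumed to be selected with `--scrape-mode fetch`, and treating empty output as "needs a browser" is a crude heuristic.

```bash
#!/usr/bin/env bash
# Sketch: try a plain HTTP fetch first, retry with a headless browser if nothing came back.
# fetch_with_fallback is a hypothetical helper; the mode name "fetch" is an assumption.
fetch_with_fallback() {
  local url="$1" out
  # First attempt: plain HTTP fetch. A failed command is treated like empty output.
  out="$(npx @arabold/docs-mcp-server@latest fetch-url "$url" --scrape-mode fetch --quiet)" || out=""
  if [ -z "$out" ]; then
    # Second attempt: headless browser. Give up if this fails too.
    out="$(npx @arabold/docs-mcp-server@latest fetch-url "$url" --scrape-mode playwright --quiet)" || return 1
  fi
  printf '%s\n' "$out"
}
```

Call it as `fetch_with_fallback https://some-spa.dev/docs > page.md`. A page that returns a non-empty JavaScript shell will not trigger the retry, so for known SPAs it is simpler to pass `--scrape-mode playwright` directly.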
Examples
```bash
# Fetch a documentation page
npx @arabold/docs-mcp-server@latest fetch-url https://react.dev/reference/react/useEffect

# Fetch with custom auth header
npx @arabold/docs-mcp-server@latest fetch-url https://docs.internal.com/api \
  --header "Authorization: Bearer tok_xxx"

# Force Playwright for a JS-heavy page
npx @arabold/docs-mcp-server@latest fetch-url https://some-spa.dev/docs --scrape-mode playwright

# Disable redirect following
npx @arabold/docs-mcp-server@latest fetch-url https://example.com/old-page --no-follow-redirects
```
Output
The command writes the converted Markdown text directly to stdout. The
global `--output` flag is accepted but has no effect because the result is
already plain text, not structured data.

Diagnostics and errors go to stderr and are suppressed by default in
non-interactive sessions. Use `--verbose` (or set `LOG_LEVEL=INFO`) to
re-enable them. Use `--quiet` to suppress all non-error diagnostics
regardless of session type.
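Because the Markdown goes to stdout and the diagnostics go to stderr, the two streams can be captured separately with ordinary shell redirection. The following is a minimal sketch; `fetch_page` is a hypothetical helper name and the file names are placeholders.

```bash
#!/usr/bin/env bash
# Sketch: save the Markdown (stdout) and the diagnostics (stderr) to separate files.
# fetch_page is a hypothetical helper, not part of the CLI.
fetch_page() {
  local url="$1" out_file="$2" log_file="$3"
  # --verbose re-enables diagnostics, which are otherwise suppressed in non-interactive sessions.
  npx @arabold/docs-mcp-server@latest fetch-url "$url" --verbose >"$out_file" 2>"$log_file"
}
```

For example, `fetch_page https://react.dev/reference/react/useEffect page.md fetch.log` should leave only the converted page in `page.md`, with any log lines in `fetch.log`.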
Tips
- Pipe the output to a file if you want to save it:
  `npx @arabold/docs-mcp-server@latest fetch-url <url> > page.md`
- Combine with `search`: fetch a page to read its full content after `search` returns a relevant URL.
- For pages behind authentication, use `--header` to pass cookies or tokens.
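Since `--header` is repeatable, a cookie and a token can be sent in the same request. The sketch below wraps that in a small function; `fetch_with_headers` is a hypothetical helper name, and the header values in the usage line are placeholders.

```bash
#!/usr/bin/env bash
# Sketch: pass any number of headers by repeating --header once per value.
# fetch_with_headers is a hypothetical helper, not part of the CLI.
fetch_with_headers() {
  local url="$1"
  shift
  local args=() h
  for h in "$@"; do
    args+=(--header "$h")
  done
  npx @arabold/docs-mcp-server@latest fetch-url "$url" "${args[@]}"
}
```

Usage: `fetch_with_headers https://docs.internal.com/api "Cookie: session=xxx" "Authorization: Bearer tok_xxx" > page.md`. Quoting each header as one argument keeps the space after the colon intact.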