fetch-url

Fetch URL

Fetch a single URL and convert its content to clean Markdown. This does not add anything to the documentation index; it is a one-shot read operation.

When to use

  • You need to read a single web page and get its content as Markdown.
  • You want to inspect a documentation page before deciding whether to index it.
  • You need the raw text of a URL for summarisation or analysis.
If you need to index the content for repeated searching, use the docs-manage skill (`scrape` command) instead.

Command

bash
npx @arabold/docs-mcp-server@latest fetch-url <url> [options]
| Flag | Default | Description |
| --- | --- | --- |
| `--follow-redirects` | `true` | Follow HTTP redirects |
| `--no-follow-redirects` | | Disable following redirects |
| `--scrape-mode auto\|fetch\|playwright` | `auto` | HTML processing strategy |
| `--header "Name: Value"` | | Custom HTTP header (repeatable) |
| `--quiet` | | Suppress non-error diagnostics |
| `--verbose` | | Enable debug logging |

Scrape modes

| Mode | When to use |
| --- | --- |
| `auto` | Default. Tries a simple HTTP fetch first, falls back to Playwright for JS-rendered pages. |
| `fetch` | Force a plain HTTP fetch. Fastest, but misses content rendered by JavaScript. |
| `playwright` | Force a headless browser. Use for SPAs or pages that require JavaScript to render. |
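
For scripts that want explicit control over the fallback instead of relying on `auto`, the two forced modes can be chained by hand. The following is a minimal sketch and not a description of how `auto` is implemented: the function name `fetch_with_fallback` is hypothetical, and it assumes that a failed plain fetch shows up as a non-zero exit status or as empty output.

```shell
# Hypothetical helper: try the fast plain-HTTP mode first and fall back to
# Playwright only if that attempt fails or prints nothing.
# $1 = URL to fetch. The Markdown is written to stdout.
fetch_with_fallback() {
  url="$1"
  # Attempt 1: plain HTTP fetch (fast, no JavaScript rendering).
  if out="$(npx @arabold/docs-mcp-server@latest fetch-url "$url" --scrape-mode fetch)" \
     && [ -n "$out" ]; then
    printf '%s\n' "$out"
  else
    # Attempt 2: headless browser for JavaScript-rendered pages.
    npx @arabold/docs-mcp-server@latest fetch-url "$url" --scrape-mode playwright
  fi
}
```

It could be called as `fetch_with_fallback https://some-spa.dev/docs > page.md`. In most cases the default `auto` mode should make this unnecessary.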

Examples

Fetch a documentation page

npx @arabold/docs-mcp-server@latest fetch-url https://react.dev/reference/react/useEffect

Fetch with custom auth header

npx @arabold/docs-mcp-server@latest fetch-url https://docs.internal.com/api \
  --header "Authorization: Bearer tok_xxx"

Force Playwright for a JS-heavy page

npx @arabold/docs-mcp-server@latest fetch-url https://some-spa.dev/docs --scrape-mode playwright

Disable redirect following

npx @arabold/docs-mcp-server@latest fetch-url https://example.com/old-page --no-follow-redirects

Output

The command writes the converted Markdown text directly to stdout. The global `--output` flag is accepted but has no effect because the result is already plain text, not structured data.
Diagnostics and errors go to stderr and are suppressed by default in non-interactive sessions. Use `--verbose` (or set `LOG_LEVEL=INFO`) to re-enable them. Use `--quiet` to suppress all non-error diagnostics regardless of session type.
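
Because the Markdown goes to stdout and diagnostics go to stderr, the two streams can be redirected independently. A minimal sketch of that separation follows; the function name `fetch_with_log` and the file names in the usage line are placeholders chosen for illustration.

```shell
# Hypothetical helper: save the Markdown and the debug log to separate files.
# $1 = URL, $2 = Markdown output file, $3 = log file.
fetch_with_log() {
  # --verbose turns diagnostics on; they arrive on stderr (fd 2),
  # while the converted Markdown arrives on stdout (fd 1).
  npx @arabold/docs-mcp-server@latest fetch-url "$1" --verbose >"$2" 2>"$3"
}
```

For example, `fetch_with_log https://react.dev/reference/react/useEffect page.md fetch.log` would leave only Markdown in `page.md`, so the file stays clean even with debug logging enabled.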

Tips

  • Pipe the output to a file if you want to save it:
    npx @arabold/docs-mcp-server@latest fetch-url <url> > page.md
  • Combine with search: fetch a page to read its full content after `search` returns a relevant URL.
  • For pages behind authentication, use `--header` to pass cookies or tokens.
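
The first and last tips combine naturally in a script. The sketch below assumes the token is kept in an environment variable named `DOCS_TOKEN`; that variable and the function name `fetch_private` are hypothetical, while the `--header` flag and the redirect to a file come from the tips above.

```shell
# Hypothetical helper: fetch an authenticated page and save it as Markdown.
# $1 = URL, $2 = output file. Reads the bearer token from DOCS_TOKEN
# (an assumed variable name) so it does not have to be typed inline.
fetch_private() {
  if [ -z "${DOCS_TOKEN:-}" ]; then
    echo "DOCS_TOKEN is not set" >&2
    return 1
  fi
  npx @arabold/docs-mcp-server@latest fetch-url "$1" \
    --header "Authorization: Bearer ${DOCS_TOKEN}" >"$2"
}
```

A call might look like `DOCS_TOKEN=tok_xxx fetch_private https://docs.internal.com/api api.md`. Keeping the token in the environment avoids hard-coding it in the script, although it is still passed to the command as an argument.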