web-doc-resolver

Web Documentation Resolver

Resolve query or URL inputs into compact, high-signal markdown for agents and RAG systems using an intelligent cascade.

When to Use This Skill

Activate this skill when you need to:

- Fetch and parse documentation from a URL
- Search for technical information across the web
- Build context from web sources
- Extract markdown from websites
- Query for technical documentation, APIs, or code examples

Platform Tool Mapping

This skill works across multiple platforms. Use the appropriate tools for your platform:

| Platform | Fetch Tool | Search Tool |
| --- | --- | --- |
| opencode | `webfetch` | `websearch` |
| claude code | `WebFetch` (MCP) | `WebSearch` (MCP) |
| blackbox | `web_fetch` | `web_search` |
| Python script | Auto-detects available tools | Auto-detects available tools |
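
The mapping above can be held in a small lookup table. The following is a minimal sketch, not the skill's own code: the tool names come from the table, while the dict layout and the `tool_for` helper are hypothetical.

```python
# Tool names copied from the table above; the dict layout and helper are illustrative.
PLATFORM_TOOLS = {
    "opencode": {"fetch": "webfetch", "search": "websearch"},
    "claude code": {"fetch": "WebFetch", "search": "WebSearch"},
    "blackbox": {"fetch": "web_fetch", "search": "web_search"},
}


def tool_for(platform: str, action: str) -> str:
    """Return the tool name for a platform and an action ("fetch" or "search")."""
    try:
        return PLATFORM_TOOLS[platform.strip().lower()][action]
    except KeyError:
        raise ValueError(f"no {action!r} tool known for platform {platform!r}") from None
```

For example, `tool_for("opencode", "fetch")` returns `"webfetch"`, and an unknown platform raises `ValueError` rather than silently picking a tool.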

Cascade Resolution Strategy

For URL inputs

Use this cascade (in order):

1. Check llms.txt first: Probe `https://origin/llms.txt` for site-provided structured documentation (free, always check first)
2. Fetch URL: Use the platform's fetch tool to get markdown content
3. Search fallback: Use the platform's search tool to find cached/mirrored versions if direct fetch fails
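
The URL cascade can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the code in `scripts/resolve.py`: `fetch` and `search` are hypothetical callables standing in for the platform tools, each returning markdown or `None` on failure, and the first non-empty result is assumed to win.

```python
from typing import Callable, Optional
from urllib.parse import urlsplit

# Hypothetical tool signatures: return markdown text, or None on failure.
Fetch = Callable[[str], Optional[str]]
Search = Callable[[str], Optional[str]]


def llms_txt_url(url: str) -> str:
    """Build the llms.txt probe URL from the scheme and host of `url`."""
    parts = urlsplit(url)
    return f"{parts.scheme}://{parts.netloc}/llms.txt"


def resolve_url(url: str, fetch: Fetch, search: Search) -> Optional[str]:
    """Try llms.txt, then the URL itself, then fall back to search."""
    for candidate in (llms_txt_url(url), url):
        content = fetch(candidate)
        if content:
            return content
    # Direct fetches failed or returned nothing: look for cached/mirrored copies.
    return search(url)
```

Passing the tools in as arguments keeps the cascade independent of any one platform and makes it easy to exercise with stub functions.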

For query inputs

Use this cascade (in order):

1. Search first: Use the platform's search tool with a relevant query (fast, free)
2. Fetch top results: Use the fetch tool to get markdown from the top search results if needed
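
A possible shape for the query cascade is shown below. It is a sketch only: the `(url, snippet)` result shape, the `top_n` and `min_chars` parameters, and the rule for deciding when fetching is "needed" are all assumptions made for illustration.

```python
from typing import Callable, List, Optional, Tuple

SearchResult = Tuple[str, str]  # (url, snippet); a hypothetical result shape


def resolve_query(
    query: str,
    search: Callable[[str], List[SearchResult]],
    fetch: Callable[[str], Optional[str]],
    top_n: int = 2,
    min_chars: int = 200,
) -> str:
    """Search first; fetch the top results only when the snippets are too thin."""
    results = search(query)
    snippets = "\n\n".join(snippet for _, snippet in results)
    if len(snippets) >= min_chars:
        return snippets
    # Snippets alone are not enough context: pull full pages for the top hits.
    pages = [fetch(url) for url, _ in results[:top_n]]
    fetched = "\n\n".join(page for page in pages if page)
    return fetched or snippets
```

If every fetch fails, the function still returns the search snippets, so a query never comes back empty just because the follow-up fetches did.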

Implementation

Python Script (scripts/resolve.py)

The skill includes a Python script that auto-detects available tools:

```bash
# Resolve a URL
python scripts/resolve.py "https://docs.rust-lang.org/book/"

# Resolve a query
python scripts/resolve.py "Rust async programming"

# JSON output
python scripts/resolve.py "query" --json

# Custom max chars
python scripts/resolve.py "query" --max-chars 4000

# Force specific backend
python scripts/resolve.py "query" --backend httpx
```
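
For readers who want to see how flags like these fit together, here is a minimal `argparse` sketch. It mirrors only the flags shown in the commands above; the defaults, help strings, and the `build_parser` name are assumptions, and the real `scripts/resolve.py` may parse its arguments differently.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Parser mirroring the documented flags; defaults here are assumptions."""
    parser = argparse.ArgumentParser(description="Resolve a query or URL into markdown.")
    parser.add_argument("input", help="URL or search query")
    parser.add_argument("--json", action="store_true", help="emit JSON instead of markdown")
    parser.add_argument("--max-chars", type=int, default=None, help="cap on output length")
    parser.add_argument("--backend", default=None, help="force a backend; auto-detect when omitted")
    return parser
```

Parsing `["query", "--max-chars", "4000"]` with this parser yields `max_chars == 4000`, with `json` left `False` and `backend` left `None`.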

Direct Tool Usage by Platform

opencode

```bash
# Check for llms.txt
webfetch https://example.com/llms.txt

# Fetch URL
webfetch --format markdown https://docs.rust-lang.org/book/

# Search
websearch "Rust book documentation"
```

claude code (MCP)

```python
# Check for llms.txt
WebFetch(url="https://example.com/llms.txt")

# Fetch URL
WebFetch(url="https://docs.rust-lang.org/book/")

# Search
WebSearch(query="Rust book documentation")
```

blackbox

```python
# Check for llms.txt
web_fetch(url="https://example.com/llms.txt", prompt="Extract all content")

# Fetch URL
web_fetch(url="https://docs.rust-lang.org/book/", prompt="Extract main content")

# Search
web_search(query="Rust book documentation")
```

Usage Examples

Basic URL Resolution

```bash
# Using Python script (auto-detects backend)
python scripts/resolve.py "https://docs.rust-lang.org/book/"

# Or use platform tool directly
webfetch https://docs.rust-lang.org/book/ # opencode
```

Query Resolution

```bash
# Using Python script
python scripts/resolve.py "Rust async programming best practices 2026"

# Or use platform tool directly
websearch "Tokio runtime configuration options" # opencode
```

Workflow for Building Context

1. Check for llms.txt first: Probe `https://origin/llms.txt`
2. Fetch content: Use the fetch tool to get markdown from the URL
3. Search if needed: Use the search tool for additional context or when fetch fails

Best Practices

- Check for llms.txt first: Many documentation sites have `/llms.txt` for structured content
- Use specific queries: "rust tokio spawn vs spawn_blocking difference" gets better results than "rust tokio"
- Filter by date: Add "2025" or "2026" to queries for current information
- Prefer official docs: Always check official documentation first
- Try multiple sources: If one URL fails, search for alternative mirrors

Quality Indicators

Good content has:

- Code examples with language markers
- API signatures and type annotations
- Configuration examples
- Version information
- Clear headings and structure

Poor content has:

- Excessive boilerplate/navigation
- Paywall blocks
- Login requirements
- Heavy advertising
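
Some of these indicators can be checked mechanically. The sketch below is a crude heuristic, not part of the skill: it scores only fenced code blocks with a language marker, heading count, and a few login/paywall phrases. The phrase list, the weights, and the `quality_score` name are illustrative assumptions.

```python
import re

# A fence line that carries a language marker, e.g. three backticks plus "python".
FENCE_WITH_LANG = re.compile(r"^`{3}[A-Za-z0-9_+-]+\s*$", re.MULTILINE)
# An ATX-style markdown heading such as "## Usage".
HEADING = re.compile(r"^#{1,6} \S", re.MULTILINE)
# Illustrative paywall/login phrases; a real list would need tuning.
BAD_PHRASES = ("subscribe to continue", "sign in to continue", "log in to continue")


def quality_score(markdown: str) -> int:
    """Crude score: +1 per positive signal found, -1 per negative phrase found."""
    score = 0
    if FENCE_WITH_LANG.search(markdown):
        score += 1
    if len(HEADING.findall(markdown)) >= 2:
        score += 1
    lowered = markdown.lower()
    score -= sum(phrase in lowered for phrase in BAD_PHRASES)
    return score
```

A score like this is only useful for ranking candidates against each other, for instance when choosing between a fetched page and a mirror of it. Note that `#` comment lines inside code blocks are also counted as headings, which is one reason to treat the number as a rough signal.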

Error Handling

- Provider failures should trigger cascade fallback
- Use alternative sources when primary sources fail
- Log errors for debugging
- Fall back to search when direct fetch fails
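
One way to combine fallback and logging is a loop over providers that records each failure and moves on. This is a sketch under assumptions: the `first_success` helper, the `(name, callable)` provider shape, and the logger name are hypothetical and not taken from `scripts/resolve.py`.

```python
import logging
from typing import Callable, Iterable, Optional, Tuple

logger = logging.getLogger("web_doc_resolver")  # logger name is an assumption

Provider = Tuple[str, Callable[[str], Optional[str]]]  # (name, callable)


def first_success(target: str, providers: Iterable[Provider]) -> Optional[str]:
    """Call each provider in order; log failures and fall through to the next."""
    for name, provider in providers:
        try:
            content = provider(target)
        except Exception as exc:  # broad on purpose: any failure triggers fallback
            logger.warning("provider %s failed for %s: %s", name, target, exc)
            continue
        if content:
            return content
        logger.info("provider %s returned nothing for %s", name, target)
    return None
```

Ordering the list as direct fetch first and search last gives the "fall back to search when direct fetch fails" behavior, and the log lines show which provider failed and why.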

Testing

Run tests:

```bash
cd .agents/skills/web-doc-resolver
python -m pytest tests/ -v
```

Run samples:

```bash
python samples/sample_basic.py
python samples/sample_json.py
```

Files

- `scripts/resolve.py` - Main implementation (multi-backend)
- `tests/test_resolve.py` - Unit tests
- `samples/sample_basic.py` - Basic usage examples
- `samples/sample_json.py` - JSON output examples
- `reference.md` - Detailed reference documentation
