web-doc-resolver
Web Documentation Resolver
Resolve query or URL inputs into compact, high-signal markdown for agents and RAG systems using an intelligent cascade.
When to Use This Skill
Activate this skill when you need to:
- Fetch and parse documentation from a URL
- Search for technical information across the web
- Build context from web sources
- Extract markdown from websites
- Query for technical documentation, APIs, or code examples
Platform Tool Mapping
This skill works across multiple platforms. Use the appropriate tools for your platform:
| Platform | Fetch Tool | Search Tool |
|---|---|---|
| opencode | `webfetch` | `websearch` |
| claude code | `WebFetch` | `WebSearch` |
| blackbox | `web_fetch` | `web_search` |
| Python script | Auto-detects available tools | Auto-detects available tools |
Cascade Resolution Strategy
For URL inputs
Use this cascade (in order):
- Check llms.txt first: Probe `https://origin/llms.txt` for site-provided structured documentation (free, always check first)
- Fetch URL: Use the platform's fetch tool to get markdown content
- Search fallback: Use the platform's search tool to find cached/mirrored versions if direct fetch fails
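The URL cascade above can be sketched in a few lines of Python. This is a minimal illustration, not the code in scripts/resolve.py: `llms_txt_url`, `resolve_url`, and the `fetch`/`search` callables are hypothetical names, and each callable is assumed to return markdown text or `None` on failure.

```python
from typing import Callable, Optional
from urllib.parse import urlsplit

# Hypothetical signatures: each callable returns markdown text, or None on failure.
Fetch = Callable[[str], Optional[str]]
Search = Callable[[str], Optional[str]]


def llms_txt_url(url: str) -> str:
    """Build https://origin/llms.txt from any URL on the same site."""
    parts = urlsplit(url)
    return f"{parts.scheme}://{parts.netloc}/llms.txt"


def resolve_url(url: str, fetch: Fetch, search: Search) -> Optional[str]:
    """Try llms.txt, then the URL itself, then a search fallback."""
    for target in (llms_txt_url(url), url):
        content = fetch(target)
        if content:
            return content
    # Direct fetch failed: look for cached or mirrored copies via search.
    return search(url)
```

Passing the tools in as callables keeps the sketch platform-neutral: the same function would work with whichever fetch and search tools the platform provides.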
For query inputs
Use this cascade (in order):
- Search first: Use platform's search tool with relevant query (fast, free)
- Fetch top results: Use fetch tool to get markdown from top search results if needed
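As a rough sketch of the query cascade, the function below searches first and only fetches full pages when the search snippets look too thin. The names, the `(url, snippet)` result shape, and the `min_chars` threshold are all assumptions made for illustration; the source does not specify when fetching is "needed".

```python
from typing import Callable, List, Optional, Tuple

SearchResult = Tuple[str, str]  # hypothetical shape: (url, snippet)


def resolve_query(
    query: str,
    search: Callable[[str], List[SearchResult]],
    fetch: Callable[[str], Optional[str]],
    min_chars: int = 200,
    top_n: int = 3,
) -> str:
    """Search first; fetch the top results only if the snippets are too short."""
    results = search(query)
    snippets = "\n\n".join(snippet for _, snippet in results)
    if len(snippets) >= min_chars:
        return snippets
    # Snippets alone are thin: pull markdown from the top results.
    pages = []
    for url, _ in results[:top_n]:
        page = fetch(url)
        if page:
            pages.append(page)
    return "\n\n".join(pages) if pages else snippets
```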
Implementation
Python Script (scripts/resolve.py)
The skill includes a Python script that auto-detects available tools:
```bash
# Resolve a URL
python scripts/resolve.py "https://docs.rust-lang.org/book/"

# Resolve a query
python scripts/resolve.py "Rust async programming"

# JSON output
python scripts/resolve.py "query" --json

# Custom max chars
python scripts/resolve.py "query" --max-chars 4000

# Force specific backend
python scripts/resolve.py "query" --backend httpx
```
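The script takes a URL or a query as the same positional argument, so something has to tell the two apart before choosing a cascade. How scripts/resolve.py does this is not shown here; one plausible check, using only the standard library, might look like the sketch below (`is_url` is a hypothetical helper).

```python
from urllib.parse import urlsplit


def is_url(text: str) -> bool:
    """Treat input as a URL only if it has an http(s) scheme and a host."""
    try:
        parts = urlsplit(text.strip())
    except ValueError:
        # Malformed input (e.g. a broken IPv6 literal) is treated as a query.
        return False
    return parts.scheme in ("http", "https") and bool(parts.netloc)
```

Anything that fails the check would be routed to the query cascade instead.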
Direct Tool Usage by Platform
opencode
```bash
# Check for llms.txt
webfetch https://example.com/llms.txt

# Fetch URL
webfetch --format markdown https://docs.rust-lang.org/book/

# Search
websearch "Rust book documentation"
```
claude code (MCP)
```python
# Check for llms.txt
WebFetch(url="https://example.com/llms.txt")

# Fetch URL
WebFetch(url="https://docs.rust-lang.org/book/")

# Search
WebSearch(query="Rust book documentation")
```
blackbox
```python
# Check for llms.txt
web_fetch(url="https://example.com/llms.txt", prompt="Extract all content")

# Fetch URL
web_fetch(url="https://docs.rust-lang.org/book/", prompt="Extract main content")

# Search
web_search(query="Rust book documentation")
```
Usage Examples
Basic URL Resolution
```bash
# Using Python script (auto-detects backend)
python scripts/resolve.py "https://docs.rust-lang.org/book/"

# Or use platform tool directly
webfetch https://docs.rust-lang.org/book/ # opencode
```
Query Resolution
```bash
# Using Python script
python scripts/resolve.py "Rust async programming best practices 2026"

# Or use platform tool directly
websearch "Tokio runtime configuration options" # opencode
```
Workflow for Building Context
- Check for llms.txt first: Probe `https://origin/llms.txt`
- Fetch content: Use the fetch tool to get markdown from the URL
- Search if needed: Use the search tool for additional context or when fetch fails
Best Practices
- Check for llms.txt first: Many documentation sites have `/llms.txt` for structured content
- Use specific queries: "rust tokio spawn vs spawn_blocking difference" gets better results than "rust tokio"
- Filter by date: Add "2025" or "2026" to queries for current information
- Prefer official docs: Always check official documentation first
- Try multiple sources: If one URL fails, search for alternative mirrors
Quality Indicators
Good content has:
- Code examples with language markers
- API signatures and type annotations
- Configuration examples
- Version information
- Clear headings and structure
Poor content has:
- Excessive boilerplate/navigation
- Paywall blocks
- Login requirements
- Heavy advertising
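Some of these indicators can be approximated mechanically. The sketch below is one possible heuristic, not part of the skill: `quality_signals`, `looks_good`, the regexes, and the `BLOCK_PHRASES` list are all illustrative assumptions, and a real check would need tuning against actual pages.

```python
import re

# Hypothetical phrases that hint at a paywall or login wall.
BLOCK_PHRASES = ("subscribe to continue", "sign in to continue", "log in to view")


def quality_signals(markdown: str) -> dict:
    """Count a few cheap signals of good or poor content in a markdown string."""
    lowered = markdown.lower()
    return {
        # Opening code fences that carry a language marker.
        "tagged_code_blocks": len(re.findall(r"^`{3}\w+", markdown, flags=re.MULTILINE)),
        # ATX-style headings.
        "headings": len(re.findall(r"^#{1,6}\s", markdown, flags=re.MULTILINE)),
        # Something that looks like a version number, such as 1.38 or v2.0.1.
        "has_version": bool(re.search(r"\bv?\d+\.\d+(?:\.\d+)?\b", markdown)),
        "blocked": any(phrase in lowered for phrase in BLOCK_PHRASES),
    }


def looks_good(markdown: str) -> bool:
    """Accept content that has structure and code, and no access wall."""
    signals = quality_signals(markdown)
    return (
        not signals["blocked"]
        and signals["headings"] >= 1
        and signals["tagged_code_blocks"] >= 1
    )
```

A result that fails `looks_good` could be treated like a failed fetch, so the cascade moves on to the next source.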
Error Handling
- Provider failures should trigger cascade fallback
- Use alternative sources when primary sources fail
- Log errors for debugging
- Fall back to search when direct fetch fails
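The fallback-and-log behaviour described above might be sketched as follows. This is an illustration under assumptions, not the script's actual error handling: `first_success`, the `Provider` tuple shape, and the logger name are hypothetical.

```python
import logging
from typing import Callable, Iterable, Optional, Tuple

logger = logging.getLogger("web_doc_resolver")  # hypothetical logger name

# Hypothetical shape: (provider name, zero-argument callable returning markdown or None).
Provider = Tuple[str, Callable[[], Optional[str]]]


def first_success(providers: Iterable[Provider]) -> Optional[str]:
    """Run providers in order, logging each failure and falling through to the next."""
    for name, call in providers:
        try:
            content = call()
        except Exception as exc:
            # Log for debugging, then let the cascade continue.
            logger.warning("provider %s failed: %s", name, exc)
            continue
        if content:
            return content
        logger.info("provider %s returned no content", name)
    return None
```

Listing the direct fetch before the search provider gives the "fall back to search when direct fetch fails" ordering.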
Testing
Run tests:
```bash
cd .agents/skills/web-doc-resolver
python -m pytest tests/ -v
```
Run samples:
```bash
python samples/sample_basic.py
python samples/sample_json.py
```
Files
- `scripts/resolve.py` - Main implementation (multi-backend)
- `tests/test_resolve.py` - Unit tests
- `samples/sample_basic.py` - Basic usage examples
- `samples/sample_json.py` - JSON output examples
- `reference.md` - Detailed reference documentation