web-search Skill

When to Use

ALWAYS invoke this skill for any task involving:
  • Web search
  • Research or deep investigation
  • Scraping or data extraction from websites
  • Finding latest/current information
  • Comparing options from web sources
  • Any "find on the web" or "research X" request
Never perform web research using only transient memory or single-channel tools.

Core Principles (MUST Follow)

  1. Browser-First Anti-Bot Strategy: Prioritize human-like browser scraping using agent-browser (or playwright-cli via browser-switch) for rendered pages to avoid blocks.
  2. Parallel Retrieval: Launch multiple subagents/workers in parallel (Task tool when available), each with a distinct role:
    • Discovery: Prefer searxng-search / searxng-extract when installed; otherwise native web search, web fetch, Tavily, exa, or other available discovery tools.
    • Scraping: agent-browser (human-like navigation + snapshots); human-search cascade as fallback.
    • Synthesis: deep-research patterns toward 90_synthesis.md.
    • One topic per research subagent: When the user asks about multiple distinct subjects, spawn separate subagents so each worker owns one topic only — each with its own docs/research/YYYYMMDD/{research_topic_slug}/ folder. Do not assign unrelated questions to the same research subagent (avoids cross-contaminated context, checkpoints, and citations).
  3. Immediate Persistence: After every single source (search result, URL extracted, page rendered), append a structured checkpoint to the numbered artifact for that phase (e.g. 10_discovery.md, 20_sources.md). Do not proceed to the next source until the checkpoint is written. Required fields, example block, and persistence edge cases → references/checkpoint-template.md.
  4. Dated Artifact Convention: Use docs/research/YYYYMMDD/{research_topic_slug}/ — for example docs/research/20260430/ai_coding_agents/. The research topic is carried by {research_topic_slug}, not baked into repeated strings inside filenames. Under that folder reuse the same phase filenames every time so layouts stay predictable. Typical files:
    • 00_plan.md
    • 10_discovery.md
    • 20_sources.md (checkpoints for fetched pages: browser, extract, etc.)
    • 90_synthesis.md
    • YYYYMMDD: Windows (PowerShell) → Get-Date -Format "yyyyMMdd"; Linux/macOS (shell) → date +%Y%m%d; generic (any OS) → Python datetime.date.today().strftime("%Y%m%d").
  5. Synthesis from Disk Only: The final answer and 90_synthesis.md must derive exclusively from persisted artifacts, never solely from conversation memory.
  6. Citations and Traceability: Follow deep-research patterns; citations must tie back to checkpoint entries and on-disk files.
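Principles 3 and 4 can be sketched in Python. The folder layout and phase filenames follow the convention above; the checkpoint fields shown are placeholders only — the authoritative field list lives in references/checkpoint-template.md:

```python
import datetime
from pathlib import Path

def research_dir(topic_slug: str) -> Path:
    """Build docs/research/YYYYMMDD/{research_topic_slug}/ for one topic."""
    day = datetime.date.today().strftime("%Y%m%d")  # generic, any OS
    root = Path("docs/research") / day / topic_slug
    root.mkdir(parents=True, exist_ok=True)
    return root

def append_checkpoint(phase_file: Path, entry: dict) -> None:
    """Append one structured checkpoint right after a source is handled."""
    lines = [f"- {key}: {value}" for key, value in entry.items()]
    with phase_file.open("a", encoding="utf-8") as f:
        f.write("\n".join(lines) + "\n\n")

# Record one fetched source before moving on to the next.
root = research_dir("ai_coding_agents")
append_checkpoint(root / "20_sources.md", {
    "url": "https://example.com/post",  # illustrative fields only; see
    "method": "agent-browser",          # references/checkpoint-template.md
    "status": "fetched",
})
```

A real worker would call append_checkpoint once per source, before fetching the next one, so a crash never loses more than the in-flight source.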

Workflow checklist

Operational order for a research run. Tool install and orchestration roles → See also — referenced skills; checkpoint layout and fields → references/checkpoint-template.md:
  • Receive research query
  • Partition distinct topics → one {research_topic_slug} folder each under docs/research/YYYYMMDD/
  • Launch parallel subagents (Task tool): one subagent per topic for research workers; within each topic, parallelize by role (discovery / scraping / synthesis) as needed
  • Instruct each parallel subagent to use the referenced web-scraping and research skills and to save dated research files and checkpoints after each finding to avoid losing context. Each subagent should follow this skill’s rules.
  • After each source fetch: immediately append checkpoint
  • On write failure: retry once then use fallback file
  • After sufficient sources: run synthesis using deep-research patterns
  • Produce final response with citations linking back to persisted files
  • Confirm every expected artifact exists on disk and treat the run as incomplete until both synthesis and the final answer are grounded in those files (not conversation memory alone)
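The "retry once then use fallback file" step can be sketched as follows; the ".fallback.md" naming is an assumption, since the checklist mandates a fallback file but not what to call it:

```python
from pathlib import Path

def persist_checkpoint(path: Path, text: str) -> Path:
    """Append checkpoint text; on failure retry once, then use a fallback file.

    Returns the path actually written, so citations can point at the real file.
    """
    # Try the intended file twice (initial attempt + one retry), then fall back.
    for target in (path, path, path.with_suffix(".fallback.md")):
        try:
            with target.open("a", encoding="utf-8") as f:
                f.write(text)
            return target
        except OSError:
            continue
    raise OSError(f"could not persist checkpoint near {path}")
```

Returning the written path matters for traceability: if the fallback file was used, the synthesis step must cite that file, not the one that failed.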

See also — referenced skills (install + orchestration roles)

Install missing skills by copying each repo’s skill folder into your agent’s skills directory. web-search orchestrates these tools rather than replacing them:
  • human-search: Primary intelligent cascade. Tiered fallback (native → Python scraper → browser CLI → crawl4ai); use as the default retrieval engine for robustness.
  • agent-browser: Primary anti-bot scraping. Workflow: open → snapshot -i → extract using refs → re-snapshot after changes. Use named sessions for parallel work. Strongest human-like resilience.
  • searxng-search + searxng-extract: Fast, free, unlimited local discovery and extraction (no API keys). Ideal for gathering initial URL candidates.
  • deep-research: Final synthesis phase — structured reporting, citation management, progressive disclosure, professional formatting for 90_synthesis.md.
  • browser-switch: Picks agent-browser, playwright-cli, or other browser backends based on context.
  • playwright-cli: Backup browser automation when agent-browser is unavailable or when specific Playwright features are needed.
Tip: After installing missing skills, tell your subagents the skill paths to use; otherwise they might not discover the skills until the agent restarts.
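The install step — copying a repo’s skill folder into the agent’s skills directory — can be sketched as below; both paths are placeholders to adjust for your agent’s actual layout:

```python
import shutil
from pathlib import Path

def install_skill(repo_skill_dir: str, agent_skills_dir: str) -> Path:
    """Copy one skill folder (e.g. a cloned repo's web-search/ directory)
    into the agent's skills directory, overwriting any stale copy."""
    src = Path(repo_skill_dir)
    dest = Path(agent_skills_dir) / src.name
    shutil.copytree(src, dest, dirs_exist_ok=True)
    return dest
```

For example, install_skill("repos/web-search/web-search", "~/.agent/skills") would place the folder where the agent scans for skills; the exact skills directory depends on your agent.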