exa-search
Searching with Exa (Search API)
This skill is a recipe for consistent web research using Exa’s Search API: you choose the right search mode, apply the right filters, request only the content you need (highlights/text/summary), and return clean citations.
Quick start (default path)
- Ensure an API key is available as an environment variable:
  - `EXA_API_KEY` (preferred)
  - or pass `--api-key` to the scripts.
- Run a search (JSON response to stdout):
```bash
python {baseDir}/scripts/exa_search.py --query "latest research in LLMs" --type auto --category "research paper" --num-results 5 --highlights --highlights-per-url 3 --num-sentences 2
```
Operating principles (always follow)
- Always return URLs. Prefer also returning title + a 1–2 sentence “why this source” note.
- Prefer `highlights` first for agentic workflows. Escalate to full `text` only when necessary.
- Use filters aggressively: domain allowlists, date windows, and `category` improve relevance and reduce noise.
- Freshness is explicit: if the user asks for “latest”, “today”, “current”, etc., set a freshness policy (see below).
- Don’t overfetch: cap content length with `maxCharacters` when requesting `text`.
- If the user needs hard evidence (numbers, quotes), fetch full text for the top 1–3 pages and verify.
Workflow
Step 1 — Translate the user task into a search plan
Decide:
- Search type
  - `auto` (default): best general quality.
  - `instant`: lowest latency, for autocomplete / live suggestions.
  - `deep`: more comprehensive; can use `additionalQueries`.
  - `fast` / `neural`: streamlined alternatives.
- Category (when appropriate)
  - `news`, `research paper`, `company`, `people`, `tweet`, `personal site`, `financial report`, etc.
- Freshness
  - If “real-time / latest”: consider live crawling via `maxAgeHours` (see Freshness).
  - If “historical/static”: use cache only (e.g., `maxAgeHours: -1`).
- Content mode
  - `highlights`: token-efficient evidence snippets.
  - `text`: deep reading (cap via `maxCharacters`).
  - `summary`: quick structured overviews (optionally with a guiding `query`).
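The four decisions above can be recorded in a small plan object before any request is built. The following is a minimal sketch, not part of the skill: `SearchPlan`, `plan_search`, and the keyword lists are hypothetical names and heuristics chosen for illustration, and the naive substring matching would need tuning for real queries.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical keyword lists (assumptions, not from the skill); tune for your users.
FRESH_WORDS = ("latest", "today", "current", "real-time")
NEWS_WORDS = ("news", "headline", "announcement")
PAPER_WORDS = ("paper", "research", "study", "arxiv")


@dataclass
class SearchPlan:
    search_type: str = "auto"            # auto | instant | deep | fast | neural
    category: Optional[str] = None       # e.g. "news", "research paper"
    max_age_hours: Optional[int] = None  # None = leave the API default
    content_mode: str = "highlights"     # highlights | text | summary


def plan_search(task: str, needs_evidence: bool = False) -> SearchPlan:
    """Map a free-text task onto the four Step 1 decisions."""
    words = task.lower()
    plan = SearchPlan()
    if any(w in words for w in NEWS_WORDS):
        plan.category = "news"
    elif any(w in words for w in PAPER_WORDS):
        plan.category = "research paper"
    if any(w in words for w in FRESH_WORDS):
        plan.max_age_hours = 24  # daily-fresh; lower it for near-real-time
    if needs_evidence:
        plan.content_mode = "text"  # full text, to be capped via maxCharacters
    return plan
```

Keeping the plan separate from the payload makes it easy to log or review the choices before spending a request.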
Step 2 — Build the request payload
Start from this template and fill only what you need:
```json
{
  "query": "...",
  "type": "auto",
  "category": "news",
  "numResults": 10,
  "includeDomains": ["..."],
  "excludeDomains": ["..."],
  "startPublishedDate": "2025-01-01T00:00:00.000Z",
  "endPublishedDate": "2025-12-31T23:59:59.999Z",
  "includeText": ["must contain phrase"],
  "excludeText": ["must not contain phrase"],
  "contents": {
    "highlights": true,
    "text": { "maxCharacters": 8000, "includeHtmlTags": false },
    "summary": { "query": "..." },
    "subpages": 0,
    "extras": { "links": 0, "imageLinks": 0 },
    "maxAgeHours": 24
  }
}
```
Notes:
- `contents` is optional. If omitted, you’ll only get metadata (`title`, `url`, etc.).
- `maxAgeHours` controls when Exa should live-crawl vs use cached content (see below).
- `context` is deprecated; use `highlights` or `text` instead.
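One way to "fill only what you need" is to build the payload in code and leave unset fields out entirely. This is a sketch under assumptions: `build_payload` and its parameter names are hypothetical, it covers only a subset of the template, and the JSON keys are the ones shown in the template above.

```python
from typing import Any, Optional


def build_payload(
    query: str,
    search_type: str = "auto",
    num_results: int = 10,
    category: Optional[str] = None,
    include_domains: Optional[list] = None,
    start_published_date: Optional[str] = None,
    highlights: bool = True,
    text_max_characters: Optional[int] = None,
    max_age_hours: Optional[int] = None,
) -> dict:
    """Build a search payload, omitting every field that was not set."""
    payload: dict[str, Any] = {
        "query": query,
        "type": search_type,
        "numResults": num_results,
    }
    optional = {
        "category": category,
        "includeDomains": include_domains,
        "startPublishedDate": start_published_date,
    }
    # Drop None and empty values so they never reach the API.
    payload.update({k: v for k, v in optional.items() if v})

    contents: dict[str, Any] = {}
    if highlights:
        contents["highlights"] = True
    if text_max_characters is not None:
        contents["text"] = {"maxCharacters": text_max_characters}
    if max_age_hours is not None:  # 0 and -1 are meaningful values
        contents["maxAgeHours"] = max_age_hours
    if contents:
        payload["contents"] = contents
    return payload
```

Note the `is not None` check on `max_age_hours`: a plain truthiness test would silently drop `0`, which means "always livecrawl".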
Step 3 — Execute the request
Option A (recommended): use the bundled script so requests are consistent and validated.
```bash
python {baseDir}/scripts/exa_search.py --query "..." --highlights --num-results 10
```
Option B: call the HTTP endpoint directly.
```bash
curl --request POST --url https://api.exa.ai/search --header "content-type: application/json" --header "x-api-key: $EXA_API_KEY" --data '{"query":"...","type":"auto","numResults":5}'
```
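For callers who prefer Python over curl, the same request can be sketched with the standard library alone. This mirrors the URL and headers of the curl call in Option B; `build_request` and `search` are hypothetical helper names, not the bundled script's interface, and error handling is left out.

```python
import json
import os
import urllib.request

EXA_SEARCH_URL = "https://api.exa.ai/search"  # same endpoint as the curl example


def build_request(payload: dict, api_key: str) -> urllib.request.Request:
    """Prepare (but do not send) a POST request carrying the JSON payload."""
    return urllib.request.Request(
        EXA_SEARCH_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"content-type": "application/json", "x-api-key": api_key},
        method="POST",
    )


def search(payload: dict, timeout: float = 30.0) -> dict:
    """Send the request and return the decoded JSON response."""
    api_key = os.environ["EXA_API_KEY"]  # raises KeyError if the variable is unset
    request = build_request(payload, api_key)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

Splitting request construction from sending keeps the payload and headers inspectable without network access.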
Step 4 — Post-process results into an answer with citations
- De-duplicate near-identical domains/pages when the user wants breadth.
- Select the top sources (usually 3–7) that jointly cover the claim space.
- For each selected result, extract:
  - title, url
  - key highlight(s) or a short quote from `text`
  - published date (if available)
- Write the response with inline citations (URLs) and clear uncertainty where needed.
- If the user wants a deliverable (report, memo), preserve a “Sources” section listing all URLs.
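The de-duplication and extraction steps above could look roughly like this. It is a sketch that assumes each result is a dict with `url` and optional `title`, `publishedDate`, and `highlights` keys; check the field names against the actual response before relying on them. `select_sources` and `format_citation` are hypothetical helpers.

```python
from urllib.parse import urlparse


def select_sources(results: list, limit: int = 5) -> list:
    """Keep the first (highest-ranked) result per domain, up to `limit`."""
    seen = set()
    picked = []
    for result in results:
        domain = urlparse(result["url"]).netloc.lower().removeprefix("www.")
        if domain in seen:
            continue  # a higher-ranked page from this domain is already kept
        seen.add(domain)
        picked.append(result)
        if len(picked) == limit:
            break
    return picked


def format_citation(result: dict) -> str:
    """Render one source line: title, URL, optional date and first highlight."""
    title = result.get("title") or result["url"]
    date = (result.get("publishedDate") or "")[:10]  # keep YYYY-MM-DD only
    highlights = result.get("highlights") or []
    line = f"- {title} ({result['url']})"
    if date:
        line += f", published {date}"
    if highlights:
        line += f': "{highlights[0].strip()}"'
    return line
```

Domain-level de-duplication suits breadth-oriented questions; for depth on a single site, de-duplicating on the full URL instead would be the more natural choice.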
Freshness policy (use this when “latest/current/today” appears)
Use `contents.maxAgeHours` (or the top-level `maxAgeHours` alias if the API accepts it):
- `24`: daily-fresh content (use cache if <24h else livecrawl)
- `1`: near-real-time (cache if <1h else livecrawl)
- `0`: always livecrawl (slowest, most current)
- `-1`: never livecrawl (fastest; cache only)
- omit: default behaviour (livecrawl only when cache missing)
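To make the rule concrete, the table above can be read as a single decision function. This is only a model of the behaviour as described in this list, not Exa's implementation; `should_livecrawl` and its arguments are hypothetical, and the handling of a missing cache for positive values is an assumption.

```python
from typing import Optional


def should_livecrawl(cache_age_hours: Optional[float],
                     max_age_hours: Optional[int]) -> bool:
    """Return True when a page would be fetched live rather than from cache.

    cache_age_hours: age of the cached copy, or None if nothing is cached.
    max_age_hours:   the maxAgeHours setting, or None if it was omitted.
    """
    if max_age_hours == -1:
        return False  # cache only, never livecrawl
    if cache_age_hours is None:
        return True   # nothing cached, so a live fetch is the only option
    if max_age_hours is None:
        return False  # default: a cached copy exists, so use it
    return cache_age_hours >= max_age_hours  # 0 always crawls


# A 5-hour-old cached page under each policy:
for setting in (24, 1, 0, -1, None):
    print(setting, should_livecrawl(5.0, setting))
```

Under this reading, a 5-hour-old copy is served from cache for `24`, `-1`, and the default, and re-crawled for `1` and `0`.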
Common patterns
Pattern A — “Give me sources for X” (fast + token efficient)
- `type: auto`, `numResults: 5–10`
- `contents.highlights: true`
- Optional: `category` and `includeDomains`
Pattern B — “Do deep research on X” (read a few pages thoroughly)
- Start with highlights on 10–20 results.
- Then fetch full `text` for the top 3–5 URLs with a `maxCharacters` cap.
- Summarise with citations.
Pattern C — “Latest news about X”
- `category: news`
- Apply a date window (`startPublishedDate`) if the question is time-bound.
- Use a freshness setting (often `maxAgeHours: 1–24`).
Pattern D — “Find a company / person page”
- `category: company` or `people`
- If using `people`, allowlist LinkedIn domains when needed.
- IMPORTANT: some filters are unsupported for `company` / `people`; see troubleshooting.
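Patterns A to D can be kept as reusable presets so each request starts from a known-good base. The sketch below is illustrative only: the preset names, the `from_pattern` helper, and the specific values picked from the stated ranges (for example `numResults: 5` out of 5–10) are assumptions.

```python
import copy

# Hypothetical presets; values are picked from the ranges given in the patterns.
PATTERNS = {
    "sources": {"type": "auto", "numResults": 5, "contents": {"highlights": True}},
    "deep_research": {"numResults": 20, "contents": {"highlights": True}},
    "latest_news": {"category": "news", "contents": {"maxAgeHours": 24}},
    "company_page": {"category": "company"},
}


def from_pattern(name: str, query: str, **overrides) -> dict:
    """Copy a preset, attach the query, then apply top-level overrides."""
    payload = copy.deepcopy(PATTERNS[name])  # never mutate the shared preset
    payload["query"] = query
    payload.update(overrides)  # shallow: passing contents= replaces the whole block
    return payload


news = from_pattern(
    "latest_news",
    "central bank rate decision",
    startPublishedDate="2025-01-01T00:00:00.000Z",
)
```

The deep copy matters because the presets contain nested `contents` dicts; without it, tightening `maxAgeHours` on one request would leak into every later one.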
Troubleshooting
401 / 403 (auth)
- Confirm `x-api-key` header is present and valid.
- Confirm you aren’t accidentally using a placeholder like `YOUR-EXA-API-KEY`.
400 (invalid parameters)
- `company` and `people` categories support a limited set of filters; unsupported parameters can trigger 400 errors.
- If in doubt, remove date and text filters first, then re-add one-by-one.
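The "remove, then re-add one by one" advice can be turned into a sequence of trial payloads. This is a sketch: `debug_payloads` is a hypothetical helper, and `FILTER_KEYS` lists only the date and text filters from the Step 2 template. Send each payload in order; the first step that returns a 400 names the offending filter.

```python
FILTER_KEYS = ("startPublishedDate", "endPublishedDate", "includeText", "excludeText")


def debug_payloads(payload: dict):
    """Yield (label, payload) pairs: filters stripped first, then restored one at a time."""
    base = {k: v for k, v in payload.items() if k not in FILTER_KEYS}
    yield "no date/text filters", base
    current = base
    for key in FILTER_KEYS:
        if key in payload:
            current = {**current, key: payload[key]}  # new dict, earlier steps stay intact
            yield f"+ {key}", current


failing = {
    "query": "acme corp",
    "category": "company",
    "startPublishedDate": "2025-01-01T00:00:00.000Z",
    "includeText": ["acme"],
}
for label, trial in debug_payloads(failing):
    print(label, sorted(trial))
```

Filters are restored cumulatively, so the final trial equals the original payload and every intermediate step differs from the previous one by exactly one key.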
Too much content / token blow-ups
- Prefer `highlights` over `text`.
- Cap `text.maxCharacters`.
- Reduce `numResults`.
Bundled references
- API + parameter cheat sheet: `references/exa-search-api.md`
- Best-practice recipes: `references/exa-search-best-practices.md`
- Quickstart snippets (SDK + curl): `references/exa-search-quickstart.md`