exa-search


Searching with Exa (Search API)


This skill is a recipe for consistent web research using Exa’s Search API: you choose the right search mode, apply the right filters, request only the content you need (highlights/text/summary), and return clean citations.

Quick start (default path)


  1. Ensure an API key is available as an environment variable:
     • `EXA_API_KEY` (preferred)
     • or pass `--api-key` to the scripts.
  2. Run a search (JSON response to stdout):

```bash
python {baseDir}/scripts/exa_search.py \
  --query "latest research in LLMs" \
  --type auto \
  --category "research paper" \
  --num-results 5 \
  --highlights \
  --highlights-per-url 3 \
  --num-sentences 2
```

Operating principles (always follow)


  • Always return URLs. Prefer also returning title + a 1–2 sentence “why this source” note.
  • Prefer `highlights` first for agentic workflows. Escalate to full `text` only when necessary.
  • Use filters aggressively: domain allowlists, date windows, and `category` improve relevance and reduce noise.
  • Freshness is explicit: if the user asks for “latest”, “today”, “current”, etc., set a freshness policy (see below).
  • Don’t overfetch: cap content length with `maxCharacters` when requesting `text`.
  • If the user needs hard evidence (numbers, quotes), fetch full text for the top 1–3 pages and verify.

Workflow


Step 1 — Translate the user task into a search plan


Decide:
  1. Search type
     • `auto` (default): best general quality.
     • `instant`: lowest latency, for autocomplete / live suggestions.
     • `deep`: more comprehensive; can use `additionalQueries`.
     • `fast` / `neural`: streamlined alternatives.
  2. Category (when appropriate)
     • `news`, `research paper`, `company`, `people`, `tweet`, `personal site`, `financial report`, etc.
  3. Freshness
     • If “real-time / latest”: consider live crawling via `maxAgeHours` (see Freshness).
     • If “historical/static”: use cache only (e.g., `maxAgeHours: -1`).
  4. Content mode
     • `highlights`: token-efficient evidence snippets.
     • `text`: deep reading (cap via `maxCharacters`).
     • `summary`: quick structured overviews (optionally with a guiding `query`).

Step 2 — Build the request payload


Start from this template and fill only what you need:

```json
{
  "query": "...",
  "type": "auto",
  "category": "news",
  "numResults": 10,
  "includeDomains": ["..."],
  "excludeDomains": ["..."],
  "startPublishedDate": "2025-01-01T00:00:00.000Z",
  "endPublishedDate": "2025-12-31T23:59:59.999Z",
  "includeText": ["must contain phrase"],
  "excludeText": ["must not contain phrase"],
  "contents": {
    "highlights": true,
    "text": { "maxCharacters": 8000, "includeHtmlTags": false },
    "summary": { "query": "..." },
    "subpages": 0,
    "extras": { "links": 0, "imageLinks": 0 },
    "maxAgeHours": 24
  }
}
```

Notes:
  • `contents` is optional. If omitted, you’ll only get metadata (`title`, `url`, etc.).
  • `maxAgeHours` controls when Exa should live-crawl vs use cached content (see below).
  • `context` is deprecated; use `highlights` or `text` instead.

Step 3 — Execute the request


Option A (recommended): use the bundled script so requests are consistent and validated.

```bash
python {baseDir}/scripts/exa_search.py --query "..." --highlights --num-results 10
```

Option B: call the HTTP endpoint directly.

```bash
curl --request POST \
  --url https://api.exa.ai/search \
  --header "content-type: application/json" \
  --header "x-api-key: $EXA_API_KEY" \
  --data '{"query":"...","type":"auto","numResults":5}'
```
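Option B can also be done from plain Python with only the standard library. The sketch below is illustrative rather than part of the skill: `build_payload` and `search` are hypothetical helpers, and the endpoint URL and headers mirror the curl example above.

```python
import json
import urllib.request

EXA_ENDPOINT = "https://api.exa.ai/search"

def build_payload(query, *, category=None, num_results=5,
                  highlights=True, max_age_hours=None):
    """Assemble a minimal /search payload; set only the keys you need."""
    payload = {"query": query, "type": "auto", "numResults": num_results}
    if category:
        payload["category"] = category
    contents = {}
    if highlights:
        contents["highlights"] = True
    if max_age_hours is not None:
        contents["maxAgeHours"] = max_age_hours
    if contents:  # omit "contents" entirely to get metadata only
        payload["contents"] = contents
    return payload

def search(payload, api_key):
    """POST the payload to the Exa Search API and return the parsed JSON."""
    req = urllib.request.Request(
        EXA_ENDPOINT,
        data=json.dumps(payload).encode("utf-8"),
        headers={"content-type": "application/json", "x-api-key": api_key},
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.load(resp)

# Usage (requires a real key):
#   results = search(build_payload("latest research in LLMs",
#                                  category="research paper"),
#                    os.environ["EXA_API_KEY"])
```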

Step 4 — Post-process results into an answer with citations


  1. De-duplicate near-identical domains/pages when the user wants breadth.
  2. Select the top sources (usually 3–7) that jointly cover the claim space.
  3. For each selected result, extract:
     • title, url
     • key highlight(s) or a short quote from `text`
     • published date (if available)
  4. Write the response with inline citations (URLs) and clear uncertainty where needed.
  5. If the user wants a deliverable (report, memo), preserve a “Sources” section listing all URLs.
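Steps 1 and 5 above can be sketched as a pair of hypothetical Python helpers, assuming each result dict carries `title` and `url` and optionally `publishedDate` (the metadata fields named earlier):

```python
from urllib.parse import urlsplit

def dedupe_by_domain(results, per_domain=1):
    """Breadth mode: keep at most `per_domain` results per host."""
    counts, kept = {}, []
    for r in results:
        host = urlsplit(r["url"]).netloc.lower()
        if counts.get(host, 0) < per_domain:
            counts[host] = counts.get(host, 0) + 1
            kept.append(r)
    return kept

def sources_section(results):
    """Render a 'Sources' list with title, date, and URL for each result."""
    lines = []
    for r in results:
        date = r.get("publishedDate") or "n.d."
        lines.append(f"- {r.get('title') or 'untitled'} ({date}): {r['url']}")
    return "\n".join(lines)
```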

Freshness policy (use this when “latest/current/today” appears)


Use `contents.maxAgeHours` (or the `maxAgeHours` top-level alias if the API accepts it):
  • `24`: daily-fresh content (use cache if <24h, else livecrawl)
  • `1`: near-real-time (use cache if <1h, else livecrawl)
  • `0`: always livecrawl (slowest, most current)
  • `-1`: never livecrawl (fastest; cache only)
  • omit: default behaviour (livecrawl only when cache is missing)
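The mapping from user wording to a `maxAgeHours` value can be sketched as a small heuristic. This is a hypothetical helper; the keyword lists are illustrative, not exhaustive:

```python
def freshness_policy(user_request: str):
    """Map freshness wording to a contents.maxAgeHours value (or None to omit)."""
    req = user_request.lower()
    if any(k in req for k in ("breaking", "right now", "real-time")):
        return 1    # near-real-time: use cache only if <1h old
    if any(k in req for k in ("latest", "today", "current")):
        return 24   # daily-fresh content
    if any(k in req for k in ("historical", "archived", "static")):
        return -1   # cache only, never livecrawl
    return None     # omit the key: default behaviour
```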

Common patterns


Pattern A — “Give me sources for X” (fast + token efficient)


  • `type: auto`, `numResults: 5–10`
  • `contents.highlights: true`
  • Optional: `category` and `includeDomains`

Pattern B — “Do deep research on X” (read a few pages thoroughly)


  • Start with highlights on 10–20 results.
  • Then fetch full `text` for the top 3–5 URLs with a `maxCharacters` cap.
  • Summarise with citations.

Pattern C — “Latest news about X”


  • `category: news`
  • Apply a date window (`startPublishedDate`) if the question is time-bound.
  • Use a freshness setting (often `maxAgeHours: 1–24`).

Pattern D — “Find a company / person page”


  • `category: company` or `people`
  • If using `people`, allowlist LinkedIn domains when needed.
  • IMPORTANT: some filters are unsupported for `company` / `people`; see troubleshooting.

Troubleshooting


401 / 403 (auth)


  • Confirm the `x-api-key` header is present and valid.
  • Confirm you aren’t accidentally using a placeholder like `YOUR-EXA-API-KEY`.

400 (invalid parameters)


  • The `company` and `people` categories support a limited set of filters; unsupported parameters can trigger 400 errors.
  • If in doubt, remove date and text filters first, then re-add them one at a time.

Too much content / token blow-ups


  • Prefer `highlights` over `text`.
  • Cap `text.maxCharacters`.
  • Reduce `numResults`.

Bundled references


  • API + parameter cheat sheet: `references/exa-search-api.md`
  • Best-practice recipes: `references/exa-search-best-practices.md`
  • Quickstart snippets (SDK + curl): `references/exa-search-quickstart.md`