# Tavily Dynamic Search

Search the web, filter results, and extract content so that raw search data never enters your context window. Only your curated `print()` output comes back.

## Why this matters

A typical `tvly search --include-raw-content` returns 8 results × 30-50K chars each = ~300K characters of raw page content. If this enters your context window, you burn tokens reading navigation bars, cookie banners, and boilerplate — and your reasoning quality degrades under the noise. By processing results inside a Python script, only your `print()` output enters context — typically 1-3K characters of pure signal. That's a 100-200x reduction.
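
The arithmetic above can be sanity-checked with a back-of-envelope sketch (mid-range values assumed for page size and curated output):

```python
# Back-of-envelope: raw search payload vs. curated print() output
results = 8
chars_per_page = 40_000           # mid-range of the 30-50K observed per page
raw_chars = results * chars_per_page
curated_chars = 2_000             # mid-range of the 1-3K of curated output

print(raw_chars)                  # total raw characters fetched
print(raw_chars // curated_chars) # reduction factor
```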

## Background: Programmatic Tool Calling (PTC)

This skill replicates the architecture of Anthropic's Programmatic Tool Calling (PTC) for web search. PTC lets the model write code that orchestrates tool calls inside a sandbox — intermediate results stay in the sandbox, and only the final `print()` output reaches the model's context window.

This skill applies the same principle using local Python execution. The Python process is the sandbox. Variables in memory hold the raw data. Only what you `print()` crosses into your context window. You write the filtering logic — you decide what matters for each query.

## Before running any command

If `tvly` is not found on PATH, install it first:

```bash
curl -fsSL https://cli.tavily.com/install.sh | bash && tvly login
```

## Core Rule

NEVER run `tvly` as a bare command. Always process output through Python so you control what enters your context.

**WRONG** — raw results flood your context:

```bash
tvly search "quantum computing 2025" --json
```

**RIGHT** — only your `print()` output enters context:

```bash
tvly search "quantum computing 2025" --json 2>/dev/null | python3 -c "
import json, sys
data = json.load(sys.stdin)
for r in data['results']:
    print(f'[{r[\"score\"]:.2f}] {r[\"title\"]}')
    print(f'  {r[\"url\"]}')
"
```

## JSON Schemas

You need these to write correct filtering code.

### `tvly search --json`

```json
{
  "query": "string",
  "answer": "string | null",
  "results": [
    {
      "url": "string",
      "title": "string",
      "content": "string (snippet, ~500-1500 chars)",
      "score": 0.0-1.0,
      "raw_content": "string | null (full page, only with --include-raw-content)"
    }
  ],
  "response_time": 0.0
}
```

### `tvly extract --json`

```json
{
  "results": [
    {
      "url": "string",
      "title": "string",
      "raw_content": "string (full page markdown)",
      "images": []
    }
  ],
  "failed_results": [],
  "response_time": 0.0
}
```
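
To make the search schema concrete, here is a minimal parsing sketch against a hand-written sample matching the shape above (the sample is illustrative, not real API output). Note that `raw_content` is `null` unless `--include-raw-content` was passed, so filtering code should guard against `None`:

```python
import json

# Hand-written sample matching the tvly search --json schema (illustrative)
sample = json.loads('''
{
  "query": "quantum computing 2025",
  "answer": null,
  "results": [
    {"url": "https://example.com/a", "title": "A", "content": "snippet",
     "score": 0.91, "raw_content": null},
    {"url": "https://example.com/b", "title": "B", "content": "snippet",
     "score": 0.42, "raw_content": "full page text"}
  ],
  "response_time": 0.8
}
''')

for r in sample['results']:
    # Fall back to the snippet when raw_content is null
    body = r.get('raw_content') or r['content']
    if r['score'] >= 0.5:
        print(f"[{r['score']:.2f}] {r['title']} ({len(body)} chars)")
```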

## How to search

You have two building blocks and two ways to run them. Compose these however the query demands — there are no fixed patterns. You decide the approach based on what you need.

### Building blocks

- `tvly search` — returns titles, URLs, snippets, scores. Optionally includes full page content with `--include-raw-content markdown`.
- `tvly extract` — fetches full page content for specific URLs. Use when you found a URL from search and need more detail.

### Execution modes

**Pipe mode** — for simple filters (3-5 lines). Pipe tvly output into `python3 -c`:

```bash
tvly search "query" --json 2>/dev/null | python3 -c "
import json, sys
data = json.load(sys.stdin)
# your filtering code here
"
```

**Heredoc mode** — for anything more complex. Single Bash call, clean multi-line Python, no escaping, no temp files:

```bash
python3 << 'PYEOF'
import json, subprocess
raw = subprocess.check_output(
    ['tvly', 'search', 'query', '--json'],
    stderr=subprocess.DEVNULL
)
data = json.loads(raw)
for r in data['results']:
    print(f"[{r['score']:.2f}] {r['title']}")
    print(f"  {r['url']}")
PYEOF
```

Single-quoted heredocs (`<< 'PYEOF'`) don't interpret anything — no escaping needed. This is the default for most tasks.

**Script mode** — only when you will reuse the same script across multiple turns. Do NOT write one-shot scripts to `/tmp/`. If you run it once, use a heredoc.

Important: save DATA to `/tmp/`, not CODE. Writing `/tmp/tavily_results.json` (data for later turns) = good. Writing `/tmp/my_filter.py` (one-shot code) = wasteful — use a heredoc instead.

## Multi-turn iteration

For complex queries, you often need to explore before you extract — just like PTC, where the model searches, sees titles, decides which results to drill into, then extracts.

The key: save raw results to a file, then process them in separate steps. The file is your persistent state between turns.

### Turn 1: Search and explore

Search and print only titles + scores. Save raw results to disk for later turns:

```bash
python3 << 'PYEOF'
import json, subprocess

raw = subprocess.check_output(
    ['tvly', 'search', 'solid-state battery commercialization 2025',
     '--include-raw-content', 'markdown', '--max-results', '8', '--json'],
    stderr=subprocess.DEVNULL
)
data = json.loads(raw)

# Save raw results — this stays on disk, never enters context
with open('/tmp/tavily_results.json', 'w') as f:
    json.dump(data, f)

# Print only what you need to decide next steps
print(f'{len(data["results"])} results saved to /tmp/tavily_results.json\n')
for i, r in enumerate(data['results']):
    print(f'[{i}] [{r["score"]:.2f}] {r["title"][:90]}')
    print(f'    {r["url"]}')
    print(f'    {r["content"][:150]}')
    print()
PYEOF
```

Context receives: ~800 tokens of titles + snippets. The 300K of raw page content is in `/tmp/tavily_results.json`, untouched.

### Turn 2: Extract based on what you saw

Now you know what's in the results. Write targeted extraction — you decide which results to drill into and what to filter for:

```bash
python3 << 'PYEOF'
import json

data = json.load(open('/tmp/tavily_results.json'))

# You chose these indices based on the titles you saw in turn 1
for i in [0, 2, 5]:
    r = data['results'][i]
    raw = r.get('raw_content', '') or ''
    if not raw:
        continue
    print(f'## {r["title"]}')
    print(f'URL: {r["url"]}\n')

    # You write the filtering logic based on the query
    # This example extracts paragraphs about specific companies
    for para in raw.split('\n\n'):
        para = para.strip()
        if len(para) > 80 and any(kw in para.lower() for kw in
                ['toyota', 'quantumscape', 'samsung', 'commercializ', 'production']):
            print(para)
            print()

    print('---\n')
PYEOF
```

Context receives: ~600 tokens of targeted content. You made the decision about what to keep.

### Turn 3 (optional): Fetch more detail

If you need more from a specific source:

```bash
python3 << 'PYEOF'
import json, subprocess

# Fetch a specific URL you identified
raw = subprocess.check_output(
    ['tvly', 'extract', 'https://example.com/article', '--json'],
    stderr=subprocess.DEVNULL
)
data = json.loads(raw)
page = data['results'][0]
content = page.get('raw_content', '')

# Save for potential further processing
with open('/tmp/page_detail.txt', 'w') as f:
    f.write(content)

# Print only the section you care about
for line in content.split('\n'):
    if any(kw in line.lower() for kw in ['timeline', '2025', '2026', 'mass production']):
        print(line.strip())
PYEOF
```

## When to use multi-turn vs single-turn

- **Single turn** (pipe mode or one script): when you know upfront what you're looking for. Specific factual queries, known keywords.
- **Multi-turn** (save + explore + extract): when you need to see what's available before deciding what to extract. Open-ended research, complex topics, queries where you don't know the right keywords yet.

## Examples

### Simple factual lookup (single turn, pipe mode)

```bash
tvly search "Python 3.13 release date" --max-results 5 --json 2>/dev/null | python3 -c "
import json, sys
data = json.load(sys.stdin)
for r in data['results'][:3]:
    print(f'{r[\"title\"]}')
    print(f'{r[\"content\"][:300]}')
    print()
"
```

### Financial data extraction (single turn, heredoc)

```bash
python3 << 'PYEOF'
import json, subprocess

raw = subprocess.check_output(
    ['tvly', 'search', 'NVIDIA Q4 2025 earnings revenue',
     '--include-raw-content', 'markdown', '--max-results', '5',
     '--json'],
    stderr=subprocess.DEVNULL
)
data = json.loads(raw)

for r in data['results']:
    raw_content = r.get('raw_content', '') or ''
    # For financial queries, look for lines with numbers
    financial_lines = [
        line.strip() for line in raw_content.split('\n')
        if any(kw in line.lower() for kw in
               ['revenue', 'eps', 'earnings', 'margin', 'guidance', 'billion'])
        and any(c.isdigit() for c in line)
        and len(line.strip()) > 30
    ]
    if financial_lines:
        print(f'## {r["title"]}')
        print(f'URL: {r["url"]}')
        for line in financial_lines[:15]:
            print(f'  {line}')
        print()
PYEOF
```

### Multi-source research (multi-turn)

**Turn 1** — broad search + triage:

```bash
python3 << 'PYEOF'
import json, subprocess

# Search from multiple angles
queries = [
    ('broad', 'EU AI Act implementation timeline 2025'),
    ('specific', 'EU AI Act high-risk AI systems obligations'),
]

all_results = []
for label, query in queries:
    raw = subprocess.check_output(
        ['tvly', 'search', query, '--max-results', '8', '--json'],
        stderr=subprocess.DEVNULL
    )
    data = json.loads(raw)
    for r in data['results']:
        r['_query'] = label
    all_results.extend(data['results'])

# Deduplicate by URL
seen = set()
unique = []
for r in all_results:
    if r['url'] not in seen:
        seen.add(r['url'])
        unique.append(r)

# Save all results
with open('/tmp/eu_ai_results.json', 'w') as f:
    json.dump(unique, f)

# Print triage
unique.sort(key=lambda r: r['score'], reverse=True)
print(f'{len(unique)} unique results from {len(queries)} queries\n')
for i, r in enumerate(unique[:10]):
    print(f'[{i}] [{r["score"]:.2f}] ({r["_query"]}) {r["title"][:80]}')
    print(f'    {r["url"]}')
    print(f'    {r["content"][:120]}')
    print()
PYEOF
```

**Turn 2** — you see the triage, pick the best sources, and extract:

```bash
python3 << 'PYEOF'
import json, subprocess

results = json.load(open('/tmp/eu_ai_results.json'))

# Fetch full content for the top 3 (you chose these based on turn 1)
for r in [results[0], results[2], results[4]]:
    try:
        raw = subprocess.check_output(
            ['tvly', 'extract', r['url'], '--json'],
            stderr=subprocess.DEVNULL, timeout=30
        )
        page = json.loads(raw)
        if not page.get('results'):
            continue
        content = page['results'][0].get('raw_content', '')

        # Your filtering logic — tailored to this query
        print(f'## {r["title"]}')
        print(f'URL: {r["url"]}\n')

        for para in content.split('\n\n'):
            para = para.strip()
            if len(para) > 100 and any(kw in para.lower() for kw in
                    ['high-risk', 'prohibited', 'deadline', 'obligation',
                     'compliance', 'penalty', 'fine', 'article']):
                print(para)
                print()

        print('---\n')
    except Exception:
        continue
PYEOF
```

## Following leads across turns

Sometimes turn 2 reveals new URLs or topics to chase. You can keep iterating:

```bash
python3 << 'PYEOF'
import json, subprocess

# Read the page you saved earlier
with open('/tmp/page_detail.txt') as f:
    content = f.read()

# You noticed a reference to a specific regulation document —
# search for it specifically
raw = subprocess.check_output(
    ['tvly', 'search', 'EU AI Act Annex III high-risk list',
     '--include-domains', 'eur-lex.europa.eu', '--max-results', '3', '--json'],
    stderr=subprocess.DEVNULL
)
data = json.loads(raw)

for r in data['results']:
    print(f'## {r["title"]}')
    print(f'URL: {r["url"]}')
    print(r['content'])
    print()
PYEOF
```

Each turn, you save data to `/tmp/`, decide what to explore next, and write new filtering code as heredocs. The raw data accumulates on disk; your context stays lean.

## Writing your filtering code

The Python you write IS the filtering logic. There are no fixed templates — you write code that makes sense for the specific query. Here are principles, not rules:

- **Triage first.** Inspect titles and scores before fetching full pages. Don't extract everything blindly.
- **Be specific.** A financial query should filter for numbers and financial terms. A technical query should look for code blocks and specifications. A news query should look for dates and quotes. Match your filtering to the query.
- **Structural filtering helps.** Skip lines shorter than ~50-80 chars (usually nav elements). Skip common boilerplate phrases. Keep headings and their following paragraphs. But these are starting points — adapt based on what you see.
- **Print structured output.** Format your output so it's easy to reason over:

  ```python
  print(f'## {title}')
  print(f'URL: {url}')
  print(relevant_content)
  print()
  ```

- **Handle errors.** Pages fail, URLs 404, extractions time out. Use try/except and skip failures:

  ```python
  try:
      raw = subprocess.check_output(['tvly', 'extract', url, '--json'],
                                    stderr=subprocess.DEVNULL, timeout=30)
  except Exception:
      continue
  ```

- **Token budget awareness.** Your `print()` output is what enters your context. Target 150-600 tokens per source. If you're printing 5000+ chars from a single page, you're probably not filtering enough. But if a source has a critical data table, it's fine to keep more.

## Options

All standard `tvly search` options work:

| Option | Description |
|---|---|
| `--max-results` | Number of results (default: 5, max: 20) |
| `--depth` | `ultra-fast`, `fast`, `basic` (default), `advanced` |
| `--time-range` | `day`, `week`, `month`, `year` |
| `--include-domains` | Comma-separated whitelist |
| `--exclude-domains` | Comma-separated blacklist |
| `--include-raw-content` | Full page content (`markdown` or `text`) |
| `--country` | Boost results from country |
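
These options compose. As a sketch, a recency-bounded, domain-restricted search piped through a filter might look like this (the query and domain list are illustrative, not from the Tavily docs):

```shell
tvly search "CUDA release notes" \
  --depth advanced --time-range month \
  --include-domains developer.nvidia.com,docs.nvidia.com \
  --max-results 10 --json 2>/dev/null | python3 -c "
import json, sys
data = json.load(sys.stdin)
for r in data['results']:
    print(f'[{r[\"score\"]:.2f}] {r[\"title\"]}')
"
```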

## Fallback: jq

When `python3` is unavailable, use `jq` for basic filtering:

```bash
tvly search "query" --json 2>/dev/null | jq '[.results[] | select(.score > 0.5) | {title, url, content}]'
```

jq can't do multi-step search-then-extract or complex filtering. Use it only for simple lookups.