semanticscholar-skill
# Semantic Scholar Search Workflow
Search academic papers via the Semantic Scholar API using a structured 4-phase workflow.
**Critical rule:** NEVER make multiple sequential Bash calls for API requests. Always write ONE Python script that runs all searches, then execute it once. All rate limiting is handled automatically inside `s2.py`.
## Phase 1: Understand & Plan
Parse the user's intent and choose a search strategy:
### Decision Tree
| User wants... | Strategy | Function |
|---|---|---|
| Broad topic exploration | Relevance search | `search_relevance()` |
| Precise technical terms, exact phrases | Bulk search with boolean operators | `search_bulk()` |
| Specific passages or methods | Snippet search | — |
| Known paper by title | Title match | — |
| Known paper by DOI/PMID/ArXiv | Direct lookup | `get_paper()` |
| Papers citing a known work | Citation traversal | `get_citations()` |
| Related to one paper | Single-seed recommendations | `find_similar()` |
| Related to multiple papers | Multi-seed recommendations | `recommend()` |
| Find a researcher | Author search | `search_authors()` |
| Researcher's profile | Author details | — |
| Researcher's publications | Author papers | `get_author_papers()` |
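The decision tree above can be mirrored as a small lookup table — a hypothetical routing sketch, where the intent keys are invented labels and the function names come from this skill's API reference:

```python
# Hypothetical routing sketch for the decision tree above. The intent keys
# are invented labels; the function names come from this skill's reference.
STRATEGY_BY_INTENT = {
    "broad_topic": "search_relevance",     # relevance search
    "exact_terms": "search_bulk",          # boolean bulk search
    "known_id": "get_paper",               # DOI/PMID/ArXiv lookup
    "citing_papers": "get_citations",      # citation traversal
    "similar_to_one": "find_similar",      # single-seed recommendations
    "similar_to_many": "recommend",        # multi-seed recommendations
    "find_author": "search_authors",       # author search
    "author_papers": "get_author_papers",  # researcher's publications
}

def pick_strategy(intent):
    """Fall back to broad relevance search for unrecognized intents."""
    return STRATEGY_BY_INTENT.get(intent, "search_relevance")
```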
### Query Construction Rules
- Ambiguous terms (e.g., "stem cells" could mean mesenchymal or stem-like T cells): Use `build_bool_query()` with exact phrases and exclusions
  - Example: `build_bool_query(phrases=["stem-like T cells"], required=["CD4", "TCF7"], excluded=["mesenchymal", "hematopoietic stem cell"])`
- Multi-context queries (e.g., "topic X in cancer AND autoimmunity"): Plan separate searches, deduplicate with `deduplicate()`
- Broad topics: Use `search_relevance()` with filters (year, venue, fieldsOfStudy, minCitationCount)
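As a rough illustration of what a builder like `build_bool_query()` produces — a minimal sketch, assuming quoted phrases for exact match, `+` for required terms, `-` for exclusions, and `|` for alternatives; the actual `s2.py` helper may format terms differently:

```python
def build_bool_query(phrases=(), required=(), excluded=(), or_terms=()):
    """Assemble a bulk-search boolean query string (illustrative sketch)."""
    parts = [f'"{p}"' for p in phrases]                 # exact phrases
    parts += [f"+{t}" for t in required]                # must include
    parts += [f"-{t}" for t in excluded]                # must exclude
    if or_terms:
        parts.append("(" + " | ".join(or_terms) + ")")  # any of these
    return " ".join(parts)

q = build_bool_query(phrases=["stem-like T cells"],
                     required=["CD4", "TCF7"],
                     excluded=["mesenchymal"])
# q == '"stem-like T cells" +CD4 +TCF7 -mesenchymal'
```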
### Plan Filters
| Filter | Use when |
|---|---|
| `year` | Recent work only |
| `publication_date` | Precise date range (YYYY-MM-DD) |
| `fields_of_study` | Restrict to domain |
| `min_citations` | Only established papers |
| `pub_types="Review,MetaAnalysis"` | Find reviews/meta-analyses |
| `pub_types="ClinicalTrial"` | Clinical trials only |
| `open_access` | Only open access papers |
**Checkpoint:** Before proceeding, verify: (1) search strategy matches user intent, (2) filters are appropriate, (3) query is specific enough to avoid irrelevant results.
## Phase 2: Execute Search
Write ONE Python script that begins with the standard prelude below, then runs all searches:
```python
# --- Standard prelude (use in every script) ---
import sys, os, glob
_candidates = [
    os.path.expanduser("~/.claude/skills/semanticscholar-skill"),
    os.path.expanduser("~/.openclaw/skills/semanticscholar-skill"),
    *glob.glob(os.path.expanduser("~/.claude/plugins/**/semanticscholar-skill"), recursive=True),
    *glob.glob(os.path.expanduser("~/.codex/skills/semanticscholar-skill")),
    ".",
]
SKILL_DIR = next((p for p in _candidates if os.path.isfile(os.path.join(p, "s2.py"))), None)
if SKILL_DIR is None:
    raise RuntimeError("Cannot locate semanticscholar-skill (s2.py not found)")
sys.path.insert(0, SKILL_DIR)
from s2 import *
# --- end prelude ---

# Build precise query
q = build_bool_query(
    phrases=["stem-like T cells"],
    required=["CD4", "IBD"],
    excluded=["mesenchymal"]
)
papers = search_bulk(q, max_results=30, year="2018-", fields_of_study="Medicine")
papers = deduplicate(papers)
print(format_results(papers, "Stem-like CD4 T cells in IBD"))
```
Save to `/tmp/s2_search.py`, then run with `python3 /tmp/s2_search.py` in a single Bash call. Rate limiting, retries, and backoff are automatic inside `s2.py`.
**Checkpoint:** Verify the script ran successfully (no exceptions) and returned results. If 0 results, broaden the query or relax filters before presenting.

## Worked Examples
Each example below assumes the standard prelude from Phase 2 is at the top of the script.
### Example 1: Author workflow — "Find papers by Yann LeCun on self-supervised learning"

```python
authors = search_authors("Yann LeCun", max_results=5)
print(format_authors(authors))

# Use the first match's ID to get their papers
author_id = authors[0]["authorId"]
papers = get_author_papers(author_id, max_results=50)

# Filter locally for topic
ssl_papers = [p for p in papers if "self-supervised" in (p.get("title") or "").lower()]
print(format_results(ssl_papers, "Yann LeCun - Self-Supervised Learning"))
```
### Example 2: Citation chain with intent — "Who cited the Transformer paper and how did they use it?"

```python
paper = get_paper("DOI:10.48550/arXiv.1706.03762")
print(f"Title: {paper['title']}, Citations: {paper['citationCount']}")

# Citation envelopes carry contextsWithIntent — keep them, don't flatten.
citing = get_citations(paper["paperId"], max_results=50)
citing.sort(key=lambda c: (c.get("citingPaper") or {}).get("citationCount", 0), reverse=True)
print(format_citations(citing, max_items=10))  # renders intent labels + context snippet
```
### Example 3: Multi-seed recommendations with BibTeX export — "Find papers like these two but not about NLP"

```python
recs = recommend(
    positive_ids=["DOI:10.1038/nature14539", "ARXIV:2010.11929"],
    negative_ids=["ARXIV:1706.03762"],
    limit=20
)
print(format_results(recs, "Vision papers like Deep Learning & ViT, excluding NLP"))

# Export BibTeX for top results
bib_data = batch_papers([r["paperId"] for r in recs[:10]], fields="title,citationStyles")
print(export_bibtex(bib_data))
```
## Phase 3: Summarize & Present
- Use `format_results()` for consistent output (summary table + top-10 details)
- If the user's language is Chinese, present summaries in Chinese
- Always note the total results count and the search strategy used
- Highlight the most relevant papers based on the user's specific question
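For orientation, here is a toy sketch of the kind of summary table `format_results()` renders — the real formatting lives in `s2.py`; this helper name and its column choices are illustrative, though the fields shown are the skill's defaults:

```python
def render_summary_table(papers, title):
    """Render a markdown summary table for a list of paper dicts (sketch only)."""
    lines = [
        f"## {title}",
        "",
        "| # | Title | Year | Citations | Venue |",
        "|---|---|---|---|---|",
    ]
    for i, p in enumerate(papers, 1):
        lines.append(
            f"| {i} | {p.get('title', '?')} | {p.get('year', '?')} "
            f"| {p.get('citationCount', 0)} | {p.get('venue', '') or '—'} |"
        )
    return "\n".join(lines)
```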
## Phase 4: User Interaction Loop
After presenting results, always offer these options:

- **Translate** — titles/summaries to Chinese (or another language)
- **Details** — full abstract for specific paper numbers
- **Refine** — narrow or expand the search with different terms/filters
- **Similar** — find papers similar to a specific result (`find_similar()`)
- **Citations** — who cited a specific paper and how (`get_citations()` + `format_citations()` for intent labels)
- **Export** — save results via `export_bibtex()`, `export_markdown()`, or `export_json()`
- **Done** — end the search session

Loop until the user says done. Each follow-up uses the same single-script pattern.
## API Quick Reference
### Helper Module (`s2.py`)

Use the standard prelude from Phase 2 at the top of every script. Then call any of the functions below — the module's docstring (`help(s2)`, or read `s2.py`) lists each by phase with one-line summaries.

### Paper Search Functions
| Function | Purpose | Max Results |
|---|---|---|
| `search_relevance()` | Simple broad search | 1,000 |
| `search_bulk()` | Boolean precise search | 10,000,000 |
| — | Full-text passage search | 1,000 |
| — | Exact title match | 1 |
| — | Query-completion suggestions | — |
| `get_paper()` | Single paper details | — |
| `get_citations()` | Who cited this | 10,000 |
| — | What this cites | 10,000 |
| `find_similar()` | Single-seed recommendations | 500 |
| `recommend()` | Multi-seed recommendations | 500 |
| `batch_papers()` | Batch lookup (≤500) | — |
### Author Functions
| Function | Purpose | Max Results |
|---|---|---|
| `search_authors()` | Find researchers by name | 1,000 |
| — | Author profile (affiliations, h-index) | — |
| `get_author_papers()` | Author's publications | 10,000 |
| — | Paper's author list | 1,000 |
| — | Batch author lookup (≤1,000) | — |
### Filter Parameters (kwargs)
snake_case kwargs are translated to S2 camelCase params automatically (`fields_of_study` → `fieldsOfStudy`, `min_citations` → `minCitationCount`, `publication_date` → `publicationDateOrYear`, `pub_types` → `publicationTypes`, `open_access` → `openAccessPdf`). Supported kwargs: `year`, `publication_date`, `venue`, `fields_of_study`, `min_citations`, `pub_types`, `open_access`. Use snake_case here.

- `year`: `"2020-"`, `"-2019"`, `"2016-2020"`
- `publication_date`: YYYY-MM-DD range, open-ended OK, e.g. `"2024-01-01:2024-06-30"`
- `pub_types`: `Review`, `JournalArticle`, `Conference`, `ClinicalTrial`, `MetaAnalysis`, `Dataset`, `Book`, `CaseReport`, `Editorial`, `LettersAndComments`, `News`, `Study`, `BookSection`
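The snake_case-to-camelCase translation can be pictured as a simple rename pass — a sketch under the assumption that `s2.py` does roughly this; the helper name `to_s2_params` is invented, not an actual `s2.py` function:

```python
# Sketch of the kwarg renaming described above; `to_s2_params` is an
# invented name, not an actual s2.py helper.
_KWARG_MAP = {
    "fields_of_study": "fieldsOfStudy",
    "min_citations": "minCitationCount",
    "publication_date": "publicationDateOrYear",
    "pub_types": "publicationTypes",
    "open_access": "openAccessPdf",
}

def to_s2_params(**kwargs):
    """Rename snake_case kwargs to S2's camelCase query parameters."""
    return {_KWARG_MAP.get(key, key): value for key, value in kwargs.items()}

params = to_s2_params(year="2020-", min_citations=50, fields_of_study="Medicine")
# params == {"year": "2020-", "minCitationCount": 50, "fieldsOfStudy": "Medicine"}
```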
### Boolean Query Syntax (bulk search only)
| Syntax | Example | Meaning |
|---|---|---|
| `"..."` | `"stem-like T cells"` | Exact phrase |
| `+term` | `+CD4` | Must include |
| `-term` | `-mesenchymal` | Exclude |
| `\|` | `CNN \| RNN` | OR |
| `*` | `immun*` | Prefix wildcard |
| `(...)` | `(CNN \| RNN) +attention` | Grouping |

Use `build_bool_query(phrases, required, excluded, or_terms)` to construct queries safely.

### Output Functions
| Function | Purpose |
|---|---|
| — | Markdown summary table |
| — | Detailed entries with TLDR/abstract |
| `format_citations()` | Citation envelopes with intent labels + context snippet |
| `format_results()` | Combined: summary + table + details |
| `format_authors()` | Author table (name, affiliations, h-index) |
| `export_bibtex()` | BibTeX entries (requires the `citationStyles` field) |
| `export_markdown()` | Full markdown report saved to file |
| `export_json()` | JSON export saved to file |
| `deduplicate()` | Remove duplicates by paperId |
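`deduplicate()` is described as removing duplicates by paperId; a minimal sketch of that behavior (the `s2.py` version may differ in detail):

```python
def deduplicate(papers):
    """Keep the first occurrence of each paperId; pass through items without one."""
    seen = set()
    unique = []
    for paper in papers:
        pid = paper.get("paperId")
        if pid in seen:
            continue  # already kept an earlier copy of this paper
        if pid is not None:
            seen.add(pid)
        unique.append(paper)
    return unique
```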
### Supported ID Formats
`DOI:10.1038/...`, `ARXIV:2106.15928`, `PMID:19872477`, `PMCID:PMC2323569`, `CorpusId:215416146`, `ACL:2020.acl-main.447`, `DBLP:conf/acl/...`, `MAG:3015453090`, `URL:https://...`

### Paper Fields
Default: `title`, `year`, `citationCount`, `authors`, `venue`, `externalIds`, `tldr`

Additional: `abstract`, `references`, `citations`, `openAccessPdf`, `publicationDate`, `publicationVenue`, `fieldsOfStudy`, `s2FieldsOfStudy`, `journal`, `isOpenAccess`, `referenceCount`, `influentialCitationCount`, `citationStyles`, `embedding`, `textAvailability`

Author fields: `name`, `affiliations`, `paperCount`, `citationCount`, `hIndex`, `homepage`, `externalIds`, `papers`

### Rate Limiting
Handled automatically by `s2.py`: a 1.1s gap between requests, exponential backoff (2s→4s→8s→16s→32s, max 60s) on 429/504 errors, up to 5 retries.
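The retry behavior described above can be sketched as follows — an illustrative reimplementation, not `s2.py`'s actual code; `send` stands in for any zero-argument HTTP call returning an object with a `status_code`:

```python
import time

def request_with_backoff(send, max_retries=5, base_delay=2.0, max_delay=60.0):
    """Retry on 429/504 with exponential backoff: 2s, 4s, 8s, 16s, 32s (capped at 60s)."""
    for attempt in range(max_retries + 1):
        response = send()
        if response.status_code not in (429, 504):
            return response  # success, or a non-retryable error
        if attempt < max_retries:
            time.sleep(min(base_delay * (2 ** attempt), max_delay))
    raise RuntimeError(f"Still rate-limited after {max_retries} retries")
```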
### Troubleshooting
| Error | Cause | Fix |
|---|---|---|
| — | Missing or invalid API key | Verify the API key |
| 429 after 5 retries | Sustained rate limit exceeded | Wait 60s, reduce request volume |
| — | Skill directory not on path | Verify the skill is installed at one of the prelude's candidate paths |
| 0 results returned | Query too specific or filters too narrow | Broaden the query, remove filters, try `search_relevance()` |
| — | Endpoint returned an error object | Check the error message in the response |
| — | Not all papers have TLDR | Fall back to `abstract` |