semantic-scholar

Semantic Scholar Paper Search


Search topic or paper ID: $ARGUMENTS

Role & Positioning


This skill is the published-venue counterpart to `/arxiv`:

| Skill | Source | Best for |
|-------|--------|----------|
| `/arxiv` | arXiv API | Latest preprints, cutting-edge unrefereed work |
| `/semantic-scholar` | Semantic Scholar API | Published journal/conference papers (IEEE, ACM, Springer, etc.) with citation counts, venue info, TLDR |

Do NOT duplicate arXiv's job. If a result contains an `externalIds.ArXiv` field, the paper is also on arXiv; note this but do not re-fetch from arXiv.

Constants


  • MAX_RESULTS = 10 — Default number of search results.
  • FETCH_SCRIPT = `tools/semantic_scholar_fetch.py`, relative to the project root. Fall back to inline Python if not found.
  • DEFAULT_FILTERS — For general research queries, apply these by default to reduce noise:
    • --fields-of-study "Computer Science,Engineering"
    • --publication-types JournalArticle,Conference
Overrides (append to arguments):
  • /semantic-scholar "topic" - max: 20
    — return up to 20 results
  • /semantic-scholar "topic" - type: journal
    — only journal articles
  • /semantic-scholar "topic" - type: conference
    — only conference papers
  • /semantic-scholar "topic" - min-citations: 50
    — only highly-cited papers
  • /semantic-scholar "topic" - year: 2022-
    — papers from 2022 onward
  • /semantic-scholar "topic" - fields: all
    — remove default field-of-study filter
  • /semantic-scholar "topic" - sort: citations
    — bulk search sorted by citation count
  • /semantic-scholar "DOI:10.1109/..."
    — fetch a single paper by DOI

Workflow


Step 1: Parse Arguments


Parse `$ARGUMENTS` for directives:
  • Query or ID: main search term, or a paper identifier:
    • DOI: `10.1109/TWC.2024.1234567`
    • Semantic Scholar ID: `f9314fd99be5f2b1b3efcfab87197d578160d553`
    • ArXiv: `ARXIV:2006.10685`
    • Corpus: `CorpusId:219792180`
  • `- max: N`: override MAX_RESULTS
  • `- type: journal|conference|review|all`: map to `--publication-types`
  • `- min-citations: N`: map to `--min-citations`
  • `- year: RANGE`: map to `--year` (e.g. `2022-`, `2020-2024`)
  • `- fields: FIELDS`: override `--fields-of-study` (use `all` to remove the filter)
  • `- sort: citations|date`: use `search-bulk` with `--sort citationCount:desc` or `publicationDate:desc`
If the argument matches a DOI pattern (`10.XXXX/...`), a Semantic Scholar ID (40-char hex), or a prefixed ID (`ARXIV:...`, `CorpusId:...`), skip search and go directly to Step 3.
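The parsing rules above can be sketched in Python as follows. This is a minimal illustration, not the skill's actual parser: the function name `parse_arguments`, the regular expressions, and the returned dictionary layout are all assumptions, and directive values containing spaces are only handled when quoted.

```python
import re

# Assumed patterns for the identifier forms listed in Step 1.
DOI_RE = re.compile(r"^(?:DOI:)?10\.\d{4,9}/\S+$", re.IGNORECASE)
S2_ID_RE = re.compile(r"^[0-9a-f]{40}$", re.IGNORECASE)
PREFIXED_RE = re.compile(r"^(?:ARXIV|CorpusId):\S+$", re.IGNORECASE)

# A directive looks like ` - key: value`; the value is one token or a quoted string.
DIRECTIVE_RE = re.compile(
    r'\s+-\s+(max|type|min-citations|year|fields|sort):\s*("[^"]*"|\S+)'
)


def parse_arguments(raw):
    """Split $ARGUMENTS into a query (or paper ID) and a dict of directives."""
    directives = {key: value.strip('"') for key, value in DIRECTIVE_RE.findall(raw)}
    # Everything before the first directive is the query or the identifier.
    head = DIRECTIVE_RE.split(raw, maxsplit=1)[0].strip().strip('"')
    is_paper_id = any(p.match(head) for p in (DOI_RE, S2_ID_RE, PREFIXED_RE))
    return {"query": head, "is_paper_id": is_paper_id, "directives": directives}
```

When `is_paper_id` is true, the caller would skip the search and go straight to Step 3; otherwise the directives are mapped onto the script flags listed above.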

Step 2: Search Papers


Locate the fetch script:
```bash
# Look in the project's tools/ directory first, then in the installed skill directory.
SCRIPT=$(find tools/ -name "semantic_scholar_fetch.py" 2>/dev/null | head -1)
[ -z "$SCRIPT" ] && SCRIPT=$(find ~/.claude/skills/semantic-scholar/ -name "semantic_scholar_fetch.py" 2>/dev/null | head -1)
```
Standard search (default, relevance-ranked):
```bash
# QUERY and MAX_RESULTS are placeholders for the parsed query and result limit.
python3 "$SCRIPT" search "QUERY" --max MAX_RESULTS \
  --fields-of-study "Computer Science,Engineering" \
  --publication-types JournalArticle,Conference
```
Bulk search (when `- sort:` is specified, or MAX_RESULTS > 100):
```bash
python3 "$SCRIPT" search-bulk "QUERY" --max MAX_RESULTS \
  --sort citationCount:desc \
  --fields-of-study "Computer Science" \
  --year "2020-"
```
If `semantic_scholar_fetch.py` is not found, fall back to inline Python using `urllib` against `https://api.semanticscholar.org/graph/v1/paper/search`.
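A minimal sketch of that inline fallback is shown below, using only the standard library. The helper names (`build_search_url`, `search`), the default filter values, and the `FIELDS` list are assumptions chosen to mirror the defaults in this document; trim the field list to what the output needs. The API key is read from `SEMANTIC_SCHOLAR_API_KEY` when set.

```python
import json
import os
import urllib.parse
import urllib.request

S2_SEARCH_URL = "https://api.semanticscholar.org/graph/v1/paper/search"
# Assumed field list covering what Steps 4-6 display.
FIELDS = ",".join([
    "title", "venue", "publicationVenue", "year", "citationCount", "authors",
    "externalIds", "openAccessPdf", "tldr", "abstract", "fieldsOfStudy",
])


def build_search_url(query, limit=10,
                     fields_of_study="Computer Science,Engineering",
                     publication_types="JournalArticle,Conference",
                     year=None, min_citations=None):
    """Build the relevance-search URL; filters set to None are left out."""
    params = {"query": query, "limit": limit, "fields": FIELDS}
    if fields_of_study:
        params["fieldsOfStudy"] = fields_of_study
    if publication_types:
        params["publicationTypes"] = publication_types
    if year:
        params["year"] = year
    if min_citations is not None:
        params["minCitationCount"] = min_citations
    return S2_SEARCH_URL + "?" + urllib.parse.urlencode(params)


def search(query, **filters):
    """Run one search request and return the list of paper records."""
    request = urllib.request.Request(build_search_url(query, **filters))
    api_key = os.environ.get("SEMANTIC_SCHOLAR_API_KEY")
    if api_key:
        request.add_header("x-api-key", api_key)
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response).get("data", [])
```

Passing `fields_of_study=None` corresponds to the `- fields: all` override.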
Recommended filter combos (from testing):

| Goal | Flags |
|------|-------|
| High-quality journal papers | `--publication-types JournalArticle --min-citations 10` |
| CS/EE papers, recent | `--fields-of-study "Computer Science,Engineering" --year "2022-"` |
| Foundational / high-impact | `search-bulk --sort citationCount:desc --fields-of-study "Computer Science"` |
| Conference papers only | `--publication-types Conference` |

Note: `--venue` requires exact venue names (e.g. "IEEE Transactions on Signal Processing"), not partial matches like "IEEE". Avoid using `--venue` in automated flows; prefer `--publication-types` + `--fields-of-study`.

Step 3: Fetch Details for a Specific Paper


When a single paper ID is requested:
```bash
python3 "$SCRIPT" paper "PAPER_ID"
```
Where PAPER_ID can be:
  • DOI: `10.1109/TSP.2021.3071210`
  • ArXiv: `ARXIV:2006.10685`
  • CorpusId: `CorpusId:219792180`
  • S2 ID: `f9314fd99be5f2b1b3efcfab87197d578160d553`

Step 4: De-duplicate Against arXiv


For each result, check `externalIds.ArXiv`:
  • If present → paper is also on arXiv. Note this in output but do NOT re-fetch via `/arxiv`.
  • If absent → paper is venue-only (e.g. IEEE without preprint). This is the unique value of this skill.
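The check above amounts to partitioning the results on one field. A minimal sketch, assuming each result is a dictionary shaped like a Semantic Scholar paper record; the function name `split_by_arxiv` is hypothetical:

```python
def split_by_arxiv(papers):
    """Partition results into (also_on_arxiv, venue_only) without any extra fetches."""
    also_on_arxiv, venue_only = [], []
    for paper in papers:
        # externalIds can be missing or null, so normalise it to a dict first.
        external_ids = paper.get("externalIds") or {}
        if external_ids.get("ArXiv"):
            also_on_arxiv.append(paper)
        else:
            venue_only.append(paper)
    return also_on_arxiv, venue_only
```

The length of the second list is the "venue-only" count reported in the final summary.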

Step 5: Present Results


Present results as a table:
```text
| # | Title | Venue | Year | Citations | Authors | Type |
|---|-------|-------|------|-----------|---------|------|
| 1 | Deep Learning Enabled... | IEEE Trans. Signal Process. | 2021 | 1364 | Xie et al. | Journal |
```
For each paper, also show:
  • DOI link: `https://doi.org/DOI` (for IEEE/ACM papers, this is the canonical link)
  • Open Access PDF: if `openAccessPdf.url` is non-empty, show it
  • TLDR: if available, show the one-line summary
  • Also on arXiv: if `externalIds.ArXiv` exists, note the arXiv ID
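One way to render a result into the table row and the one-line summary is sketched below. The helper names and the "last name plus et al." author shortening are assumptions for illustration; the abstract fallback follows the rule stated under Key Rules for papers without a TLDR.

```python
def short_authors(paper):
    """Return 'Lastname et al.' for several authors, or the single author's name."""
    authors = paper.get("authors") or []
    if not authors:
        return ""
    if len(authors) == 1:
        return authors[0]["name"]
    return authors[0]["name"].split()[-1] + " et al."


def table_row(index, paper):
    """Format one paper as a row of the Step 5 table."""
    venue_type = ((paper.get("publicationVenue") or {}).get("type") or "").capitalize()
    cells = [
        index,
        (paper.get("title") or "").replace("|", "\\|"),  # keep the table intact
        paper.get("venue") or "",
        paper.get("year") or "",
        paper.get("citationCount", 0),
        short_authors(paper),
        venue_type,
    ]
    return "| " + " | ".join(str(cell) for cell in cells) + " |"


def tldr_or_first_sentence(paper):
    """Prefer tldr.text; otherwise fall back to the abstract's first sentence."""
    tldr = (paper.get("tldr") or {}).get("text")
    if tldr:
        return tldr
    abstract = (paper.get("abstract") or "").strip()
    if not abstract:
        return ""
    return abstract.split(". ")[0].rstrip(".") + "."
```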

Step 6: Detailed Summary


For each paper (or top 5 if many results):
```markdown
## [Title]

- Venue: [venue name] ([publicationVenue.type]: journal/conference)
- Year: [year] | Citations: [citationCount]
- Authors: [full author list]
- DOI: [doi link]
- Fields: [fieldsOfStudy]
- TLDR: [tldr.text if available]
- Abstract: [abstract]
- Open Access: [openAccessPdf.url or "Not available"]
- Also on arXiv: [ArXiv ID if exists, else "No"]
```

Step 7: Update Research Wiki (if active)


Required when `research-wiki/` exists in the project; skip silently otherwise. Ingest the papers presented to the user. For results with an `externalIds.ArXiv` field, use `--arxiv-id`; for venue-only papers (no arXiv mirror, common for IEEE/ACM), fall back to manual metadata:
if [ -d research-wiki/ ]:
    for each paper in results:
        if paper.externalIds.ArXiv:
            python3 tools/research_wiki.py ingest_paper research-wiki/ \
                --arxiv-id "<ArXiv>"
        else:
            python3 tools/research_wiki.py ingest_paper research-wiki/ \
                --title "<title>" --authors "<authors joined by , >" \
                --year <year> --venue "<venue>" \
                [--external-id-doi "<externalIds.DOI>"]
The helper handles slug / dedup / page / index / log; do not handwrite `papers/<slug>.md`. See `shared-references/integration-contract.md`. Backfill with `/research-wiki sync --arxiv-ids <id1>,<id2>,...` for arXiv-available papers.
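The ingest pseudocode above could be driven from Python along these lines. This is a sketch under assumptions: the function names are hypothetical, the paper records are assumed to follow the Semantic Scholar response shape, and only the flags shown in this step are used.

```python
import os
import subprocess


def ingest_command(paper, wiki_dir="research-wiki/"):
    """Build the argv for one ingest_paper call, following the branch in Step 7."""
    base = ["python3", "tools/research_wiki.py", "ingest_paper", wiki_dir]
    external_ids = paper.get("externalIds") or {}
    if external_ids.get("ArXiv"):
        return base + ["--arxiv-id", external_ids["ArXiv"]]
    # Venue-only paper: pass the metadata by hand.
    authors = ", ".join(author["name"] for author in paper.get("authors") or [])
    command = base + [
        "--title", paper["title"],
        "--authors", authors,
        "--year", str(paper["year"]),
        "--venue", paper.get("venue") or "",
    ]
    if external_ids.get("DOI"):
        command += ["--external-id-doi", external_ids["DOI"]]
    return command


def ingest_all(papers, wiki_dir="research-wiki/"):
    """Ingest every paper; return 0 without doing anything if the wiki is absent."""
    if not os.path.isdir(wiki_dir):
        return 0
    for paper in papers:
        subprocess.run(ingest_command(paper, wiki_dir), check=True)
    return len(papers)
```

Building the argument list separately from running it keeps the branch logic easy to inspect before anything is written to the wiki.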

Step 8: Final Output


Summarize what was done:
  • Found N published papers for "query"
  • Filters applied: [publication types, fields, year range, etc.]
  • N papers are venue-only (not on arXiv)
  • Wiki-ingested N papers (if `research-wiki/` was present)
Suggest follow-up skills:
```text
/arxiv "topic"           - search arXiv preprints (complements this search)
/research-lit "topic"    - multi-source review: Zotero + local PDFs + arXiv + S2
/novelty-check "idea"    - verify novelty against literature
```

Key Rules


  • Default to filtered search: Always apply `--fields-of-study` and `--publication-types` unless the user says `- fields: all`. Without filters, S2 returns cross-discipline noise (linguistics, psychology, etc.).
  • Citation count is gold: S2's citation data is its main advantage over arXiv. Always show `citationCount` prominently and use it to rank/prioritize results.
  • Venue metadata matters: Show `venue` and `publicationVenue.type` (journal vs conference); this helps users assess paper quality.
  • DOI is the canonical ID for published papers: Always show DOI links for IEEE/ACM/Springer papers.
  • Rate limiting: The S2 API without a key is heavily rate-limited (~1 req/s, strict cooldown). If HTTP 429 occurs, wait and retry. Recommend that users set the `SEMANTIC_SCHOLAR_API_KEY` env var for higher limits (free at https://www.semanticscholar.org/product/api#api-key-form).
  • TLDR may be null: Papers from some publishers (notably IEEE) come back without a TLDR. Fall back to showing the first sentence of the abstract.
  • openAccessPdf may be empty: Many IEEE papers are closed access. Always provide the DOI link as fallback.
  • If the S2 API is unreachable, suggest using `/arxiv` or `/research-lit "topic" - sources: web` as fallback.
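The "wait and retry" rule for HTTP 429 can be sketched as a small wrapper with exponential backoff. The function name, attempt count, and delays are assumptions, not values taken from the fetch script; `fetch` is any zero-argument callable that performs one request.

```python
import time
import urllib.error


def fetch_with_retry(fetch, max_attempts=4, base_delay=2.0, sleep=time.sleep):
    """Call fetch(); on HTTP 429 wait base_delay * 2**attempt seconds and retry."""
    for attempt in range(max_attempts):
        try:
            return fetch()
        except urllib.error.HTTPError as error:
            # Re-raise anything other than 429, and the last failed attempt.
            if error.code != 429 or attempt == max_attempts - 1:
                raise
            sleep(base_delay * 2 ** attempt)
```

The `sleep` parameter exists so the wait can be replaced in tests; in normal use the default `time.sleep` applies.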