semantic-scholar
Semantic Scholar Paper Search
Search topic or paper ID: $ARGUMENTS
Role & Positioning
This skill is the published venue counterpart to `/arxiv`:

| Skill | Source | Best for |
|---|---|---|
| `/arxiv` | arXiv API | Latest preprints, cutting-edge unrefereed work |
| `/semantic-scholar` | Semantic Scholar API | Published journal/conference papers (IEEE, ACM, Springer, etc.) with citation counts, venue info, TLDR |

Do NOT duplicate arXiv's job. If results contain an `externalIds.ArXiv` field, the paper is also on arXiv. Note this, but do not re-fetch from arXiv.
Constants
- MAX_RESULTS = 10: default number of search results.
- FETCH_SCRIPT: `tools/semantic_scholar_fetch.py`, relative to the project root. Fall back to inline Python if not found.
- DEFAULT_FILTERS: for general research queries, apply these by default to reduce noise:
  `--fields-of-study "Computer Science,Engineering" --publication-types JournalArticle,Conference`

Overrides (append to arguments):
- `/semantic-scholar "topic" - max: 20`: return up to 20 results
- `/semantic-scholar "topic" - type: journal`: only journal articles
- `/semantic-scholar "topic" - type: conference`: only conference papers
- `/semantic-scholar "topic" - min-citations: 50`: only highly-cited papers
- `/semantic-scholar "topic" - year: 2022-`: papers from 2022 onward
- `/semantic-scholar "topic" - fields: all`: remove default field-of-study filter
- `/semantic-scholar "topic" - sort: citations`: bulk search sorted by citation count
- `/semantic-scholar "DOI:10.1109/..."`: fetch a single paper by DOI
Workflow
Step 1: Parse Arguments
Parse `$ARGUMENTS` for directives:
- Query or ID: main search term, or a paper identifier:
  - DOI: `10.1109/TWC.2024.1234567`
  - Semantic Scholar ID: `f9314fd99be5f2b1b3efcfab87197d578160d553`
  - ArXiv: `ARXIV:2006.10685`
  - Corpus: `CorpusId:219792180`
- `- max: N`: override MAX_RESULTS
- `- type: journal|conference|review|all`: map to `--publication-types`
- `- min-citations: N`: map to `--min-citations`
- `- year: RANGE`: map to `--year` (e.g. `2022-`, `2020-2024`)
- `- fields: FIELDS`: override `--fields-of-study` (use `all` to remove filter)
- `- sort: citations|date`: use `search-bulk` with `--sort citationCount:desc` or `publicationDate:desc`

If the argument matches a DOI pattern (`10.XXXX/...`), a Semantic Scholar ID (40-char hex), or a prefixed ID (`ARXIV:...`, `CorpusId:...`), skip search and go directly to Step 3.
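The parsing rules above can be sketched in Python. This is a minimal illustration, not the skill's actual parser: the function name `parse_arguments`, the regular expressions, and the returned dictionary layout are all assumptions made for the example.

```python
import re

MAX_RESULTS = 10  # default from the Constants section

DIRECTIVE_KEYS = ("max", "type", "min-citations", "year", "fields", "sort")
_KEYS = "|".join(re.escape(key) for key in DIRECTIVE_KEYS)

# A directive looks like ` - key: value`; the value runs until the next
# directive or the end of the string. (Assumed syntax, based on the examples.)
_DIRECTIVE = re.compile(rf"\s-\s*({_KEYS}):\s*(.*?)(?=\s-\s*(?:{_KEYS}):|$)")

# Identifier shapes that should skip search and go straight to Step 3.
_ID_PATTERNS = (
    re.compile(r"^(?:DOI:)?10\.\d{4,9}/\S+$", re.IGNORECASE),  # DOI
    re.compile(r"^[0-9a-f]{40}$", re.IGNORECASE),               # 40-char hex S2 ID
    re.compile(r"^(?:ARXIV|CorpusId):\S+$", re.IGNORECASE),     # prefixed IDs
)


def parse_arguments(raw):
    """Split raw arguments into a query (or paper ID) and its directives."""
    raw = raw.strip()
    directives = {key: value.strip() for key, value in _DIRECTIVE.findall(raw)}
    first = _DIRECTIVE.search(raw)
    query = (raw[: first.start()] if first else raw).strip().strip('"')
    return {
        "query": query,
        "is_paper_id": any(pattern.match(query) for pattern in _ID_PATTERNS),
        "max_results": int(directives.get("max", MAX_RESULTS)),
        "directives": directives,
    }
```

For example, `parse_arguments('"semantic communication" - max: 20 - year: 2022-')` yields the query `semantic communication`, `max_results` of 20, and a `year` directive of `2022-`, while a bare DOI sets `is_paper_id` so the caller can jump to Step 3.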
Step 2: Search Papers
Locate the fetch script:
```bash
SCRIPT=$(find tools/ -name "semantic_scholar_fetch.py" 2>/dev/null | head -1)
[ -z "$SCRIPT" ] && SCRIPT=$(find ~/.claude/skills/semantic-scholar/ -name "semantic_scholar_fetch.py" 2>/dev/null | head -1)
```
Standard search (default, relevance-ranked):
```bash
python3 "$SCRIPT" search "QUERY" --max MAX_RESULTS \
  --fields-of-study "Computer Science,Engineering" \
  --publication-types JournalArticle,Conference
```
Bulk search (when `- sort:` is specified, or MAX_RESULTS > 100):
```bash
python3 "$SCRIPT" search-bulk "QUERY" --max MAX_RESULTS \
  --sort citationCount:desc \
  --fields-of-study "Computer Science" \
  --year "2020-"
```
If `semantic_scholar_fetch.py` is not found, fall back to inline Python using `urllib` against `https://api.semanticscholar.org/graph/v1/paper/search`.

Recommended filter combos (from testing):
| Goal | Flags |
|---|---|
| High-quality journal papers | |
| CS/EE papers, recent | |
| Foundational / high-impact | |
| Conference papers only | |

Note: `--venue` requires exact venue names (e.g. "IEEE Transactions on Signal Processing"), not partial matches like "IEEE". Avoid using `--venue` in automated flows; prefer `--publication-types` + `--fields-of-study`.
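When the fetch script is missing, the inline fallback might look like the sketch below. It is an assumption-laden illustration: the helper names `build_search_url` and `search` are hypothetical, and the query parameter names (`fieldsOfStudy`, `publicationTypes`, `year`, `minCitationCount`) and the `x-api-key` header reflect my reading of the Semantic Scholar Graph API, so check them against the current API documentation before relying on them.

```python
import json
import os
import urllib.parse
import urllib.request

SEARCH_URL = "https://api.semanticscholar.org/graph/v1/paper/search"
# Response fields this skill displays; adjust to taste.
FIELDS = ("title,venue,year,citationCount,authors,externalIds,"
          "publicationTypes,publicationVenue,openAccessPdf,tldr,abstract")


def build_search_url(query, limit=10,
                     fields_of_study="Computer Science,Engineering",
                     publication_types="JournalArticle,Conference",
                     year=None, min_citations=None):
    """Build the search URL; filters set to None are left out."""
    params = {"query": query, "limit": limit, "fields": FIELDS}
    if fields_of_study:
        params["fieldsOfStudy"] = fields_of_study
    if publication_types:
        params["publicationTypes"] = publication_types
    if year:
        params["year"] = year
    if min_citations is not None:
        params["minCitationCount"] = min_citations
    return SEARCH_URL + "?" + urllib.parse.urlencode(params)


def search(query, **filters):
    """Run one search request and return the list of paper records."""
    request = urllib.request.Request(build_search_url(query, **filters))
    api_key = os.environ.get("SEMANTIC_SCHOLAR_API_KEY")
    if api_key:  # optional key for higher rate limits
        request.add_header("x-api-key", api_key)
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response).get("data", [])
```

Keeping URL construction separate from the network call makes the filter logic easy to check without hitting the rate-limited API.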
Step 3: Fetch Details for a Specific Paper
When a single paper ID is requested:
```bash
python3 "$SCRIPT" paper "PAPER_ID"
```
Where PAPER_ID can be:
- DOI: `10.1109/TSP.2021.3071210`
- ArXiv: `ARXIV:2006.10685`
- CorpusId: `CorpusId:219792180`
- S2 ID: `f9314fd99be5f2b1b3efcfab87197d578160d553`
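If the fetch script is unavailable, a single-paper lookup could be sketched as follows. The function names are hypothetical, and two details are assumptions rather than facts from this document: that the details endpoint lives at `/graph/v1/paper/{id}`, and that a bare DOI must be sent with a `DOI:` prefix. The fetch script may normalize identifiers differently.

```python
import re
import urllib.parse

PAPER_URL = "https://api.semanticscholar.org/graph/v1/paper/"


def normalize_paper_id(paper_id):
    """Add the DOI: prefix to a bare DOI; leave other identifiers untouched."""
    paper_id = paper_id.strip()
    if re.match(r"^10\.\d{4,9}/\S+$", paper_id):
        return "DOI:" + paper_id
    return paper_id


def build_paper_url(paper_id, fields="title,venue,year,citationCount,externalIds"):
    """Build the details URL for one paper (assumed endpoint layout)."""
    quoted = urllib.parse.quote(normalize_paper_id(paper_id), safe=":/")
    return PAPER_URL + quoted + "?" + urllib.parse.urlencode({"fields": fields})
```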
Step 4: De-duplicate Against arXiv
For each result, check `externalIds.ArXiv`:
- If present → paper is also on arXiv. Note this in output but do NOT re-fetch via `/arxiv`.
- If absent → paper is venue-only (e.g. IEEE without preprint). This is the unique value of this skill.
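The check above amounts to partitioning the result list. A minimal sketch, assuming each result is a dictionary shaped like the API's paper record (the function name `split_by_arxiv` is hypothetical):

```python
def split_by_arxiv(papers):
    """Partition results into (also on arXiv, venue-only) lists."""
    on_arxiv, venue_only = [], []
    for paper in papers:
        # externalIds can be missing or null, so default to an empty dict.
        external_ids = paper.get("externalIds") or {}
        if external_ids.get("ArXiv"):
            on_arxiv.append(paper)
        else:
            venue_only.append(paper)
    return on_arxiv, venue_only
```

The length of the second list gives the venue-only count reported in the final summary.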
Step 5: Present Results
Present results as a table:
```text
| # | Title | Venue | Year | Citations | Authors | Type |
|---|-------|-------|------|-----------|---------|------|
| 1 | Deep Learning Enabled... | IEEE Trans. Signal Process. | 2021 | 1364 | Xie et al. | Journal |
```
For each paper, also show:
- DOI link: `https://doi.org/DOI` (for IEEE/ACM papers, this is the canonical link)
- Open Access PDF: if `openAccessPdf.url` is non-empty, show it
- TLDR: if available, show the one-line summary
- Also on arXiv: if `externalIds.ArXiv` exists, note the arXiv ID
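One way to render a table row from a paper record is sketched below. The helper names, the "Other" label, and the "first author's last name plus et al." convention are assumptions for illustration; the record keys are the ones this skill already refers to.

```python
def short_authors(authors):
    """Return 'Lastname et al.' for several authors, or the single name."""
    names = [a["name"] for a in authors or [] if a.get("name")]
    if not names:
        return "Unknown"
    if len(names) == 1:
        return names[0]
    return names[0].split()[-1] + " et al."


def table_row(index, paper):
    """Format one paper as a markdown table row matching the header above."""
    types = paper.get("publicationTypes") or []
    if "JournalArticle" in types:
        kind = "Journal"
    elif "Conference" in types:
        kind = "Conference"
    else:
        kind = "Other"
    cells = [
        str(index),
        paper.get("title") or "",
        paper.get("venue") or "",
        str(paper.get("year") or ""),
        str(paper.get("citationCount") or 0),
        short_authors(paper.get("authors")),
        kind,
    ]
    # Escape pipes so a title cannot break the table layout.
    return "| " + " | ".join(cell.replace("|", "\\|") for cell in cells) + " |"
```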
Step 6: Detailed Summary
For each paper (or top 5 if many results):
```markdown
[Title]
- Venue: [venue name] ([publicationVenue.type]: journal/conference)
- Year: [year] | Citations: [citationCount]
- Authors: [full author list]
- DOI: [doi link]
- Fields: [fieldsOfStudy]
- TLDR: [tldr.text if available]
- Abstract: [abstract]
- Open Access: [openAccessPdf.url or "Not available"]
- Also on arXiv: [ArXiv ID if exists, else "No"]
```
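The TLDR and Open Access fields both need fallbacks when the API returns null or empty values. A small sketch of those two fallbacks, using hypothetical helper names and a deliberately naive sentence splitter:

```python
import re


def one_line_summary(paper):
    """Prefer tldr.text; otherwise use the first sentence of the abstract."""
    tldr = (paper.get("tldr") or {}).get("text")
    if tldr:
        return tldr
    abstract = (paper.get("abstract") or "").strip()
    if not abstract:
        return "Not available"
    # Naive split on sentence-ending punctuation followed by whitespace.
    return re.split(r"(?<=[.!?])\s+", abstract, maxsplit=1)[0]


def access_link(paper):
    """Prefer the open-access PDF; otherwise fall back to the DOI link."""
    pdf_url = (paper.get("openAccessPdf") or {}).get("url")
    if pdf_url:
        return pdf_url
    doi = (paper.get("externalIds") or {}).get("DOI")
    return f"https://doi.org/{doi}" if doi else "Not available"
```

The sentence splitter will cut early on abbreviations such as "e.g.", which is acceptable for a one-line preview but not for anything stricter.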
Step 7: Update Research Wiki (if active)
Required when `research-wiki/` exists in the project; skip silently otherwise. Ingest the papers presented to the user. For results with an `externalIds.ArXiv` field, use `--arxiv-id`; for venue-only papers (no arXiv mirror, common for IEEE/ACM), fall back to manual metadata:
```
if [ -d research-wiki/ ]:
    for each paper in results:
        if paper.externalIds.ArXiv:
            python3 tools/research_wiki.py ingest_paper research-wiki/ \
                --arxiv-id "<ArXiv>"
        else:
            python3 tools/research_wiki.py ingest_paper research-wiki/ \
                --title "<title>" --authors "<authors joined by , >" \
                --year <year> --venue "<venue>" \
                [--external-id-doi "<externalIds.DOI>"]
```
The helper handles slug / dedup / page / index / log; do not handwrite `papers/<slug>.md`. See `shared-references/integration-contract.md`. Backfill with `/research-wiki sync --arxiv-ids <id1>,<id2>,...` for arXiv-available papers.
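The branching in the ingest step can be expressed as a function that builds the command line for one paper. This is a sketch: `ingest_command` and `ingest_all` are hypothetical names, and only the script path, subcommand, and flags shown in this step are taken from the document.

```python
import os
import subprocess


def ingest_command(paper, wiki_dir="research-wiki/"):
    """Return the argv list for ingesting one paper into the wiki."""
    base = ["python3", "tools/research_wiki.py", "ingest_paper", wiki_dir]
    external_ids = paper.get("externalIds") or {}
    if external_ids.get("ArXiv"):
        return base + ["--arxiv-id", external_ids["ArXiv"]]
    # Venue-only paper: pass metadata manually.
    authors = ", ".join(a["name"] for a in paper.get("authors") or [] if a.get("name"))
    command = base + [
        "--title", paper.get("title") or "",
        "--authors", authors,
        "--year", str(paper.get("year") or ""),
        "--venue", paper.get("venue") or "",
    ]
    if external_ids.get("DOI"):
        command += ["--external-id-doi", external_ids["DOI"]]
    return command


def ingest_all(papers, wiki_dir="research-wiki/"):
    """Ingest every paper, or skip silently when the wiki directory is absent."""
    if not os.path.isdir(wiki_dir):
        return 0
    for paper in papers:
        subprocess.run(ingest_command(paper, wiki_dir), check=True)
    return len(papers)
```

Passing an argv list to `subprocess.run` avoids shell quoting problems with titles that contain quotes or other special characters.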
Step 8: Final Output
Summarize what was done:
- `Found N published papers for "query"`
- `Filters applied: [publication types, fields, year range, etc.]`
- `N papers are venue-only (not on arXiv)`
- `Wiki-ingested N papers` (if `research-wiki/` was present)

Suggest follow-up skills:
```text
/arxiv "topic"          - search arXiv preprints (complements this search)
/research-lit "topic"   - multi-source review: Zotero + local PDFs + arXiv + S2
/novelty-check "idea"   - verify novelty against literature
```
Key Rules
- Default to filtered search: Always apply `--fields-of-study` and `--publication-types` unless the user says `- fields: all`. Without filters, S2 returns cross-discipline noise (linguistics, psychology, etc.).
- Citation count is gold: S2's citation data is its main advantage over arXiv. Always show `citationCount` prominently and use it to rank/prioritize results.
- Venue metadata matters: Show `venue` and `publicationVenue.type` (journal vs conference); this helps users assess paper quality.
- DOI is the canonical ID for published papers: Always show DOI links for IEEE/ACM/Springer papers.
- Rate limiting: S2 API without key is heavily rate-limited (~1 req/s, strict cooldown). If HTTP 429 occurs, wait and retry. Recommend users set the `SEMANTIC_SCHOLAR_API_KEY` env var for higher limits (free at https://www.semanticscholar.org/product/api#api-key-form).
- TLDR may be null: Some publishers (notably IEEE) elide the TLDR field. Fall back to showing the first sentence of the abstract.
- openAccessPdf may be empty: Many IEEE papers are closed access. Always provide the DOI link as fallback.
- If the S2 API is unreachable, suggest using `/arxiv` or `/research-lit "topic" - sources: web` as fallback.
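The "wait and retry" rule for HTTP 429 could be implemented with exponential backoff, as in the sketch below. The function name, the retry count, and the 2-second base delay are assumptions chosen for illustration, not values prescribed by the API; the `opener` and `sleep` parameters exist only so the logic can be exercised without network access.

```python
import time
import urllib.error
import urllib.request


def fetch_with_retry(url, opener=urllib.request.urlopen, retries=3,
                     base_delay=2.0, sleep=time.sleep):
    """Fetch a URL, backing off and retrying only on HTTP 429."""
    for attempt in range(retries + 1):
        try:
            with opener(url) as response:
                return response.read()
        except urllib.error.HTTPError as error:
            # Re-raise anything that is not rate limiting, or the last failure.
            if error.code != 429 or attempt == retries:
                raise
            sleep(base_delay * 2 ** attempt)  # 2s, 4s, 8s, ...
```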