asta-skill


Asta MCP — Academic Paper Search


Asta is Ai2's Scientific Corpus Tool, exposing the Semantic Scholar academic graph over MCP (streamable HTTP transport). This skill tells agents which Asta tool to call for which intent, and how to compose them into useful workflows.

Prerequisite Check


Before invoking any tool, verify the Asta MCP server is registered in the host agent. Tool names will be prefixed by the MCP server name chosen at install time (commonly `asta__<tool>` or `mcp__asta__<tool>`). If no Asta tools are visible, direct the user to the Installation section below.
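The prerequisite check can be sketched as a small lookup over the tool names the host agent exposes. This is a minimal illustration, not part of Asta: `resolve_tool_name` and the `visible_tools` list are hypothetical, and only the two common prefixes named above are tried.

```python
COMMON_PREFIXES = ("asta__", "mcp__asta__")


def resolve_tool_name(visible_tools, tool, prefixes=COMMON_PREFIXES):
    """Return the prefixed name under which `tool` is visible, or None."""
    for prefix in prefixes:
        candidate = prefix + tool
        if candidate in visible_tools:
            return candidate
    # Nothing matched: the caller should point the user to Installation.
    return None


# Hypothetical list of tool names reported by the host agent.
visible = ["mcp__asta__get_paper", "mcp__asta__get_citations"]
print(resolve_tool_name(visible, "get_paper"))
print(resolve_tool_name(visible, "snippet_search"))
```

If the server was registered under a different name, pass that prefix through the `prefixes` argument rather than guessing.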

Tool Map — Intent → Asta Tool


| User intent | Asta tool | Notes |
|---|---|---|
| Broad topic search | `search_papers_by_relevance` | Supports venue + date filters |
| Known paper title | `search_paper_by_title` | Optional venue restriction |
| Known DOI / arXiv / PMID / CorpusId / MAG / ACL / SHA / URL | `get_paper` | Single-paper lookup |
| Multiple known IDs at once | `get_paper_batch` | Batch lookup; prefer over N sequential `get_paper` calls |
| Who cited paper X | `get_citations` | Citation traversal with filters, paginated |
| Find author by name | `search_authors_by_name` | Returns profile info |
| An author's publications | `get_author_papers` | Pass author id from previous call |
| Find passages mentioning X | `snippet_search` | ~500-word excerpts from paper bodies |

Search/citation tools accept `publication_date_range` (format `YYYY-MM-DD:YYYY-MM-DD`; shorthand such as `"2021:"`, `":2015-01"`, or `"2015:2020"` is also accepted) and `venues` (comma-separated) filters, plus `fields` for field selection. Pass them whenever the user's intent constrains scope (e.g., "recent", "since 2022", "at NeurIPS").
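As a rough sketch of how the scope filters might be assembled before a search call: the helper names below are hypothetical, and the `keyword` argument name is assumed from the workflow patterns in this document rather than from a verified tool schema.

```python
def date_range(start=None, end=None):
    """Build a publication_date_range string such as "2021:" or "2015:2020"."""
    if not start and not end:
        raise ValueError("give at least one bound")
    return f"{start or ''}:{end or ''}"


def search_arguments(keyword, start=None, end=None, venues=(), fields=None):
    """Collect only the filters the user's intent actually constrains."""
    args = {"keyword": keyword}  # argument name is an assumption
    if start or end:
        args["publication_date_range"] = date_range(start, end)
    if venues:
        args["venues"] = ",".join(venues)  # comma-separated, as described above
    if fields:
        args["fields"] = fields
    return args


# "diffusion models since 2022 at NeurIPS"
print(search_arguments("diffusion models", start="2022", venues=["NeurIPS"]))
```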

⚠️ `fields` parameter: avoid context blowups


`get_paper` and `get_paper_batch` accept a `fields` string. Do not request `citations` via `fields`, and request `references` only in the narrow case described below: a single highly cited paper (e.g. Attention Is All You Need) returns 200k+ characters and will overflow the agent's context window. Use the dedicated `get_citations` tool for forward citations (it paginates). Asta does not provide a dedicated `get_references` tool. To retrieve a paper's reference list, use `get_paper` with `fields=references`, and only for papers you know have a small reference list (typically < 100 entries).

Safe default `fields` for `get_paper`: `title,year,authors,venue,tldr,url,abstract`

Add `journal`, `publicationDate`, `fieldsOfStudy`, `isOpenAccess` only when needed.
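One way to enforce the guidance above is to build the `fields` string through a small guard. This is a sketch only: `build_fields` is a hypothetical helper, and the explicit `allow_references` opt-in is a design choice made here, not something Asta requires.

```python
SAFE_DEFAULT_FIELDS = "title,year,authors,venue,tldr,url,abstract"


def build_fields(extra=(), allow_references=False):
    """Extend the safe default while refusing fields that can blow up context."""
    names = SAFE_DEFAULT_FIELDS.split(",")
    for name in extra:
        if name == "citations":
            raise ValueError("use get_citations instead; it paginates")
        if name == "references" and not allow_references:
            raise ValueError("references is only safe for small reference lists")
        if name not in names:
            names.append(name)
    return ",".join(names)


print(build_fields(["publicationDate", "isOpenAccess"]))
```

Requiring an explicit opt-in for `references` keeps the risky request visible at the call site.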

Retrieving DOI / external IDs (undocumented but supported)


Asta's official `fields` list does not include `externalIds`, but the field is transparently passed through to the underlying Semantic Scholar API and works in practice. Add `externalIds` to `fields` to retrieve `DOI`, `PubMed`, `PubMedCentral`, `ArXiv`, `MAG`, `DBLP`, `CorpusId`. Caveats:
  • Not all papers have a DOI; pure arXiv preprints often return only `ArXiv` + `CorpusId`.
  • `get_paper("DOI:...")` lookup is not 100% reliable; some valid DOIs return `not found`. Prefer searching by title first, then reading `externalIds` off the result.
  • Since this is undocumented, treat it as best-effort and degrade gracefully if a future Asta release drops it.
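The caveats above suggest reading `externalIds` defensively. The sketch below assumes a paper is a plain dict whose `externalIds` value may be missing or null; `best_identifier` is a hypothetical helper and the sample values are made up for illustration.

```python
def best_identifier(paper, preference=("DOI", "ArXiv", "CorpusId")):
    """Return (kind, value) for the first available external ID, else None."""
    external = paper.get("externalIds") or {}  # tolerate a missing or null field
    for kind in preference:
        value = external.get(kind)
        if value:
            return kind, str(value)
    return None


# Made-up records: a preprint without a DOI, and one without externalIds.
preprint = {"title": "Example preprint", "externalIds": {"ArXiv": "0000.00000", "CorpusId": 1}}
bare = {"title": "Example without IDs"}
print(best_identifier(preprint))
print(best_identifier(bare))
```

Returning `None` rather than raising lets the caller degrade gracefully if a future release stops passing the field through.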

Workflow Patterns


Pattern 1 — Topic Discovery


  1. `search_papers_by_relevance(keyword, publication_date_range="<current_year-5>:", venues=?)` → initial hits (compute the lower bound from today's date, e.g., in 2026 pass `publication_date_range="2021:"`; adjust or drop the filter if the user asks for older work)
  2. Rank/present top N by citationCount + recency
  3. Offer follow-ups: `get_citations` on the most influential, or `snippet_search` for specific claims
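Steps 1 and 2 can be sketched as two small helpers. The ranking rule here (citation count first, publication year as tie-breaker) is just one possible reading of "citationCount + recency", and it assumes each result is a dict with `citationCount` and `year` keys that may be null.

```python
from datetime import date


def recent_range(today=None, years=5):
    """Lower-bounded date filter covering roughly the last `years` years."""
    today = today or date.today()
    return f"{today.year - years}:"


def rank_papers(papers, top_n=10):
    """Sort by citation count, breaking ties by newer publication year."""
    def key(paper):
        return (paper.get("citationCount") or 0, paper.get("year") or 0)
    return sorted(papers, key=key, reverse=True)[:top_n]


print(recent_range(date(2026, 3, 1)))
```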

Pattern 2 — Seed-Paper Expansion


  1. `get_paper(DOI|arXiv|...)` → verify seed
  2. `get_citations(paperId)` → forward expansion
  3. Optionally `search_papers_by_relevance` with seed title terms for sideways discovery
  4. Deduplicate by paperId before presenting
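Step 4 can be sketched as an order-preserving merge. It assumes the citation and search results have already been flattened into lists of dicts carrying a `paperId` key; the actual response shape may differ, and `dedupe_by_paper_id` is a hypothetical helper.

```python
def dedupe_by_paper_id(*result_lists):
    """Merge result lists, keeping the first occurrence of each paperId."""
    seen = set()
    merged = []
    for papers in result_lists:
        for paper in papers:
            paper_id = paper.get("paperId")
            if paper_id is not None:
                if paper_id in seen:
                    continue
                seen.add(paper_id)
            # Entries without a paperId cannot be compared, so they are kept.
            merged.append(paper)
    return merged


citing = [{"paperId": "p1"}, {"paperId": "p2"}]
sideways = [{"paperId": "p2"}, {"paperId": "p3"}]
print([p["paperId"] for p in dedupe_by_paper_id(citing, sideways)])
```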

Pattern 3 — Author Deep-Dive


  1. `search_authors_by_name(name)` → pick correct profile (disambiguate by affiliation)
  2. `get_author_papers(authorId)` → full publication list
  3. Filter client-side by topic keywords or date
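The client-side filter in step 3 might look like the following. It matches keywords against titles only and assumes each paper is a dict with `title` and `year` keys; both choices are simplifications for illustration.

```python
def filter_papers(papers, keywords=(), since_year=None):
    """Keep papers whose title contains any keyword and that are recent enough."""
    lowered = [keyword.lower() for keyword in keywords]
    kept = []
    for paper in papers:
        title = (paper.get("title") or "").lower()
        year = paper.get("year")
        if lowered and not any(keyword in title for keyword in lowered):
            continue
        # A paper with an unknown year cannot satisfy a date constraint.
        if since_year is not None and (year is None or year < since_year):
            continue
        kept.append(paper)
    return kept


papers = [
    {"title": "Graph Neural Networks at Scale", "year": 2023},
    {"title": "A Survey of Parsing", "year": 2019},
]
print(filter_papers(papers, keywords=["graph"], since_year=2022))
```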

Pattern 4 — Evidence Retrieval


  1. `snippet_search(claim_query)` → find passages making/supporting a claim
  2. For each hit, optionally `get_paper(id)` for full metadata

Output & Interaction Rules


  • Always report total count and which tool was used.
  • Present top 10 as a table (title, year, venue, citations), then details for the most relevant.
  • If the user writes in Chinese, present summaries in Chinese; keep titles in original language.
  • After results, offer: Details / Refine / Citations / Snippet / Export / Done.
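The first two rules can be sketched as a formatter that reports the total and tool, then renders at most ten rows. The `render_results` helper, the header wording, and the "not reported" placeholder are choices made for this example, and papers are assumed to be dicts with `title`, `year`, `venue`, and `citationCount` keys.

```python
def render_results(papers, tool, total):
    """Report the total and tool used, then the top 10 as a text table."""
    lines = [
        f"{total} results via {tool}",
        "",
        "| Title | Year | Venue | Citations |",
        "|---|---|---|---|",
    ]
    for paper in papers[:10]:
        cells = [paper.get(k) for k in ("title", "year", "venue", "citationCount")]
        # Show missing values explicitly instead of inventing them.
        shown = ["not reported" if c is None or c == "" else str(c) for c in cells]
        lines.append("| " + " | ".join(shown) + " |")
    return "\n".join(lines)


sample = [{"title": "Example Paper", "year": 2024, "venue": None, "citationCount": 0}]
print(render_results(sample, "search_papers_by_relevance", total=1))
```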

Critical Rules


  • Prefer batched intent over ping-pong. If the user's question needs two independent lookups, issue them as parallel MCP tool calls in one turn, not sequentially.
  • Never guess IDs. If a user gives a fuzzy title, use `search_paper_by_title` before `get_paper`.
  • Respect rate limits. An API key raises the limits but does not remove them, so do not expand citation graphs beyond what the user asked for.
  • Do not fabricate fields. If Asta returns a null `abstract` or `venue`, say so rather than inventing one.
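For an agent harness that drives MCP calls from Python, the "parallel, not sequential" rule could be sketched with `asyncio.gather`. Everything here is illustrative: `call_tool` stands in for whatever async client function the host provides, and `fake_call_tool` is a stub so the example runs on its own.

```python
import asyncio


async def run_parallel(call_tool, requests):
    """Issue independent tool calls concurrently; results keep request order."""
    return await asyncio.gather(*(call_tool(name, args) for name, args in requests))


async def fake_call_tool(name, args):
    """Stub standing in for a real MCP client call (hypothetical)."""
    await asyncio.sleep(0)
    return {"tool": name, "args": args}


requests = [
    ("search_paper_by_title", {"title": "Example Title"}),
    ("search_authors_by_name", {"name": "Example Author"}),
]
results = asyncio.run(run_parallel(fake_call_tool, requests))
print([r["tool"] for r in results])
```

This only applies to lookups that do not depend on each other; a `get_paper` call that needs an ID from a title search still has to wait for that search.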

Handling Asta responses


| Situation | What to do |
|---|---|
| Empty `abstract` | Not all corpus papers have full text; use `snippet_search`, or fall back to title + TLDR |
| Author disambiguation uncertain | Inspect affiliations in `search_authors_by_name` results before calling `get_author_papers` |
| `429 Too Many Requests` | Back off; batch with `get_paper_batch` instead of sequential `get_paper` calls |
| Need DOI / PubMed ID / arXiv ID | Add `externalIds` to `fields` (see "Retrieving DOI" above); fall back to `ArXiv` ID when `DOI` is absent |
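The 429 row can be sketched as exponential backoff plus ID chunking for `get_paper_batch`. The `RateLimited` exception, the retry count, the delays, and the chunk size are all assumptions for illustration; how a 429 actually surfaces depends on the MCP client in use.

```python
import time


class RateLimited(Exception):
    """Hypothetical error raised by the caller's client on a 429 response."""


def chunked(ids, size):
    """Split IDs into batches suitable for one get_paper_batch call each."""
    return [ids[i:i + size] for i in range(0, len(ids), size)]


def with_backoff(fn, retries=4, base_delay=1.0, sleep=time.sleep):
    """Call fn(), doubling the wait after each rate-limit error."""
    for attempt in range(retries):
        try:
            return fn()
        except RateLimited:
            if attempt == retries - 1:
                raise  # give up and surface the error to the user
            sleep(base_delay * 2 ** attempt)


print(chunked(["id1", "id2", "id3", "id4", "id5"], 2))
```

Passing `sleep` as a parameter keeps the helper easy to exercise without real waiting.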