gs-export

Google Scholar Export to Zotero


Export Google Scholar paper citation data via BibTeX extraction and push to Zotero desktop.


Arguments


$ARGUMENTS contains one or more data-cids (space-separated), e.g.:

- `TFS2GgoGiNUJ` (single paper)
- `TFS2GgoGiNUJ abc123XYZ def456UVW` (batch export)
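As a minimal sketch of how the argument string might be turned into work items, the hypothetical helpers below split `$ARGUMENTS` on whitespace and build the cite-dialog URL that step 1a requests. The function names `parseCids` and `citeUrl` and the de-duplication step are assumptions for illustration, not part of the skill itself.

```javascript
// Hypothetical helper: split the raw $ARGUMENTS string into data-cids.
function parseCids(args) {
  const cids = args.trim().split(/\s+/).filter(Boolean);
  // Assumption: drop repeated cids while keeping first-seen order.
  return [...new Set(cids)];
}

// Build the cite-dialog URL requested in step 1a for one data-cid.
function citeUrl(cid) {
  return `https://scholar.google.com/scholar?q=info:${encodeURIComponent(cid)}:scholar.google.com/&output=cite`;
}

const cids = parseCids("TFS2GgoGiNUJ abc123XYZ def456UVW");
console.log(cids.length > 1 ? "batch export" : "single paper", cids.map(citeUrl));
```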


Steps


Step 1: Get BibTeX for each paper


For each data-cid, perform 3 tool calls to bypass CORS:


1a. Fetch cite dialog to get BibTeX link (evaluate_script)


```javascript
async () => {
  const cid = "DATA_CID_HERE";
  const resp = await fetch(
    `https://scholar.google.com/scholar?q=info:${cid}:scholar.google.com/&output=cite`,
    { credentials: 'include' }
  );
  const html = await resp.text();
  const doc = new DOMParser().parseFromString(html, 'text/html');

  // Extract export links
  const links = Array.from(doc.querySelectorAll('#gs_citi a')).map(a => ({
    format: a.textContent.trim(),
    url: a.href
  }));

  // Extract citation format texts
  const citations = Array.from(doc.querySelectorAll('#gs_citt tr')).map(tr => {
    const cells = tr.querySelectorAll('td');
    return {
      style: cells[0]?.textContent?.trim() || '',
      text: cells[1]?.textContent?.trim() || ''
    };
  });

  const bibtexLink = links.find(l => l.format === 'BibTeX');
  return { cid, bibtexLink: bibtexLink?.url || '', links, citations };
}
```


1b. Navigate to BibTeX URL (navigate_page)


Use `mcp__chrome-devtools__navigate_page`:

- url: the `bibtexLink` URL from step 1a (on `scholar.googleusercontent.com`)

A top-level navigation is not subject to the CORS checks that block fetch() to googleusercontent.com from the Google Scholar page, so the BibTeX text loads normally.


1c. Read BibTeX content (evaluate_script)


```javascript
async () => {
  return { bibtex: document.body.innerText || document.body.textContent || '' };
}
```


Step 2: Parse BibTeX and push to Zotero


Save the BibTeX data as JSON, then call the push script:

```bash
python "E:/gscholar-skills/.claude/skills/gs-export/scripts/push_to_zotero.py" /tmp/gs_papers.json
```

Before calling the script, construct a JSON file at `/tmp/gs_papers.json` containing paper data parsed from BibTeX. Parse the BibTeX yourself and create the JSON array:

```json
[
  {
    "pmid": "",
    "title": "The title from BibTeX",
    "authors": [
      {"lastName": "Smith", "firstName": "John"}
    ],
    "journal": "Journal Name",
    "journalAbbr": "",
    "pubdate": "2022",
    "volume": "14",
    "issue": "4",
    "pages": "1054",
    "doi": "",
    "pdfUrl": "https://example.com/paper.pdf",
    "abstract": "",
    "keywords": [],
    "language": "en",
    "pubtype": ["Journal Article"]
  }
]
```

IMPORTANT: Set `pdfUrl` from the search result's `fullTextUrl` field (the PDF link extracted by gs-search). The Python script will download the PDF and upload it to Zotero via `/connector/saveAttachment` (Zotero 7.x ignores attachments in saveItems). PDF download may fail for some publishers (403, JS-redirect); these are reported as "PDF skip".

BibTeX fields mapping:

- `@article{key,` → `itemType: journalArticle`
- `@inproceedings{key,` → `itemType: conferencePaper`
- `@book{key,` → `itemType: book`
- `title={...}` → `title`
- `author={Last1, First1 and Last2, First2}` → `authors` array
- `journal={...}` → `journal`
- `year={...}` → `pubdate`
- `volume={...}` → `volume`
- `number={...}` → `issue`
- `pages={...}` → `pages`
- `publisher={...}` → (included in extra or publisher field)
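The mapping above can be sketched as a small parser. This is a minimal illustration, assuming the single-entry, brace-delimited BibTeX that Google Scholar typically returns; `parseBibtex`, `readBraced`, and `ITEM_TYPES` are hypothetical names, and whether `push_to_zotero.py` actually reads the `itemType` or `publisher` keys is not stated in the source.

```javascript
// Entry-type mapping taken from the list above; unmapped types are left blank.
const ITEM_TYPES = { article: "journalArticle", inproceedings: "conferencePaper", book: "book" };

// Read the brace-delimited value that starts at text[start] === "{", honoring nested braces.
function readBraced(text, start) {
  let depth = 0;
  for (let i = start; i < text.length; i++) {
    if (text[i] === "{") depth++;
    else if (text[i] === "}" && --depth === 0) return text.slice(start + 1, i);
  }
  return text.slice(start + 1); // unbalanced input: take the rest
}

// Hypothetical helper: convert one BibTeX entry into the JSON record shape shown earlier.
// pdfUrl comes from the search result's fullTextUrl, not from BibTeX.
function parseBibtex(bibtex, pdfUrl = "") {
  const type = ((bibtex.match(/@(\w+)\s*\{/) || [])[1] || "").toLowerCase();
  const fields = {};
  const re = /(\w+)\s*=\s*\{/g;
  let m;
  while ((m = re.exec(bibtex)) !== null) {
    const raw = readBraced(bibtex, re.lastIndex - 1);
    fields[m[1].toLowerCase()] = raw.replace(/[{}]/g, "").replace(/\s+/g, " ").trim();
    re.lastIndex += raw.length; // skip the value so "=" inside it is not re-scanned
  }
  // "Last1, First1 and Last2, First2" -> [{lastName, firstName}, ...]
  const authors = (fields.author || "").split(/\s+and\s+/).filter(Boolean).map((name) => {
    const [last, first = ""] = name.split(",").map((s) => s.trim());
    return { lastName: last, firstName: first };
  });
  return {
    itemType: ITEM_TYPES[type] || "", // assumption: key name follows the mapping list
    pmid: "", title: fields.title || "", authors,
    journal: fields.journal || "", journalAbbr: "",
    pubdate: fields.year || "", volume: fields.volume || "",
    issue: fields.number || "", pages: fields.pages || "",
    publisher: fields.publisher || "", // assumption: passed through as its own key
    doi: "", pdfUrl, abstract: "", keywords: [], language: "en",
    pubtype: ["Journal Article"],
  };
}
```

Calling `JSON.stringify([parseBibtex(text, fullTextUrl)], null, 2)` would then give an array in the shape expected at `/tmp/gs_papers.json`. Author names without a comma end up entirely in `lastName`, which may need manual correction.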


Step 3: Report


Single paper:

```
Exported to Zotero from Google Scholar:
  Title: {title}
  Authors: {authors}
  Journal: {journal} ({year})
  Data-CID: {dataCid}
```

Batch:

```
Exported {count} papers to Zotero from Google Scholar:
  1. {title1} ({journal1}, {year1})
  2. {title2} ({journal2}, {year2})
  ...
```
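The two report templates could be filled in with a helper along these lines. This is only a sketch: `formatReport` is a hypothetical name, and it assumes each record carries the JSON fields from step 2 plus a `dataCid` property added by the caller.

```javascript
// Hypothetical helper: render the single-paper or batch report from exported records.
function formatReport(papers) {
  if (papers.length === 1) {
    const p = papers[0];
    const authors = p.authors
      .map((a) => `${a.firstName} ${a.lastName}`.trim())
      .join(", ");
    return [
      "Exported to Zotero from Google Scholar:",
      `  Title: ${p.title}`,
      `  Authors: ${authors}`,
      `  Journal: ${p.journal} (${p.pubdate})`,
      `  Data-CID: ${p.dataCid}`, // assumption: caller attaches the data-cid
    ].join("\n");
  }
  // Batch form: one numbered line per paper.
  const lines = papers.map(
    (p, i) => `  ${i + 1}. ${p.title} (${p.journal}, ${p.pubdate})`
  );
  return [
    `Exported ${papers.length} papers to Zotero from Google Scholar:`,
    ...lines,
  ].join("\n");
}
```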


Batch Export Optimization


For multiple papers, process sequentially to avoid CAPTCHA:

1. Get all BibTeX links in one evaluate_script call (fetch all cite dialogs)
2. Navigate to each BibTeX URL one at a time
3. Collect all BibTeX entries
4. Push all to Zotero in a single batch
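The first step of the sequence above might look like the loop below when run inside a single evaluate_script call. It is a sketch under stated assumptions: `collectBibtexLinks` is a hypothetical name, `fetchLink(cid)` stands for the step 1a fetch-and-parse logic reduced to returning the BibTeX URL, and the pause between requests is an added precaution whose value is not taken from the source.

```javascript
// Hypothetical helper: resolve BibTeX links for many data-cids strictly one at a time.
// fetchLink(cid) is injected so the loop can wrap the step 1a logic in the page
// or be exercised with a stub outside the browser.
async function collectBibtexLinks(cids, fetchLink, delayMs = 1000) {
  const results = [];
  for (const cid of cids) {
    try {
      const bibtexLink = await fetchLink(cid);
      results.push({ cid, bibtexLink: bibtexLink || "", error: "" });
    } catch (err) {
      // Keep going: one failed cite dialog should not abort the batch.
      results.push({ cid, bibtexLink: "", error: String(err.message || err) });
    }
    // Assumption: a short pause between requests; the delay value is illustrative.
    if (delayMs > 0) {
      await new Promise((resolve) => setTimeout(resolve, delayMs));
    }
  }
  return results;
}
```

Entries with an empty `bibtexLink` can be skipped in the navigation step and listed as failures in the final report.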


Notes


- Single paper export uses 4 tool calls: `evaluate_script` (cite dialog) + `navigate_page` (BibTeX URL) + `evaluate_script` (read BibTeX) + `bash python` (Zotero push)
- Batch export: 2N+2 tool calls for N papers (1 evaluate_script for all cite dialogs + N navigate + N evaluate + 1 bash)
- BibTeX links are on `scholar.googleusercontent.com`; CORS blocks fetch() to that host, so we use navigate_page instead
- Reuses `push_to_zotero.py` for Zotero Connector API communication
- Google Scholar BibTeX does NOT include abstract or DOI; these fields will be empty in Zotero
- After export, navigate back to the Google Scholar page: `navigate_page` with type `back`
