gget

Use this skill when a task needs a quick bioinformatics lookup across genomic reference databases with the `gget` CLI or Python package.

When to Use

  • Finding Ensembl IDs, gene metadata, transcript details, or sequences.
  • Running quick BLAST or BLAT lookups without building a full local pipeline.
  • Fetching reference genome links and annotations from Ensembl.
  • Querying protein structure, pathway, cancer, expression, or disease-association modules through a single interface.
  • Creating a reproducible first-pass evidence log before moving to heavier tools such as Biopython, Snakemake, Nextflow, BLAST+, or database-specific clients.
Use a dedicated workflow instead of `gget` when the task requires regulated clinical interpretation, high-throughput production pipelines, or fine-grained control over database versions and local indexes.

Installation

Use a clean Python environment.

```bash
python -m venv .venv
. .venv/bin/activate
python -m pip install --upgrade pip
python -m pip install --upgrade gget
gget --help
```

If `uv` is available:

```bash
uv venv
. .venv/bin/activate
uv pip install gget
```

Before relying on an older environment, upgrade `gget` and re-check the module docs. The upstream databases queried by `gget` change over time.
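
Because an older environment can silently carry an outdated release, it may help to read the installed version from Python before running queries. The following is a minimal sketch using the standard library's `importlib.metadata`. The helper name `installed_version` is hypothetical, and the sketch assumes the distribution is published under the name `gget`, as the `pip install` commands above suggest.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(distribution: str = "gget") -> Optional[str]:
    """Return the installed version string, or None if the distribution is absent."""
    try:
        return version(distribution)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_version()
    if found is None:
        print("gget is not installed in this environment")
    else:
        print(f"gget {found}")
```

Returning `None` instead of raising keeps the check usable as a precondition in scripts that should report a missing install rather than crash.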

Basic Patterns

CLI shape:

```bash
gget <module> [arguments] [options]
```

Python shape:

```python
import gget

result = gget.search(["BRCA1"], species="human")
print(result)
```

Common workflow:
  1. Identify the species, assembly, gene ID type, and database needed.
  2. Check the current module documentation for arguments.
  3. Run a small query first.
  4. Save output with an explicit filename and date.
  5. Record module name, version, arguments, and database assumptions.
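
Steps 4 and 5 of the workflow can be sketched with the standard library alone. This is an illustration under stated assumptions, not part of `gget`: the helper names `dated_output_path` and `run_record`, the `module-label-date.ext` naming scheme, and the record fields are all hypothetical choices.

```python
from datetime import date
from pathlib import Path
from typing import Dict, Optional


def dated_output_path(module: str, label: str, ext: str, day: Optional[date] = None) -> Path:
    """Build an explicit, dated filename such as 'search-brca1-2026-05-11.json'."""
    day = day or date.today()
    return Path(f"{module}-{label}-{day.isoformat()}.{ext}")


def run_record(module: str, version: str, arguments: Dict[str, str],
               assumptions: str, output: Path) -> Dict[str, object]:
    """Collect the details needed to replay a query later."""
    return {
        "module": module,
        "gget_version": version,
        "arguments": dict(arguments),
        "database_assumptions": assumptions,
        "output": str(output),
    }


if __name__ == "__main__":
    path = dated_output_path("search", "brca1", "json")
    # "x.y.z" is a placeholder; record the version actually installed.
    print(run_record("search", "x.y.z", {"species": "human"}, "Ensembl, current release", path))
```

Passing `day` explicitly makes the filename deterministic, which is convenient when re-running a query and comparing outputs across dates.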

Common Modules

Use current upstream docs for exact arguments. These modules are common first choices:
  • `gget search`: find Ensembl IDs from search terms.
  • `gget info`: retrieve gene and transcript metadata from Ensembl, UniProt, and NCBI for Ensembl IDs.
  • `gget seq`: fetch nucleotide or amino-acid sequences.
  • `gget ref`: retrieve reference genome download links.
  • `gget blast`: run a quick BLAST query.
  • `gget blat`: locate a sequence against supported genome assemblies.
  • `gget muscle`: run multiple sequence alignment.
  • `gget diamond`: run local sequence alignment against reference sequences.
  • `gget alphafold` and `gget pdb`: inspect protein-structure references.
  • `gget enrichr`, `gget opentargets`, `gget archs4`, `gget bgee`, `gget cbio`, and `gget cosmic`: explore enrichment, target, expression, cancer, and disease association data.
Do not assume every module supports every Python version or dependency set. Some optional scientific dependencies have narrower version support than the core package.
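
One way to see which modules an installed release exposes is to probe the package before building a workflow around it. The sketch below assumes each module is available as a top-level function on the package, as `gget.search` is in the Python examples in this document; the helper name `missing_functions` is hypothetical. It only checks that a function exists, not that its optional dependencies are installed.

```python
from importlib import import_module
from typing import Iterable, List


def missing_functions(package: str, names: Iterable[str]) -> List[str]:
    """List the names that are not callable attributes of the package."""
    names = list(names)
    try:
        module = import_module(package)
    except ImportError:
        # The package itself cannot be imported, so every name is missing.
        return names
    return [name for name in names if not callable(getattr(module, name, None))]


if __name__ == "__main__":
    wanted = ["search", "info", "seq", "blast"]
    print("Missing from this install:", missing_functions("gget", wanted))
```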

Quick Examples

Find genes:

```bash
gget search -s human brca1 dna repair -o brca1-search.json
```

Fetch gene metadata:

```bash
gget info ENSG00000012048 -o brca1-info.json
```

Fetch a sequence:

```bash
gget seq ENSG00000012048 -o brca1-seq.fa
```

Run a small BLAST query:

```bash
gget blast "MEEPQSDPSVEPPLSQETFSDLWKLLPEN" -l 10 -o blast-results.json
```

Python example:

```python
import gget

# Search Ensembl for human genes matching the search terms.
genes = gget.search(["BRCA1", "DNA repair"], species="human")
# Fetch metadata for the BRCA1 Ensembl gene ID.
info = gget.info(["ENSG00000012048"])
# Fetch the sequence for the same ID.
sequence = gget.seq("ENSG00000012048")
```
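
When the CLI examples need to be driven from a script and logged verbatim, building the command as an argument list avoids shell-quoting mistakes. This is a minimal sketch: the helper name `build_gget_command` is hypothetical, only the `-s` and `-o` options shown above are used, and it assumes the CLI accepts options after positional arguments, as the `gget <module> [arguments] [options]` shape suggests.

```python
import shlex
from typing import Dict, List, Sequence


def build_gget_command(module: str, arguments: Sequence[str], options: Dict[str, str]) -> List[str]:
    """Assemble 'gget <module> [arguments] [options]' as an argument list."""
    command = ["gget", module, *arguments]
    for flag, value in options.items():
        command.extend([flag, value])
    return command


if __name__ == "__main__":
    command = build_gget_command(
        "search",
        ["brca1", "dna", "repair"],
        {"-s": "human", "-o": "brca1-search.json"},
    )
    # shlex.join quotes each argument so the printed line can be pasted into a shell.
    print(shlex.join(command))
    # To execute it instead of printing: subprocess.run(command, check=True)
```

The printed string is suitable for a log entry, while the list form can be passed to `subprocess.run` without invoking a shell.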

Reproducibility Log

For scientific outputs, include enough metadata to replay the query.

```markdown
| Date | gget version | Module | Query | Species/assembly | Output | Notes |
| --- | --- | --- | --- | --- | --- | --- |
| 2026-05-11 | `gget --version` | search | `BRCA1 DNA repair` | human | `brca1-search.json` | Docs checked before run |
```

Also record:
  • Python version and environment manager.
  • Any optional dependency installed through `gget setup`.
  • Database-specific identifiers returned by the query.
  • Whether output is JSON, CSV, FASTA, or a DataFrame export.
  • Any failures that were resolved by upgrading `gget`.
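
Log rows in the table format above can be generated rather than typed by hand, which keeps the column order consistent across runs. A minimal sketch follows; the helper name `log_row` and the pipe-escaping choice are assumptions for illustration, and the column names are copied from the template.

```python
from typing import Sequence

# Column order copied from the log template in this section.
COLUMNS = ("Date", "gget version", "Module", "Query", "Species/assembly", "Output", "Notes")


def log_row(values: Sequence[str]) -> str:
    """Format one Markdown table row, one value per column."""
    if len(values) != len(COLUMNS):
        raise ValueError(f"expected {len(COLUMNS)} values, got {len(values)}")
    # Escape pipes so a value cannot break the table layout.
    cells = [str(value).replace("|", "\\|") for value in values]
    return "| " + " | ".join(cells) + " |"


if __name__ == "__main__":
    # "x.y.z" is a placeholder; record the version actually installed.
    print(log_row([
        "2026-05-11", "x.y.z", "search", "`BRCA1 DNA repair`",
        "human", "`brca1-search.json`", "Docs checked before run",
    ]))
```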

Review Checklist

  • Did you upgrade or verify the installed `gget` version?
  • Did you check the current upstream module docs before using arguments?
  • Is the species or assembly explicit?
  • Are identifiers preserved exactly, including Ensembl/UniProt prefixes?
  • Is the result labeled as database output rather than clinical interpretation?
  • Is the query reproducible from the saved command or Python snippet?
  • Are optional dependencies installed in an isolated environment?
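
The identifier check in the list above can be partly automated. The sketch below tests whether a string has the shape of an Ensembl stable ID without modifying it. It assumes the usual convention of `ENS`, an optional species prefix, a feature letter, eleven digits, and an optional version suffix, and covers only gene, transcript, protein, and exon IDs; identifiers from other databases need their own checks. The helper name `looks_like_ensembl_id` is hypothetical.

```python
import re

# ENS + optional species prefix (e.g. MUS) + feature letter + 11 digits + optional ".version".
ENSEMBL_ID = re.compile(r"ENS(?:[A-Z]{3,4})?[GTPE]\d{11}(?:\.\d+)?")


def looks_like_ensembl_id(identifier: str) -> bool:
    """Check the shape only; the identifier is never trimmed or re-cased."""
    return ENSEMBL_ID.fullmatch(identifier) is not None


if __name__ == "__main__":
    for candidate in ["ENSG00000012048", "ensg00000012048", " ENSG00000012048"]:
        print(repr(candidate), looks_like_ensembl_id(candidate))
```

The function deliberately rejects lower-cased or padded input instead of normalizing it, so a failed check points to an identifier that was altered somewhere upstream.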

References
