tooluniverse-structural-proteomics

Structural Proteomics for Drug Target Validation

Comprehensive structural data integration using ToolUniverse tools across PDB, AlphaFold, GPCRdb, SAbDab, and proteomics databases for drug target validation.

LOOK UP, DON'T GUESS

  • PDB structures/resolutions: PDBeSIFTS_get_best_structures and RCSBGraphQL_get_structure_summary
  • AlphaFold confidence: alphafold_get_summary
  • Ligands/affinities: PDBe_get_structure_ligands and BindingDB_get_ligands_by_uniprot
  • Druggability: ProteinsPlus_predict_binding_sites

COMPUTE, DON'T DESCRIBE

When analysis requires computation (statistics, data processing, scoring, enrichment), write and run Python code via Bash. Don't describe what you would do — execute it and report actual results. Use ToolUniverse tools to retrieve data, then Python (pandas, scipy, statsmodels, matplotlib) to analyze it.
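
As one illustration of computing rather than describing, the sketch below converts affinities reported in nM (Ki, Kd, or IC50) to negative-log molar values and summarizes them. It uses only the standard library so it runs anywhere; the function names and the example values are hypothetical and are not real measurements or ToolUniverse output.

```python
import math
import statistics
from typing import Dict, Iterable


def to_p_value(nanomolar: float) -> float:
    """Convert an affinity in nM to -log10 of the molar value (e.g. pKi)."""
    if nanomolar <= 0:
        raise ValueError("affinity must be positive")
    # 1 nM = 1e-9 M, so -log10(x * 1e-9) = 9 - log10(x)
    return 9.0 - math.log10(nanomolar)


def summarize_affinities(values_nm: Iterable[float]) -> Dict[str, float]:
    """Return count, median, and best (highest) negative-log affinity."""
    p_values = [to_p_value(v) for v in values_nm]
    if not p_values:
        raise ValueError("no affinity values supplied")
    return {
        "n": len(p_values),
        "median_p": statistics.median(p_values),
        "best_p": max(p_values),
    }


# Hypothetical Ki values in nM, for illustration only.
example_ki_nm = [1.0, 10.0, 100.0]
summary = summarize_affinities(example_ki_nm)
print(summary)
```

In practice the input list would come from a BindingDB tool call, and pandas or scipy could replace the standard-library statistics.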

Domain Reasoning

Resolution determines valid conclusions: <2 Å = atom positions visible; 2-3 Å = side chains reliable, drug design supported; >3 Å = backbone only, binding site unreliable. Do not over-interpret low-resolution structures.
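
The resolution bands above can be encoded as a small helper so that every structure gets the same treatment. This is a minimal sketch: the function name and band labels are hypothetical, and placing the exact boundary values (2.0 and 3.0) in the middle band is an assumption, since the rule only states <2, 2-3, and >3.

```python
def resolution_band(resolution_angstrom: float) -> str:
    """Map a resolution in angstroms to the conclusion it supports."""
    if resolution_angstrom <= 0:
        raise ValueError("resolution must be positive")
    if resolution_angstrom < 2.0:
        return "atom positions visible"
    if resolution_angstrom <= 3.0:
        # Assumption: the boundary values 2.0 and 3.0 fall in this band.
        return "side chains reliable, drug design supported"
    return "backbone only, binding site unreliable"


for value in (1.5, 2.4, 3.6):
    print(value, "->", resolution_band(value))
```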

Tool Inventory

PDB (RCSB)

  • RCSBAdvSearch_search_structures(query_type, query_value, rows)
  • RCSBData_get_entry(entry_id)
  • RCSBGraphQL_get_structure_summary(pdb_id)
  • RCSBGraphQL_get_ligand_info(pdb_id)
  • RCSB_get_chemical_component(comp_id)

PDB (PDBe)

  • pdbe_get_entry_summary(pdb_id)
  • PDBe_get_structure_ligands(pdb_id)
  • PDBe_get_bound_molecules(pdb_id)
  • PDBeSearch_search_structures(query, rows)
  • PDBeSIFTS_get_best_structures(uniprot_id)
  • PDBeSIFTS_get_all_structures(uniprot_id)
  • PDBe_KB_get_ligand_sites(pdb_id)
  • PDBe_KB_get_interface_residues(pdb_id)
  • PDBeValidation_get_quality_scores(pdb_id)

PDBe PISA

  • PDBePISA_get_interfaces(pdb_id)
  • PDBePISA_get_assemblies(pdb_id)

AlphaFold

  • alphafold_get_prediction(qualifier=UniProt)
  • alphafold_get_summary(qualifier)
  • alphafold_get_annotations(qualifier)

Binding Sites

  • ProteinsPlus_predict_binding_sites(pdb_id, chain)
  • BindingDB_get_ligands_by_uniprot(uniprot_id)
  • BindingDB_get_ligands_by_pdb(pdb_id)
  • BindingDB_get_targets_by_compound(smiles)

Foldseek

  • Foldseek_search_structure(sequence, mode="tmalign")
  • Foldseek_get_result(ticket)

GPCRdb

  • GPCRdb_get_protein(protein)
  • GPCRdb_get_structures(protein)
  • GPCRdb_get_ligands(protein)
  • GPCRdb_get_mutations(protein)

The protein argument accepts entry names, gene symbols (auto-converted to {symbol.lower()}_human), or UniProt accessions.
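
The gene-symbol conversion mentioned above is simple enough to show directly. The sketch below only illustrates the documented {symbol.lower()}_human rule; the tools are described as doing this conversion themselves, and the helper name is hypothetical.

```python
def gene_symbol_to_entry_name(symbol: str) -> str:
    """Apply the documented rule: lower-case the symbol and append '_human'."""
    cleaned = symbol.strip()
    if not cleaned:
        raise ValueError("gene symbol must not be empty")
    return f"{cleaned.lower()}_human"


print(gene_symbol_to_entry_name("ADRB2"))
```

Knowing the rule helps when checking which entry a gene-symbol query resolved to.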

SAbDab

  • SAbDab_search_structures(query/antigen)
  • SAbDab_get_structure(pdb_id)
  • TheraSAbDab_search_therapeutics(query)
  • TheraSAbDab_search_by_target(target)

Domains

  • InterPro_get_protein_domains(uniprot_id)
  • Pfam_get_protein_annotations(uniprot_id)
  • UniProt_get_entry_by_accession(accession)

Proteomics

  • ProteomeXchange_search_datasets(query)
  • ProteomeXchange_get_dataset(dataset_id)

Workflow 1: Find All Structures for a Drug Target

Phase 0: Resolve protein → UniProt ID, gene symbol, organism
Phase 1: PDBeSIFTS_get_best_structures → RCSBGraphQL_get_structure_summary → PDBeValidation
Phase 2: alphafold_get_prediction/summary → compare pLDDT with experimental coverage
Phase 3: IF GPCR → GPCRdb; IF antibody target → SAbDab/TheraSAbDab
Phase 4: InterPro/Pfam domain mapping → identify unresolved regions
Phase 5: Summary table (PDB ID, method, resolution, ligands, coverage, quality)
Decisions: Resolution <2.5 Å for drug design. For binding sites, prefer X-ray > cryo-EM > NMR > AlphaFold. Prefer holo over apo structures.
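
The decision rules above can be turned into a sort key for the Phase 5 summary table. This is a sketch under stated assumptions: the Structure record, the method labels, and the order in which the three rules are applied (method first, then holo over apo, then resolution) are illustrative choices, not something the workflow specifies, and the example entries are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional

# Method preference for binding-site work, as listed in the decisions above.
# The label strings are assumptions; real tool output may spell them differently.
METHOD_PRIORITY = {"X-ray": 0, "Cryo-EM": 1, "NMR": 2, "AlphaFold": 3}


@dataclass
class Structure:
    pdb_id: str
    method: str
    resolution: Optional[float]  # in angstroms; None when not applicable
    has_ligand: bool             # True for holo, False for apo


def rank_for_binding_site(structures: List[Structure]) -> List[Structure]:
    """Sort by method, then holo before apo, then best (lowest) resolution."""
    def key(s: Structure):
        return (
            METHOD_PRIORITY.get(s.method, len(METHOD_PRIORITY)),
            0 if s.has_ligand else 1,
            s.resolution if s.resolution is not None else float("inf"),
        )
    return sorted(structures, key=key)


def suitable_for_drug_design(s: Structure) -> bool:
    """Apply the <2.5 angstrom cutoff from the decisions above."""
    return s.resolution is not None and s.resolution < 2.5


# Hypothetical entries with placeholder identifiers.
entries = [
    Structure("M1", "NMR", None, True),
    Structure("X1", "X-ray", 2.8, False),
    Structure("X2", "X-ray", 1.9, True),
    Structure("E1", "Cryo-EM", 3.2, True),
]
ranked = rank_for_binding_site(entries)
print([s.pdb_id for s in ranked])
```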

Workflow 2: Identify Binding Pocket Ligands

Phase 1: PDBe_get_structure_ligands + RCSBGraphQL_get_ligand_info + PDBe_KB_get_ligand_sites
Phase 2: ProteinsPlus_predict_binding_sites → druggability score, pocket residues
Phase 3: BindingDB_get_ligands_by_pdb/uniprot → Ki, Kd, IC50
Phase 4: RCSB_get_chemical_component for key ligands
Filter artifacts: GOL, EDO, SO4, PEG, ACT, CL, NA. Keep cofactors (ATP, NAD, HEM) and catalytic metals (ZN, MG) if relevant.
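
The artifact filter can be sketched as a set lookup on chemical component IDs. The three sets below copy the codes listed above; the function names, the keep_* flags, and the example input (including the placeholder code "XYZ") are hypothetical.

```python
from typing import Iterable, List

# Codes taken from the filter rule above.
ARTIFACTS = frozenset({"GOL", "EDO", "SO4", "PEG", "ACT", "CL", "NA"})
COFACTORS = frozenset({"ATP", "NAD", "HEM"})
CATALYTIC_METALS = frozenset({"ZN", "MG"})


def classify_ligand(comp_id: str) -> str:
    """Label a component ID as artifact, cofactor, metal, or ligand."""
    code = comp_id.strip().upper()
    if code in ARTIFACTS:
        return "artifact"
    if code in COFACTORS:
        return "cofactor"
    if code in CATALYTIC_METALS:
        return "metal"
    return "ligand"


def filter_ligands(
    comp_ids: Iterable[str],
    keep_cofactors: bool = True,
    keep_metals: bool = True,
) -> List[str]:
    """Drop artifacts; keep cofactors and metals only when relevant."""
    kept = []
    for comp_id in comp_ids:
        code = comp_id.strip().upper()
        kind = classify_ligand(code)
        if kind == "artifact":
            continue
        if kind == "cofactor" and not keep_cofactors:
            continue
        if kind == "metal" and not keep_metals:
            continue
        kept.append(code)
    return kept


# "XYZ" stands in for an arbitrary drug-like ligand code.
print(filter_ligands(["GOL", "ATP", "ZN", "XYZ", "so4"]))
```

Whether a cofactor or metal is "relevant" still needs a judgment about the binding site in question, which is why the two flags are left to the caller.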

Workflow 3: Cross-Validate Drug Binding

Phase 1: Find co-crystal structures → filter for drug/analogs
Phase 2: BindingDB affinity data (Ki, Kd, IC50)
Phase 3: ProteinsPlus + PDBe-KB binding site characterization
Phase 4: PDBeValidation quality → binding site well-resolved?
Phase 5: AlphaFold + Foldseek structural comparison
Phase 6: GPCR-specific (if applicable) → active/inactive states, pharmacology, resistance mutations
Phase 7: Antibody-specific (if applicable) → epitope mapping
Phase 8: Evidence integration

Tool Parameter Gotchas

Tool | Mistake | Correct
alphafold_get_prediction/summary | uniprot_id | qualifier
GPCRdb_get_protein | gene_name | protein
PDBeSIFTS_get_best_structures | gene symbol | uniprot_id (e.g., "P04637")
Foldseek_search_structure | mode="3diaa" | mode="tmalign"
SAbDab_search_structures | name | query or antigen
RCSB_get_chemical_component | ligand_id | comp_id
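
The renaming gotchas above can be applied mechanically before a call is made. The sketch below is a hypothetical pre-flight helper, not part of ToolUniverse: it renames the mistaken keys and swaps the Foldseek mode. For SAbDab_search_structures it maps name to query, although antigen is equally valid. It does not handle the PDBeSIFTS case, because turning a gene symbol into a UniProt accession needs a lookup rather than a rename.

```python
from typing import Any, Dict

# Mistaken parameter name -> correct name, per tool, from the gotchas above.
PARAM_FIXES: Dict[str, Dict[str, str]] = {
    "alphafold_get_prediction": {"uniprot_id": "qualifier"},
    "alphafold_get_summary": {"uniprot_id": "qualifier"},
    "GPCRdb_get_protein": {"gene_name": "protein"},
    "SAbDab_search_structures": {"name": "query"},
    "RCSB_get_chemical_component": {"ligand_id": "comp_id"},
}


def fix_arguments(tool: str, arguments: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of arguments with known mistakes corrected."""
    renames = PARAM_FIXES.get(tool, {})
    fixed = {renames.get(key, key): value for key, value in arguments.items()}
    if tool == "Foldseek_search_structure" and fixed.get("mode") == "3diaa":
        fixed["mode"] = "tmalign"
    return fixed


print(fix_arguments("alphafold_get_summary", {"uniprot_id": "P04637"}))
```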

Evidence Grading

Tier | Confidence
T1 | Co-crystal (<2.5 Å) + binding affinity data
T2 | Experimental structure + computational prediction
T3 | AlphaFold + pocket analysis + known ligand analogs
T4 | Homology model or low-resolution only
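
The tiers can be assigned with a short rule cascade, checked from strongest to weakest. This is a minimal sketch: the Evidence record and its field names are hypothetical, and treating anything that misses T1-T3 as T4 is an assumption, since the tiers above do not say how to handle partial evidence.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Evidence:
    cocrystal_resolution: Optional[float] = None  # angstroms, if a co-crystal exists
    has_affinity_data: bool = False
    has_experimental_structure: bool = False
    has_computational_prediction: bool = False
    has_alphafold_model: bool = False
    has_pocket_analysis: bool = False
    has_ligand_analogs: bool = False


def evidence_tier(e: Evidence) -> str:
    """Return T1-T4 following the grading above, strongest match first."""
    if (
        e.cocrystal_resolution is not None
        and e.cocrystal_resolution < 2.5
        and e.has_affinity_data
    ):
        return "T1"
    if e.has_experimental_structure and e.has_computational_prediction:
        return "T2"
    if e.has_alphafold_model and e.has_pocket_analysis and e.has_ligand_analogs:
        return "T3"
    # Assumption: everything else falls back to the lowest tier.
    return "T4"


print(evidence_tier(Evidence(cocrystal_resolution=2.1, has_affinity_data=True)))
```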

Interpretation

Metric | High | Acceptable | Caution
Resolution | <2.0 Å (X-ray) / <3.0 Å (cryo-EM) | 2.0-2.5 Å / 3.0-4.0 Å | >3.0 Å / >4.5 Å
R-free | <0.25 | 0.25-0.30 | >0.30
AlphaFold pLDDT | >90 | 70-90 | <70 (disordered)

DoGSiteScorer >0.6 = druggable; <0.4 = unlikely druggable. PISA assemblies should be cross-validated with SEC-MALS/native MS.
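
The R-free, pLDDT, and DoGSiteScorer cutoffs above translate into three small rating functions. This is a sketch with hypothetical function names. Boundary values (0.25, 0.30, 70, 90) are placed in the "acceptable" band to match the stated ranges, and the "intermediate" label for DoGSiteScorer scores between 0.4 and 0.6 is an assumption, because that range is not named above.

```python
def rate_r_free(r_free: float) -> str:
    """<0.25 high, 0.25-0.30 acceptable, >0.30 caution."""
    if r_free < 0.25:
        return "high"
    if r_free <= 0.30:
        return "acceptable"
    return "caution"


def rate_plddt(plddt: float) -> str:
    """>90 high, 70-90 acceptable, <70 caution (likely disordered)."""
    if plddt > 90:
        return "high"
    if plddt >= 70:
        return "acceptable"
    return "caution"


def rate_druggability(score: float) -> str:
    """>0.6 druggable, <0.4 unlikely druggable, otherwise intermediate."""
    if score > 0.6:
        return "druggable"
    if score < 0.4:
        return "unlikely druggable"
    # Assumption: the 0.4-0.6 range has no label in the source.
    return "intermediate"


print(rate_r_free(0.22), rate_plddt(85.0), rate_druggability(0.7))
```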

Limitations

  • BindingDB: 60s+ for popular targets
  • AlphaFold: lacks ligand context
  • GPCRdb: Class A-F GPCRs only
  • PDBePISA: operation is internal, not a public parameter