tooluniverse-spatial-omics-analysis

Spatial Multi-Omics Analysis Pipeline

Comprehensive biological interpretation of spatial omics data. Transforms spatially variable genes (SVGs), domain annotations, and tissue context into actionable biological insights covering pathway enrichment, cell-cell interactions, druggable targets, immune microenvironment, and multi-modal integration.
KEY PRINCIPLES:
  1. Report-first approach - Create report file FIRST, then populate progressively
  2. Domain-by-domain analysis - Characterize each spatial region independently before comparison
  3. Gene-list-centric - Analyze user-provided SVGs and marker genes with ToolUniverse databases
  4. Biological interpretation - Go beyond statistics to explain biological meaning of spatial patterns
  5. Disease focus - Emphasize disease mechanisms and therapeutic opportunities when disease context is provided
  6. Evidence grading - Grade all evidence from T1 (direct human/clinical evidence) to T4 (annotation or prediction only)
  7. Multi-modal thinking - Integrate RNA, protein, and metabolite information when available
  8. Validation guidance - Suggest experimental validation approaches for key findings
  9. Source references - Every statement must cite tool/database source
  10. Completeness checklist - Mandatory section showing analysis coverage
  11. English-first queries - Always use English terms in tool calls. Respond in user's language

When to Use This Skill

Apply when users:
  • Provide spatially variable genes from spatial transcriptomics experiments
  • Ask about biological interpretation of spatial domains/clusters
  • Need pathway enrichment analysis of spatial gene expression data
  • Want to understand cell-cell interactions from spatial data
  • Ask about tumor microenvironment heterogeneity from spatial omics
  • Need druggable targets in specific spatial regions
  • Ask about tissue zonation patterns (liver, brain, kidney)
  • Want to integrate spatial transcriptomics + proteomics data
  • Ask about immune infiltration patterns from spatial data
  • Need to compare healthy vs disease regions spatially
  • Ask "What pathways are enriched in this tumor core vs tumor margin?"
  • Ask "What cell-cell interactions occur in this spatial domain?"
NOT for (use other skills instead):
  • Single gene interpretation without spatial context -> Use tooluniverse-target-research
  • Variant interpretation -> Use tooluniverse-variant-interpretation
  • Drug safety profiling -> Use tooluniverse-adverse-event-detection
  • Disease-only analysis without spatial data -> Use tooluniverse-multiomic-disease-characterization
  • GWAS analysis -> Use tooluniverse-gwas-* skills
  • Bulk RNA-seq (non-spatial) -> Use tooluniverse-systems-biology

Input Parameters

| Parameter | Required | Description | Example |
|---|---|---|---|
| svgs | Yes | Spatially variable genes (gene symbols) | ['EGFR', 'CDH1', 'VIM', 'MYC', 'CD3E'] |
| tissue_type | Yes | Tissue/organ type | brain, liver, lung, breast, skin |
| technology | No | Spatial omics platform used | 10x Visium, MERFISH, DBiTplus, SLIDE-seq |
| disease_context | No | Disease, if applicable | breast cancer, Alzheimer disease, liver cirrhosis |
| spatial_domains | No | Dict mapping domain name to marker genes | {'Tumor core': ['MYC','EGFR'], 'Stroma': ['VIM','COL1A1']} |
| cell_types | No | Cell types identified by deconvolution | ['Epithelial', 'T cell', 'Macrophage', 'Fibroblast'] |
| proteins | No | Proteins detected (if multi-modal) | ['CD3', 'CD8', 'PD-L1', 'Ki67'] |
| metabolites | No | Metabolites detected (if SpatialMETA) | ['glutamine', 'lactate', 'ATP'] |

Spatial Omics Integration Score (0-100)

Score Components

Data Completeness (0-30 points):
  • SVGs provided (>10 genes): 5 points
  • Disease context provided: 5 points
  • Spatial domains defined: 5 points
  • Cell type composition available: 5 points
  • Multi-modal data (protein/metabolite): 5 points
  • Literature context found: 5 points
Biological Insight (0-40 points):
  • Significant pathway enrichment (FDR < 0.05): 10 points
  • Cell-cell interaction predictions: 10 points
  • Disease mechanism identified: 10 points
  • Druggable targets found in disease regions: 10 points
Evidence Quality (0-30 points):
  • Cross-database validation (gene found in 3+ databases): 10 points
  • Clinical validation (approved drugs for spatial targets): 10 points
  • Literature support (PubMed evidence for spatial patterns): 10 points

Score Interpretation

| Score | Tier | Interpretation |
|---|---|---|
| 80-100 | Excellent | Comprehensive spatial characterization, strong biological insights, druggable targets identified |
| 60-79 | Good | Good pathway and interaction analysis, some disease/therapeutic context |
| 40-59 | Moderate | Basic enrichment complete, limited spatial domain comparison or interaction analysis |
| 0-39 | Limited | Minimal data, gene-level annotation only |
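
The scoring rubric above can be sketched as a small lookup plus a tier mapping. This is a minimal illustration, not part of the skill itself: the point values and tier boundaries come from the lists above, while the dictionary keys and the function name `integration_score` are hypothetical.

```python
# Point values copied from the Score Components list; key names are hypothetical.
COMPONENT_POINTS = {
    "svgs_provided": 5, "disease_context": 5, "spatial_domains": 5,
    "cell_types": 5, "multimodal_data": 5, "literature_context": 5,
    "pathway_enrichment": 10, "cell_cell_interactions": 10,
    "disease_mechanism": 10, "druggable_targets": 10,
    "cross_database_validation": 10, "clinical_validation": 10,
    "literature_support": 10,
}

# Lower bound of each tier, highest first (from the Score Interpretation table).
TIERS = [(80, "Excellent"), (60, "Good"), (40, "Moderate"), (0, "Limited")]


def integration_score(achieved):
    """Sum the points of every achieved component and map the total to a tier."""
    unknown = set(achieved) - set(COMPONENT_POINTS)
    if unknown:
        raise ValueError(f"Unknown score components: {sorted(unknown)}")
    total = sum(COMPONENT_POINTS[name] for name in set(achieved))
    tier = next(label for floor, label in TIERS if total >= floor)
    return total, tier


score, tier = integration_score([
    "svgs_provided", "disease_context", "spatial_domains",
    "pathway_enrichment", "cell_cell_interactions", "druggable_targets",
])
print(f"Score: {score}/100 - [{tier}]")
```

Each component is treated as all-or-nothing here; partial credit would need a different structure.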

Evidence Grading System

| Tier | Symbol | Criteria | Examples |
|---|---|---|---|
| T1 | [T1] | Direct human evidence, clinical proof | FDA-approved drug for spatial target, validated biomarker |
| T2 | [T2] | Experimental evidence | Validated spatial pattern in literature, known ligand-receptor pair |
| T3 | [T3] | Computational/database evidence | PPI network prediction, pathway enrichment, expression correlation |
| T4 | [T4] | Annotation/prediction only | GO annotation, text-mined association, predicted interaction |

Report Template

Create this file structure at the start:
{tissue}_{disease}_spatial_omics_report.md

Spatial Multi-Omics Analysis Report: {Tissue Type}

Report Generated: {date}
Technology: {platform}
Tissue: {tissue_type}
Disease Context: {disease or "Normal tissue"}
Total SVGs Analyzed: {count}
Spatial Domains: {count}
Spatial Omics Integration Score: (to be calculated)

Executive Summary

(2-3 sentence synthesis of key spatial findings - fill after all phases complete)

1. Tissue & Disease Context

Tissue Information

| Property | Value | Source |
|---|---|---|
| Tissue type | | |
| Disease | | |
| Expected cell types | | HPA |

Disease Identifiers (if applicable)

| System | ID | Source |
|---|---|---|

Sources: (tools used)

2. Spatially Variable Gene Characterization

2.1 Gene ID Resolution

| Gene Symbol | Ensembl ID | Entrez ID | UniProt | Function | Source |
|---|---|---|---|---|---|

2.2 Tissue Expression Patterns

| Gene | Tissue Expression | Specificity | Source |
|---|---|---|---|

2.3 Subcellular Localization

| Gene | Location | Confidence | Source |
|---|---|---|---|

2.4 Disease Associations

| Gene | Disease | Score | Evidence | Source |
|---|---|---|---|---|

Sources: (tools used)

3. Pathway Enrichment Analysis

3.1 STRING Functional Enrichment

| Category | Term | Description | P-value | FDR | Genes | Source |
|---|---|---|---|---|---|---|

3.2 Reactome Pathway Analysis

| Pathway ID | Name | P-value | FDR | Genes Found | Total Genes | Source |
|---|---|---|---|---|---|---|

3.3 GO Biological Processes

| GO Term | Description | P-value | FDR | Genes | Source |
|---|---|---|---|---|---|

3.4 GO Molecular Functions

| GO Term | Description | P-value | FDR | Genes | Source |
|---|---|---|---|---|---|

3.5 GO Cellular Components

| GO Term | Description | P-value | FDR | Genes | Source |
|---|---|---|---|---|---|

Pathway Summary

  • Top enriched pathways:
  • Key biological processes:
  • Spatial pathway implications:
Sources: (tools used)

4. Spatial Domain Characterization

Domain: {domain_name}

Marker Genes

| Gene | Function | Pathways | Source |
|---|---|---|---|

Enriched Pathways (domain-specific)

| Pathway | P-value | FDR | Genes | Source |
|---|---|---|---|---|

Cell Type Signature

| Cell Type | Marker Genes Present | Confidence |
|---|---|---|

Biological Interpretation

(Narrative interpretation of this domain)
(Repeat for each domain)

4.N Domain Comparison

| Feature | Domain 1 | Domain 2 | Domain 3 |
|---|---|---|---|
| Top pathway | | | |
| Cell types | | | |
| Disease relevance | | | |

Sources: (tools used)

5. Cell-Cell Interaction Inference

5.1 Protein-Protein Interactions (STRING)

| Protein A | Protein B | Score | Type | Source |
|---|---|---|---|---|

5.2 Ligand-Receptor Pairs

| Ligand | Receptor | Domain (Ligand) | Domain (Receptor) | Evidence | Source |
|---|---|---|---|---|---|

5.3 Signaling Pathways

| Pathway | Components in Data | Spatial Distribution | Source |
|---|---|---|---|

5.4 Interaction Network Summary

  • Key interaction hubs:
  • Cross-domain interactions:
  • Predicted cell-cell communication axes:
Sources: (tools used)

6. Disease & Therapeutic Context

6.1 Disease Gene Overlap

| Gene | Disease Association Score | Evidence Type | Source |
|---|---|---|---|

6.2 Druggable Targets in Spatial Domains

| Gene | Domain | Tractability | Modality | Approved Drugs | Source |
|---|---|---|---|---|---|

6.3 Drug Mechanisms Relevant to Spatial Targets

| Drug | Target | Mechanism | Phase | Source |
|---|---|---|---|---|

6.4 Clinical Trials

| NCT ID | Title | Target Gene | Phase | Status | Source |
|---|---|---|---|---|---|

Therapeutic Summary

  • Druggable genes in disease regions:
  • Approved therapies:
  • Pipeline drugs:
  • Novel opportunities:
Sources: (tools used)

7. Multi-Modal Integration

7.1 Protein-RNA Concordance (if protein data available)

| Gene/Protein | RNA Pattern | Protein Pattern | Concordance | Source |
|---|---|---|---|---|

7.2 Subcellular Context

| Gene | mRNA Location (spatial) | Protein Location (HPA) | Concordance | Source |
|---|---|---|---|---|

7.3 Metabolic Context (if metabolomics available)

| Gene | Metabolic Pathway | Metabolites Detected | Spatial Pattern | Source |
|---|---|---|---|---|

Sources: (tools used)

8. Immune Microenvironment (if relevant)

8.1 Immune Cell Markers

| Cell Type | Marker Genes | Spatial Domain | Source |
|---|---|---|---|

8.2 Immune Checkpoint Expression

| Checkpoint | Gene | Expression Pattern | Source |
|---|---|---|---|

8.3 Tumor-Immune Interface (if cancer)

| Feature | Finding | Evidence | Source |
|---|---|---|---|

Immune Summary

  • Immune infiltration pattern:
  • Key immune checkpoints:
  • Immunotherapy implications:
Sources: (tools used)

9. Literature & Validation Context

9.1 Literature Evidence

| PMID | Title | Relevance | Year | Source |
|---|---|---|---|---|

9.2 Known Spatial Patterns

(Known tissue architecture/zonation from literature)

9.3 Validation Recommendations

| Priority | Gene/Target | Method | Rationale |
|---|---|---|---|
| High | | IHC / smFISH | |
| Medium | | IF / ISH | |

Sources: (tools used)

Spatial Omics Integration Score

| Component | Points | Max | Details |
|---|---|---|---|
| SVGs provided | | 5 | |
| Disease context | | 5 | |
| Spatial domains | | 5 | |
| Cell types | | 5 | |
| Multi-modal data | | 5 | |
| Literature context | | 5 | |
| Pathway enrichment | | 10 | |
| Cell-cell interactions | | 10 | |
| Disease mechanism | | 10 | |
| Druggable targets | | 10 | |
| Cross-database validation | | 10 | |
| Clinical validation | | 10 | |
| Literature support | | 10 | |
| TOTAL | | 100 | |

Score: XX/100 - [Tier]

Completeness Checklist

  • Gene ID resolution complete
  • Tissue expression patterns analyzed (HPA)
  • Subcellular localization checked (HPA)
  • Pathway enrichment complete (STRING + Reactome)
  • GO enrichment complete (BP + MF + CC)
  • Spatial domains characterized individually
  • Domain comparison performed
  • Protein-protein interactions analyzed (STRING)
  • Ligand-receptor pairs identified
  • Disease associations checked (OpenTargets)
  • Druggable targets identified (OpenTargets tractability)
  • Drug mechanisms reviewed
  • Multi-modal integration performed (if data available)
  • Immune microenvironment characterized (if relevant)
  • Literature search completed
  • Validation recommendations provided
  • Spatial Omics Integration Score calculated
  • Executive summary written
  • All sections have source citations

References

Data Sources Used

| # | Tool | Parameters | Section | Items Retrieved |
|---|---|---|---|---|

Database Versions

  • OpenTargets: (current)
  • STRING: v12.0
  • Reactome: (current)
  • HPA: (current)
  • GTEx: v10

---

Phase 0: Input Processing & Disambiguation (ALWAYS FIRST)

Objective: Parse user input, resolve tissue/disease identifiers, establish analysis context.

Tools Used

OpenTargets_get_disease_id_description_by_name (if disease context provided):
  • Input: diseaseName (string) - Disease name
  • Output: {data: {search: {hits: [{id, name, description}]}}}
  • Use: Get MONDO/EFO IDs for disease queries
OpenTargets_get_disease_description_by_efoId:
  • Input: efoId (string) - Disease ID (e.g., MONDO_0007254)
  • Output: {data: {disease: {id, name, description, dbXRefs}}}
  • Use: Get full disease description
HPA_search_genes_by_query (tissue cell type context):
  • Input: query (string) - Search term
  • Output: List of gene entries matching query
  • Use: Verify tissue-relevant genes

Workflow

  1. Parse SVG list from user input (ensure valid gene symbols)
  2. Identify tissue type and map to standard ontology term
  3. If disease provided, resolve to MONDO/EFO ID using OpenTargets
  4. Get disease description and cross-references
  5. Determine analysis scope:
    • Cancer? -> Include immune microenvironment, somatic mutations, druggable targets
    • Neurological? -> Include brain region specificity, neuronal markers
    • Metabolic? -> Include metabolic zonation, enzyme distribution
    • Normal tissue? -> Focus on tissue architecture and cell type composition
  6. Set up report file with header information

Decision Logic

  • Cancer tissue: Enable immune microenvironment phase, CIViC/cBioPortal queries, immuno-oncology analysis
  • Normal tissue: Skip disease phases, focus on tissue zonation and cell type composition
  • Liver/kidney/brain: Enable zonation-specific analysis
  • No disease context: Proceed with tissue biology only
  • Small gene list (<20): Warn about limited enrichment power, emphasize gene-level analysis
  • Large gene list (>500): Suggest filtering to top SVGs by significance before enrichment

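
The decision logic above could be encoded roughly as follows. This is a sketch under stated assumptions: the function name `plan_analysis`, the flag names, and the caller-supplied `is_cancer` argument are all hypothetical. The skill does not specify how a disease is classified as cancer, so that decision is left to the caller.

```python
# Tissues for which the decision logic enables zonation-specific analysis.
ZONATED_TISSUES = {"liver", "kidney", "brain"}


def plan_analysis(svgs, tissue_type, disease_context=None, is_cancer=False):
    """Return which optional analyses to enable, plus gene-list size warnings."""
    has_disease = bool(disease_context)
    plan = {
        "disease_phases": has_disease,                         # skipped for normal tissue
        "immune_microenvironment": has_disease and is_cancer,  # cancer tissue only
        "zonation_analysis": tissue_type.lower() in ZONATED_TISSUES,
        "warnings": [],
    }
    if len(svgs) < 20:
        plan["warnings"].append(
            "Small gene list (<20): limited enrichment power; emphasize gene-level analysis"
        )
    elif len(svgs) > 500:
        plan["warnings"].append(
            "Large gene list (>500): filter to top SVGs by significance before enrichment"
        )
    return plan


print(plan_analysis(["EGFR", "CDH1", "VIM", "MYC", "CD3E"], "breast",
                    disease_context="breast cancer", is_cancer=True))
```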

Phase 1: Gene Characterization

Objective: Resolve gene identifiers, annotate functions, tissue specificity, and subcellular localization.

Tools Used

MyGene_query_genes (gene ID resolution):
  • Input: query (string) - Gene symbol
  • Output: {hits: [{_id, symbol, name, ensembl: {gene}, entrezgene}]}
  • Use: Resolve gene symbol to Ensembl ID, Entrez ID
  • NOTE: First hit may not be exact match - filter by symbol field
UniProt_get_function_by_accession (gene function):
  • Input: accession (string) - UniProt accession
  • Output: List of function description strings
  • Use: Get protein function annotation
UniProt_get_subcellular_location_by_accession (protein localization):
  • Input: accession (string)
  • Output: Subcellular location information
  • Use: Where the protein is located in the cell
HPA_get_subcellular_location (validated localization):
  • Input: gene_name (string) - Gene symbol
  • Output: {gene_name, main_locations: [], additional_locations: [], location_summary}
  • Use: Experimentally validated protein subcellular location
HPA_get_rna_expression_by_source (tissue expression):
  • Input: gene_name (string), source_type (string: 'tissue'), source_name (string)
  • Output: {data: {gene_name, source_type, source_name, expression_value, expression_level}}
  • Use: Check expression in the specific tissue of interest
  • NOTE: All 3 parameters are REQUIRED
HPA_get_comprehensive_gene_details_by_ensembl_id (full HPA data):
  • Input: ensembl_id (string), include_isoforms (bool), include_images (bool), include_antibodies (bool), include_expression (bool) - ALL 5 parameters REQUIRED
  • Output: {ensembl_id, gene_name, uniprot_ids, summary, protein_classes, tissue_expression, cell_line_expression, ...}
  • Use: One-stop gene characterization from HPA
  • NOTE: Use include_expression=True for tissue data; set others to False for faster response
HPA_get_cancer_prognostics_by_gene (cancer prognosis):
  • Input: ensembl_id (string) - Ensembl gene ID (NOT gene_name)
  • Output: {gene_name, prognostic_cancers_count, prognostic_summary: [{cancer_type, prognostic_type, p_value}]}
  • Use: Prognostic significance in cancer (if cancer context)
UniProtIDMap_gene_to_uniprot (ID mapping):
  • Input: gene_name (string), organism (string, default 'human')
  • Output: UniProt accession for the gene
  • Use: Map gene symbol to UniProt accession
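
The MyGene note about inexact first hits can be handled with a small filter over the response. The sketch below works on a plain dictionary shaped like the documented output and makes no tool call. The helper names `pick_exact_hit` and `extract_ids` are hypothetical, the mock response is illustrative rather than real tool output, and the list-valued `ensembl` case is an assumption about how multi-mapping symbols might be returned.

```python
def pick_exact_hit(response, symbol):
    """Return the first hit whose `symbol` equals the query (case-insensitive), else None."""
    for hit in response.get("hits", []):
        if str(hit.get("symbol", "")).upper() == symbol.upper():
            return hit
    return None


def extract_ids(hit):
    """Pull the identifiers needed for the Gene ID Resolution table from one hit."""
    ensembl = hit.get("ensembl") or {}
    # Assumption: `ensembl` may be a list when a symbol maps to several Ensembl genes.
    if isinstance(ensembl, list):
        ensembl = ensembl[0] if ensembl else {}
    return {
        "symbol": hit.get("symbol"),
        "ensembl_id": ensembl.get("gene"),
        "entrez_id": hit.get("entrezgene"),
    }


# Illustrative mock in the documented shape; the first hit is deliberately not an exact match.
mock_response = {
    "hits": [
        {"symbol": "EGFR-AS1"},
        {"symbol": "EGFR", "ensembl": {"gene": "ENSG00000146648"}, "entrezgene": 1956},
    ]
}

hit = pick_exact_hit(mock_response, "EGFR")
print(extract_ids(hit) if hit else "No exact symbol match; flag for manual review")
```

Returning `None` rather than falling back to the first hit keeps unresolved symbols visible in the report.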

Workflow

  1. For each SVG (batch if >20, sample top genes):
    a. Query MyGene to get Ensembl ID, Entrez ID
    b. Map to UniProt accession
    c. Get subcellular location from HPA
    d. Get tissue expression from HPA
    e. If cancer: check cancer prognostics
  2. Compile gene characterization table
  3. Identify genes with tissue-specific expression
  4. Note genes with nuclear vs membrane vs secreted localization (relevant for spatial patterns)

Batch Strategy for Large Gene Lists

  • Up to 50 genes: Characterize all individually
  • 51-200 genes: Characterize top 50 by priority (known disease genes first), summarize rest
  • More than 200 genes: Characterize top 30, use enrichment for the full list
  • Always run pathway enrichment on the FULL list regardless
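
A minimal sketch of the batch strategy follows. It assumes the input list is already ordered by spatial significance. The function name `select_for_characterization` and the `known_disease_genes` argument are hypothetical. Applying the disease-gene priority to the largest lists as well is an assumption, as is treating lists of exactly 50 or 200 genes as belonging to the smaller bracket.

```python
def select_for_characterization(svgs, known_disease_genes=()):
    """Split SVGs into genes to characterize individually and genes to summarize."""
    disease = set(known_disease_genes)
    # Stable sort: known disease genes first, input order preserved within each group.
    ranked = sorted(svgs, key=lambda gene: gene not in disease)
    if len(ranked) <= 50:
        limit = len(ranked)   # characterize all individually
    elif len(ranked) <= 200:
        limit = 50            # top 50, summarize the rest
    else:
        limit = 30            # top 30, rely on enrichment for the rest
    return {
        "characterize": ranked[:limit],
        "summarize": ranked[limit:],
        "enrich": list(svgs),  # pathway enrichment always uses the FULL list
    }


genes = [f"GENE{i}" for i in range(120)]
selection = select_for_characterization(genes, known_disease_genes=["GENE119"])
print(len(selection["characterize"]), len(selection["summarize"]), len(selection["enrich"]))
```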

Phase 2: Pathway & Functional Enrichment

Objective: Identify biological pathways and functions enriched in SVGs and per-domain gene sets.

Tools Used

STRING_functional_enrichment (primary enrichment):
  • Input: protein_ids (array of gene symbols), species (int, 9606 for human)
  • Output: {status: 'success', data: [{category, term, number_of_genes, number_of_genes_in_background, p_value, fdr, description, inputGenes, preferredNames}]}
  • Use: Comprehensive enrichment across GO, KEGG, Reactome, COMPARTMENTS, DISEASES
  • Categories: Process (GO:BP), Function (GO:MF), Component (GO:CC), KEGG, Reactome, COMPARTMENTS, DISEASES, Keyword, PMID
  • NOTE: This is the PRIMARY enrichment tool. Returns all categories in one call
ReactomeAnalysis_pathway_enrichment (Reactome-specific):
  • Input: identifiers (string, space-separated gene symbols, NOT array)
  • Output: {data: {token, pathways_found, pathways: [{pathway_id, name, p_value, fdr, entities_found, entities_total}]}}
  • Use: Detailed Reactome pathway analysis with hierarchy
  • NOTE: identifiers is a SPACE-SEPARATED STRING, not array
Reactome_map_uniprot_to_pathways (individual gene):
  • Input: id (string) - UniProt accession
  • Output: Plain list of pathway objects (no data wrapper)
  • Use: Map individual proteins to Reactome pathways
GO_get_annotations_for_gene (individual gene GO):
  • Input: gene_id (string) - Gene symbol or ID
  • Output: Plain list of GO annotation objects
  • Use: Get GO annotations for individual genes
kegg_search_pathway (KEGG pathway search):
  • Input: query (string) - Pathway name or keyword
  • Output: Pathway search results
  • Use: Find KEGG pathways relevant to spatial findings
WikiPathways_search (WikiPathways):
  • Input: query (string) - Search term
  • Output: WikiPathways search results
  • Use: Additional pathway context

Workflow

  1. Global SVG enrichment: Run STRING_functional_enrichment on ALL SVGs
    • Filter results by FDR < 0.05
    • Separate by category (Process, Function, Component, KEGG, Reactome)
    • Report top 10-15 per category
  2. Reactome detailed analysis: Run ReactomeAnalysis_pathway_enrichment
    • Report top pathways with FDR < 0.05
  3. Per-domain enrichment (if spatial domains provided):
    • Run STRING_functional_enrichment on each domain's gene set
    • Compare enriched pathways across domains
    • Identify domain-specific vs shared pathways
  4. Compile pathway tables: Merge results from all enrichment tools
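
The filtering and grouping steps of this workflow, and the differing input formats of the two enrichment tools, can be sketched on plain dictionaries. No tool is called here. The helper names `top_terms_by_category` and `reactome_identifiers` are hypothetical, and the mock response uses placeholder term names and FDR values rather than real results.

```python
from collections import defaultdict


def top_terms_by_category(response, fdr_cutoff=0.05, top_n=15):
    """Keep terms with FDR below the cutoff and return the top N per category."""
    grouped = defaultdict(list)
    for row in response.get("data", []):
        if row["fdr"] < fdr_cutoff:
            grouped[row["category"]].append(row)
    return {
        category: sorted(rows, key=lambda r: r["fdr"])[:top_n]
        for category, rows in grouped.items()
    }


def reactome_identifiers(genes):
    """ReactomeAnalysis_pathway_enrichment expects one space-separated string, not an array."""
    return " ".join(genes)


# Placeholder rows in the documented STRING_functional_enrichment output shape.
mock_enrichment = {
    "status": "success",
    "data": [
        {"category": "Process", "term": "TERM_A", "fdr": 0.001},
        {"category": "Process", "term": "TERM_B", "fdr": 0.2},
        {"category": "KEGG", "term": "TERM_C", "fdr": 0.03},
        {"category": "Process", "term": "TERM_D", "fdr": 0.0004},
    ],
}

top = top_terms_by_category(mock_enrichment)
for category, rows in top.items():
    print(category, [row["term"] for row in rows])
print(reactome_identifiers(["EGFR", "CDH1", "VIM"]))
```

The same `top_terms_by_category` call can be reused per domain to compare enriched categories across domains.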

Enrichment Interpretation

  • Signaling pathways (RTK, Wnt, Notch, Hedgehog): Cell-cell communication
  • Metabolic pathways: Tissue metabolic zonation
  • Immune pathways: Immune infiltration/exclusion
  • ECM/adhesion pathways: Tissue structure and remodeling
  • Cell cycle/proliferation: Growth zones
  • Apoptosis/stress: Damage zones

Phase 3: Spatial Domain Characterization

Objective: Characterize each spatial domain biologically and compare between domains.

Tools Used

Uses the same tools as Phase 2 (STRING_functional_enrichment, ReactomeAnalysis) applied per-domain, plus:
HPA_get_biological_processes_by_gene (per-gene processes):
  • Input: gene_name (string)
  • Output: Biological processes associated with the gene
  • Use: Annotate domain marker genes
HPA_get_protein_interactions_by_gene (gene interactions):
  • Input: gene_name (string)
  • Output: Known protein interaction partners
  • Use: Build domain-specific interaction context

Workflow

  1. For each spatial domain:
    a. Get marker gene list
    b. Run STRING_functional_enrichment on domain genes
    c. Identify top pathways, GO terms
    d. Assign likely cell type(s) based on marker genes:
      • Epithelial: CDH1, EPCAM, KRT18, KRT19
      • Mesenchymal/Fibroblast: VIM, COL1A1, COL3A1, FAP, ACTA2
      • Immune T cell: CD3E, CD3D, CD4, CD8A, CD8B
      • Immune B cell: CD19, MS4A1 (CD20), CD79A
      • Macrophage: CD68, CD163, CSF1R
      • Endothelial: PECAM1, VWF, CDH5
      • Neuronal: SNAP25, SYP, MAP2, NEFL
      • Hepatocyte: ALB, HNF4A, CYP3A4
    e. Generate biological interpretation narrative
  2. Compare domains:
    • Differential pathways
    • Unique vs shared genes
    • Disease-relevant vs homeostatic regions
    • Transition zones (shared genes between adjacent domains)
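
The "unique vs shared genes" comparison reduces to set operations on the `spatial_domains` input. The sketch below uses the hypothetical helper name `compare_domains`. Because spatial adjacency is not part of the input, shared genes only flag candidate transition zones, and which domains actually border each other has to come from the user's data.

```python
def compare_domains(spatial_domains):
    """Return genes unique to each domain and genes shared by each pair of domains."""
    gene_sets = {name: set(genes) for name, genes in spatial_domains.items()}

    unique = {}
    for name, genes in gene_sets.items():
        others = set().union(*(g for n, g in gene_sets.items() if n != name))
        unique[name] = sorted(genes - others)

    shared = {}
    names = sorted(gene_sets)
    for i, first in enumerate(names):
        for second in names[i + 1:]:
            common = gene_sets[first] & gene_sets[second]
            if common:
                shared[(first, second)] = sorted(common)
    return unique, shared


unique, shared = compare_domains({
    "Tumor core": ["MYC", "EGFR", "VIM"],
    "Stroma": ["VIM", "COL1A1"],
})
print("Unique:", unique)
print("Shared (candidate transition markers):", shared)
```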

Cell Type Assignment Rules

When user does not provide cell type annotations, infer from marker genes:
  • Check each gene against known cell type markers
  • Use HPA tissue/cell type expression data for validation
  • Report confidence level (high: 3+ markers match, medium: 2 markers, low: 1 marker)

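
The marker-matching rule and its confidence levels can be sketched as below. The marker sets are the ones listed in the Phase 3 workflow, with CD20 represented by its gene symbol MS4A1. The function name `assign_cell_types` is hypothetical, and the HPA validation step is not included.

```python
# Marker sets taken from the Phase 3 workflow list.
CELL_TYPE_MARKERS = {
    "Epithelial": {"CDH1", "EPCAM", "KRT18", "KRT19"},
    "Mesenchymal/Fibroblast": {"VIM", "COL1A1", "COL3A1", "FAP", "ACTA2"},
    "Immune T cell": {"CD3E", "CD3D", "CD4", "CD8A", "CD8B"},
    "Immune B cell": {"CD19", "MS4A1", "CD79A"},
    "Macrophage": {"CD68", "CD163", "CSF1R"},
    "Endothelial": {"PECAM1", "VWF", "CDH5"},
    "Neuronal": {"SNAP25", "SYP", "MAP2", "NEFL"},
    "Hepatocyte": {"ALB", "HNF4A", "CYP3A4"},
}


def assign_cell_types(domain_genes):
    """Match domain genes against marker sets; confidence follows the 3+/2/1 rule."""
    genes = {gene.upper() for gene in domain_genes}
    calls = []
    for cell_type, markers in CELL_TYPE_MARKERS.items():
        present = sorted(genes & markers)
        if not present:
            continue
        if len(present) >= 3:
            confidence = "high"
        elif len(present) == 2:
            confidence = "medium"
        else:
            confidence = "low"
        calls.append({"cell_type": cell_type, "markers_present": present,
                      "confidence": confidence})
    # Best-supported cell types first.
    return sorted(calls, key=lambda call: len(call["markers_present"]), reverse=True)


for call in assign_cell_types(["CD3E", "CD3D", "CD8A", "CD68", "VIM", "COL1A1"]):
    print(call)
```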

Phase 4: Cell-Cell Interaction Inference

Objective: Predict cell-cell communication from spatial gene expression patterns.

Tools Used

STRING_get_interaction_partners (PPI network):
  • Input: protein_ids (array), species (int, 9606), limit (int), confidence_score (float, 0.7)
  • Output: {status: 'success', data: [{preferredName_A, preferredName_B, score, nscore, fscore, pscore, ascore, escore, dscore, tscore}]}
  • Use: Find protein-protein interactions among SVGs
  • Score types: nscore=neighborhood, fscore=fusion, pscore=phylogenetic, ascore=coexpression, escore=experimental, dscore=database, tscore=textmining
STRING_get_protein_interactions (pairwise interactions):
  • Input: protein_ids (array), species (int, 9606)
  • Output: Interaction data between specified proteins
  • Use: Get interactions within a specific gene set
intact_search_interactions (IntAct database):
  • Input: query (string), max (int)
  • Output: Interaction data from IntAct
  • Use: Complement STRING with IntAct interactions
Reactome_get_interactor (Reactome interactions):
  • Input: Protein/gene identifier
  • Output: Reactome interaction data
  • Use: Pathway-level interaction context
DGIdb_get_drug_gene_interactions (drug-gene interactions):
  • Input: genes (array of strings)
  • Output: Drug-gene interaction data
  • Use: Identify druggable interaction nodes

Ligand-Receptor Analysis

配体-受体分析

Known ligand-receptor pairs to check in SVG list:
  • Growth factors: EGF-EGFR, HGF-MET, VEGF-KDR, FGF-FGFR, PDGF-PDGFRA/B
  • Cytokines: TNF-TNFR, IL6-IL6R, IFNG-IFNGR, TGFB1-TGFBR1/2
  • Chemokines: CXCL12-CXCR4, CCL2-CCR2, CXCL10-CXCR3
  • Immune checkpoints: CD274(PD-L1)-PDCD1(PD-1), CD80/CD86-CTLA4, LGALS9-HAVCR2(TIM-3)
  • Notch signaling: DLL1/3/4-NOTCH1/2/3/4, JAG1/2-NOTCH1/2
  • Wnt signaling: WNT ligands-FZD receptors
  • Adhesion: CDH1-CDH1 (homotypic), ITGA/B integrins-ECM
  • Hedgehog: SHH-PTCH1
需要在SVG列表中检查的已知配体-受体对:
  • 生长因子: EGF-EGFR、HGF-MET、VEGF-KDR、FGF-FGFR、PDGF-PDGFRA/B
  • 细胞因子: TNF-TNFR、IL6-IL6R、IFNG-IFNGR、TGFB1-TGFBR1/2
  • 趋化因子: CXCL12-CXCR4、CCL2-CCR2、CXCL10-CXCR3
  • 免疫检查点: CD274(PD-L1)-PDCD1(PD-1)、CD80/CD86-CTLA4、LGALS9-HAVCR2(TIM-3)
  • Notch信号: DLL1/3/4-NOTCH1/2/3/4、JAG1/2-NOTCH1/2
  • Wnt信号: WNT配体-FZD受体
  • 黏附: CDH1-CDH1(同型)、ITGA/B整合素-ECM
  • Hedgehog: SHH-PTCH1
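
As a rough sketch of how the pair list above could be checked against per-domain gene lists: the pairs below are a subset of the list in this section written as gene symbols, and the input layout (a dict mapping domain name to gene list) and function name are assumptions for illustration. A match only means both partners appear in the gene lists; it says nothing about spatial proximity.

```python
from typing import Dict, List, Set, Tuple

# Subset of the ligand-receptor pairs listed above, as (ligand, receptor).
LR_PAIRS: List[Tuple[str, str]] = [
    ("CXCL12", "CXCR4"),
    ("CCL2", "CCR2"),
    ("CXCL10", "CXCR3"),
    ("CD274", "PDCD1"),
    ("LGALS9", "HAVCR2"),
    ("HGF", "MET"),
    ("SHH", "PTCH1"),
]


def find_lr_pairs(domain_genes: Dict[str, List[str]]) -> List[dict]:
    """List known pairs whose ligand and receptor both occur in the gene lists."""
    gene_to_domains: Dict[str, Set[str]] = {}
    for domain, genes in domain_genes.items():
        for gene in genes:
            gene_to_domains.setdefault(gene.upper(), set()).add(domain)

    found = []
    for ligand, receptor in LR_PAIRS:
        for lig_dom in sorted(gene_to_domains.get(ligand, ())):
            for rec_dom in sorted(gene_to_domains.get(receptor, ())):
                found.append({
                    "ligand": ligand,
                    "receptor": receptor,
                    "ligand_domain": lig_dom,
                    "receptor_domain": rec_dom,
                    # Same domain -> intra-domain; otherwise candidate cross-domain signaling.
                    "type": "intra-domain" if lig_dom == rec_dom else "inter-domain",
                })
    return found


if __name__ == "__main__":
    domains = {
        "tumor": ["CD274", "CXCL12"],
        "immune": ["PDCD1", "CXCR4", "CXCL10"],
    }
    for hit in find_lr_pairs(domains):
        print(hit)
```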

Workflow

工作流程

  1. Run STRING_get_interaction_partners on all SVGs
    • Keep only interactions with score > 0.7 (high confidence)
    • Identify hub genes (most connections)
  2. Check for known ligand-receptor pairs in gene list
    • Cross-reference with spatial domain assignments
    • Identify potential cross-domain signaling
  3. Build interaction network:
    • Intra-domain interactions (within same spatial region)
    • Inter-domain interactions (between different regions)
    • Identify signaling axes (e.g., tumor-stroma, immune-tumor)
  4. Map interactions to Reactome signaling pathways

  1. 对所有SVG运行 STRING_get_interaction_partners
    • 仅保留评分>0.7的互作(高置信度)
    • 识别枢纽基因(连接数最多的基因)
  2. 检查基因列表中的已知配体-受体对
    • 结合空间区域分配进行交叉参考
    • 识别潜在的跨区域信号传导
  3. 构建互作网络:
    • 区域内互作(同一空间区域内)
    • 区域间互作(不同区域之间)
    • 识别信号轴(如肿瘤-基质、免疫-肿瘤)
  4. 将互作映射到Reactome信号通路
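
Step 1 of this workflow (score filtering and hub detection) could look like the sketch below. It assumes the rows have the shape documented for STRING_get_interaction_partners (preferredName_A, preferredName_B, score); the function name and the choice to count each undirected edge once are illustrative decisions, not part of the tool.

```python
from collections import Counter
from typing import Dict, Iterable, List, Tuple


def hub_genes(rows: Iterable[Dict], min_score: float = 0.7,
              top_n: int = 5) -> List[Tuple[str, int]]:
    """Rank genes by number of distinct high-confidence interaction partners."""
    degree: Counter = Counter()
    seen_edges = set()
    for row in rows:
        if float(row.get("score", 0)) <= min_score:
            continue  # drop low-confidence interactions
        a, b = row["preferredName_A"], row["preferredName_B"]
        edge = frozenset((a, b))
        if a == b or edge in seen_edges:
            continue  # skip self-loops and A-B / B-A duplicates
        seen_edges.add(edge)
        degree[a] += 1
        degree[b] += 1
    return degree.most_common(top_n)


if __name__ == "__main__":
    # Hand-written rows for illustration only; scores are not real STRING values.
    example = [
        {"preferredName_A": "EGFR", "preferredName_B": "ERBB2", "score": 0.90},
        {"preferredName_A": "EGFR", "preferredName_B": "GRB2", "score": 0.95},
        {"preferredName_A": "ERBB2", "preferredName_B": "EGFR", "score": 0.90},
        {"preferredName_A": "EGFR", "preferredName_B": "TP53", "score": 0.50},
    ]
    print(hub_genes(example))
```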

Phase 5: Disease & Therapeutic Context

阶段5:疾病与治疗背景

Objective: Connect spatial findings to disease mechanisms and identify druggable targets.
目标: 将空间发现与疾病机制关联,识别可成药靶点。

Tools Used

使用的工具

OpenTargets_get_associated_targets_by_disease_efoId (disease genes):
  • Input: efoId (string), size (int)
  • Output: {data: {disease: {associatedTargets: {count, rows: [{target: {id, approvedSymbol}, score}]}}}}
  • Use: Get disease-associated genes, overlap with SVGs
OpenTargets_get_target_tractability_by_ensemblID (druggability):
  • Input: ensemblId (string)
  • Output: Tractability data (small molecule, antibody, other modalities)
  • Use: Assess if spatial targets are druggable
OpenTargets_get_associated_drugs_by_target_ensemblID (drugs for target):
  • Input: ensemblId (string), size (int)
  • Output: Drug data for the target
  • Use: Find approved/clinical drugs targeting spatial genes
OpenTargets_get_drug_mechanisms_of_action_by_chemblId (drug mechanism):
  • Input: chemblId (string)
  • Output: Mechanism of action data
  • Use: Understand how drugs act on spatial targets
OpenTargets_target_disease_evidence (evidence linking target to disease):
  • Input: ensemblId (string), efoId (string)
  • Output: Evidence items linking target to disease
  • Use: Specific evidence for each spatial gene in disease
clinical_trials_search (clinical trials):
  • Input: action="search_studies", condition (string), intervention (string), limit (int)
  • Output: {total_count, studies: [{nctId, title, status, conditions}]}
  • Use: Find clinical trials for spatial targets
  • NOTE: action MUST be "search_studies"
DGIdb_get_gene_druggability (druggability categories):
  • Input: genes (array of strings)
  • Output: {data: {genes: {nodes: [{name, geneCategories: [{name}]}]}}}
  • Use: Classify genes as druggable, kinase, GPCR, etc.
civic_search_genes (CIViC cancer evidence, if cancer):
  • Input: (no filter by name)
  • Output: Gene list from CIViC
  • Use: Check if SVGs have CIViC clinical evidence
OpenTargets_get_associated_targets_by_disease_efoId(疾病基因):
  • 输入: efoId (字符串), size (整数)
  • 输出: {data: {disease: {associatedTargets: {count, rows: [{target: {id, approvedSymbol}, score}]}}}}
  • 用途: 获取疾病相关基因,与SVG取交集
OpenTargets_get_target_tractability_by_ensemblID(成药性):
  • 输入: ensemblId (字符串)
  • 输出: 成药性数据(小分子、抗体、其他模态)
  • 用途: 评估空间靶点的成药性
OpenTargets_get_associated_drugs_by_target_ensemblID(靶点相关药物):
  • 输入: ensemblId (字符串), size (整数)
  • 输出: 靶点相关药物数据
  • 用途: 找到针对空间基因的已获批/临床阶段药物
OpenTargets_get_drug_mechanisms_of_action_by_chemblId(药物机制):
  • 输入: chemblId (字符串)
  • 输出: 作用机制数据
  • 用途: 理解药物作用于空间靶点的机制
OpenTargets_target_disease_evidence(靶点与疾病关联的证据):
  • 输入: ensemblId (字符串), efoId (字符串)
  • 输出: 连接靶点与疾病的证据条目
  • 用途: 获取每个空间基因在疾病中的具体证据
clinical_trials_search(临床试验):
  • 输入: action="search_studies", condition (字符串), intervention (字符串), limit (整数)
  • 输出: {total_count, studies: [{nctId, title, status, conditions}]}
  • 用途: 查找针对空间靶点的临床试验
  • 注意: action 必须设为 "search_studies"
DGIdb_get_gene_druggability(成药性分类):
  • 输入: genes (字符串数组)
  • 输出: {data: {genes: {nodes: [{name, geneCategories: [{name}]}]}}}
  • 用途: 将基因分类为可成药、激酶、GPCR等类型
civic_search_genes(CIViC癌症证据,若为癌症):
  • 输入: (无名称过滤)
  • 输出: CIViC中的基因列表
  • 用途: 检查SVG是否有CIViC临床证据

Workflow

工作流程

  1. Disease gene overlap (if disease context provided): a. Get disease-associated targets from OpenTargets b. Intersect with SVGs c. For overlapping genes, get specific evidence
  2. Druggable target identification: a. Run DGIdb_get_gene_druggability on all SVGs b. For druggable genes, check OpenTargets tractability c. Get approved drugs for druggable spatial targets
  3. Clinical trials: a. Search for trials targeting spatial genes in the disease context b. Prioritize trials for genes in disease-enriched spatial domains
  4. Cancer-specific (if cancer): a. Check CIViC for clinical evidence b. Get mutation prevalence from cBioPortal (if specific mutations known) c. Check immune checkpoint genes in spatial data

  1. 疾病基因交集(若提供疾病背景): a. 从OpenTargets获取疾病相关靶点 b. 与SVG取交集 c. 对交集基因获取具体证据
  2. 可成药靶点识别: a. 对所有SVG运行 DGIdb_get_gene_druggability b. 对可成药基因,检查OpenTargets成药性 c. 获取针对可成药空间靶点的已获批药物
  3. 临床试验: a. 在疾病背景下搜索针对空间基因的试验 b. 优先关注疾病富集区域中基因的试验
  4. 癌症特异性分析(若为癌症): a. 检查CIViC中的临床证据 b. 从cBioPortal获取突变频率(若已知特定突变) c. 检查空间数据中的免疫检查点基因
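
Step 1 (disease gene overlap) can be sketched as follows, assuming the response has the nested shape documented for OpenTargets_get_associated_targets_by_disease_efoId. The function name, the min_score parameter, and the scores in the example are assumptions for illustration; the example response is hand-written, not real Open Targets output.

```python
from typing import Dict, Iterable, List, Optional


def disease_svg_overlap(response: Optional[Dict], svgs: Iterable[str],
                        min_score: float = 0.0) -> List[Dict]:
    """Intersect disease-associated targets with SVGs, highest score first."""
    data = (response or {}).get("data") or {}
    disease = data.get("disease") or {}
    rows = (disease.get("associatedTargets") or {}).get("rows") or []

    svg_set = {g.upper() for g in svgs}
    hits = []
    for row in rows:
        target = row.get("target") or {}
        symbol = target.get("approvedSymbol", "")
        score = row.get("score", 0.0)
        if symbol.upper() in svg_set and score >= min_score:
            hits.append({"symbol": symbol, "ensembl_id": target.get("id"), "score": score})
    return sorted(hits, key=lambda h: h["score"], reverse=True)


if __name__ == "__main__":
    # Illustrative response only; the scores are made up.
    mock = {"data": {"disease": {"associatedTargets": {"count": 3, "rows": [
        {"target": {"id": "ENSG00000141510", "approvedSymbol": "TP53"}, "score": 0.6},
        {"target": {"id": "ENSG00000141736", "approvedSymbol": "ERBB2"}, "score": 0.8},
        {"target": {"id": "ENSG00000091831", "approvedSymbol": "ESR1"}, "score": 0.7},
    ]}}}}
    print(disease_svg_overlap(mock, ["ERBB2", "TP53", "VIM"]))
```

The returned Ensembl IDs can then feed the per-gene evidence and tractability lookups in steps 1c and 2b.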

Phase 6: Multi-Modal Integration

阶段6:多模态整合

Objective: Integrate protein, RNA, and metabolite spatial data when available.
目标: 若有可用数据,整合蛋白质、RNA和代谢物空间数据。

Tools Used

使用的工具

HPA_get_subcellular_location (protein localization):
  • Input: gene_name (string)
  • Output: {gene_name, main_locations, additional_locations, location_summary}
  • Use: Compare mRNA spatial pattern with protein subcellular location
HPA_get_rna_expression_in_specific_tissues (tissue RNA):
  • Input: ensembl_id (string), tissue_name (string)
  • Output: Expression data for specific tissue
  • Use: Validate spatial expression against bulk tissue data
Reactome_map_uniprot_to_pathways (metabolic pathways):
  • Input: id (string) - UniProt accession
  • Output: List of pathways
  • Use: Map genes to metabolic pathways for metabolomics integration
kegg_get_pathway_info (KEGG pathway details):
  • Input: pathway_id (string) - KEGG pathway ID
  • Output: Pathway information including metabolites
  • Use: Link spatial genes to metabolic pathways and metabolites
HPA_get_subcellular_location(蛋白质定位):
  • 输入: gene_name (字符串)
  • 输出: {gene_name, main_locations, additional_locations, location_summary}
  • 用途: 比较mRNA空间模式与蛋白质亚细胞定位
HPA_get_rna_expression_in_specific_tissues(组织RNA表达):
  • 输入: ensembl_id (字符串), tissue_name (字符串)
  • 输出: 特定组织的表达数据
  • 用途: 验证空间表达与bulk组织数据的一致性
Reactome_map_uniprot_to_pathways(代谢通路):
  • 输入: id (字符串)- UniProt登录号
  • 输出: 通路列表
  • 用途: 将基因映射到代谢通路以进行代谢组整合
kegg_get_pathway_info(KEGG通路详情):
  • 输入: pathway_id (字符串)- KEGG通路ID
  • 输出: 包含代谢物的通路信息
  • 用途: 将空间基因与代谢通路和代谢物关联

Workflow

工作流程

  1. RNA-Protein concordance (if protein data provided): a. For each gene with both RNA and protein data:
    • Compare spatial RNA pattern with protein detection
    • Check HPA for known post-transcriptional regulation
    • Note concordant (expected) vs discordant (interesting) patterns
  2. Subcellular context: a. Map spatial RNA localization to protein subcellular location (HPA) b. Secreted proteins -> likely paracrine signaling c. Membrane proteins -> cell surface markers d. Nuclear proteins -> transcription factors
  3. Metabolic integration (if metabolomics available): a. Map genes to metabolic pathways (Reactome, KEGG) b. Link detected metabolites to enzyme-encoding genes c. Identify spatial metabolic heterogeneity d. Check for known metabolic zonation patterns

  1. RNA-蛋白质一致性(若有蛋白质数据): a. 对同时有RNA和蛋白质数据的每个基因:
    • 比较RNA空间模式与蛋白质检测结果
    • 检查HPA中已知的转录后调控
    • 记录一致(预期)与不一致(值得关注)的模式
  2. 亚细胞背景: a. 将RNA空间定位与蛋白质亚细胞定位(HPA)关联 b. 分泌蛋白 -> 可能为旁分泌信号 c. 膜蛋白 -> 细胞表面标记 d. 核蛋白 -> 转录因子
  3. 代谢整合(若有代谢组数据): a. 将基因映射到代谢通路(Reactome、KEGG) b. 将检测到的代谢物与编码酶的基因关联 c. 识别空间代谢异质性 d. 检查已知的代谢分区模式
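
The subcellular-context step (step 2) amounts to a keyword rule table. The sketch below assumes a record shaped like the documented HPA_get_subcellular_location output; the exact location strings HPA returns (for example "Plasma membrane", "Nucleoplasm") and the keyword rules are assumptions, so treat the matches as hints rather than annotations.

```python
from typing import Dict, List, Tuple

# (keyword in lower case, interpretation) following the workflow's three rules.
# Keyword choices are assumptions about how HPA names locations.
ROLE_RULES: List[Tuple[str, str]] = [
    ("secreted", "likely paracrine signaling"),
    ("plasma membrane", "cell surface marker"),
    ("nucle", "candidate transcription factor"),
]


def interpret_location(hpa_record: Dict) -> List[str]:
    """Translate HPA location strings into the workflow's functional hints."""
    locations = list(hpa_record.get("main_locations") or [])
    locations += list(hpa_record.get("additional_locations") or [])
    notes: List[str] = []
    for location in locations:
        lowered = location.lower()
        for keyword, role in ROLE_RULES:
            if keyword in lowered and role not in notes:
                notes.append(role)
    return notes or ["no rule matched; interpret manually"]


if __name__ == "__main__":
    record = {"gene_name": "EGFR", "main_locations": ["Plasma membrane"],
              "additional_locations": []}
    print(record["gene_name"], interpret_location(record))
```

Nuclear localization alone does not establish transcription factor activity, which is why the label is worded as a candidate.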

Phase 7: Immune Microenvironment (Cancer/Inflammation)

阶段7:免疫微环境(癌症/炎症)

Objective: Characterize immune cell composition and checkpoint expression in spatial context.
目标: 在空间背景下表征免疫细胞组成与检查点表达。

Conditions for Activation

激活条件

Only execute if at least one of the following holds:
  • Disease context is cancer, autoimmune, or inflammatory
  • SVGs include immune markers (CD3E, CD8A, CD68, CD163, etc.)
  • User specifically asks about immune patterns
仅在满足以下至少一项时执行:
  • 疾病背景为癌症、自身免疫病或炎症
  • SVG包含免疫标记(CD3E、CD8A、CD68、CD163等)
  • 用户专门询问免疫模式

Tools Used

使用的工具

STRING_functional_enrichment (immune pathway enrichment):
  • Applied to immune-relevant SVGs
  • Filter for immune-related GO terms and pathways
OpenTargets_get_target_tractability_by_ensemblID (checkpoint druggability):
  • Applied to immune checkpoint genes
  • Check for approved immunotherapies
iedb_search_epitopes (epitope data):
  • Input: organism_name (string), source_antigen_name (string)
  • Output: {status, data, count}
  • Use: Check if spatial antigens have known epitopes
STRING_functional_enrichment(免疫通路富集):
  • 应用于免疫相关SVG
  • 过滤免疫相关GO术语与通路
OpenTargets_get_target_tractability_by_ensemblID(检查点成药性):
  • 应用于免疫检查点基因
  • 检查已获批的免疫疗法
iedb_search_epitopes(表位数据):
  • 输入: organism_name (字符串), source_antigen_name (字符串)
  • 输出: {status, data, count}
  • 用途: 检查空间抗原是否有已知表位

Immune Cell Markers Reference

免疫细胞标记参考

| Cell Type | Key Markers | Extended Markers |
| --- | --- | --- |
| CD8+ T cell | CD8A, CD8B | GZMA, GZMB, PRF1, IFNG |
| CD4+ T cell | CD4 | IL2, IL4, IL17A, FOXP3 (Treg) |
| Regulatory T cell | FOXP3, IL2RA | CTLA4, TIGIT |
| B cell | CD19, MS4A1, CD79A | IGHG1, IGHM |
| Plasma cell | SDC1 (CD138), XBP1 | IGHG1, MZB1 |
| M1 Macrophage | CD68, NOS2, TNF | IL1B, CXCL10 |
| M2 Macrophage | CD68, CD163, MRC1 | ARG1, IL10 |
| Dendritic cell | ITGAX (CD11c), HLA-DRA | CD80, CD86 |
| NK cell | NCAM1 (CD56), NKG7 | GNLY, KLRD1 |
| Neutrophil | FCGR3B, CXCR2 | S100A8, S100A9 |
| Mast cell | KIT, TPSAB1 | CPA3, HDC |

| 细胞类型 | 关键标记 | 扩展标记 |
| --- | --- | --- |
| CD8+ T细胞 | CD8A、CD8B | GZMA、GZMB、PRF1、IFNG |
| CD4+ T细胞 | CD4 | IL2、IL4、IL17A、FOXP3(Treg) |
| 调节性T细胞 | FOXP3、IL2RA | CTLA4、TIGIT |
| B细胞 | CD19、MS4A1、CD79A | IGHG1、IGHM |
| 浆细胞 | SDC1(CD138)、XBP1 | IGHG1、MZB1 |
| M1巨噬细胞 | CD68、NOS2、TNF | IL1B、CXCL10 |
| M2巨噬细胞 | CD68、CD163、MRC1 | ARG1、IL10 |
| 树突状细胞 | ITGAX(CD11c)、HLA-DRA | CD80、CD86 |
| NK细胞 | NCAM1(CD56)、NKG7 | GNLY、KLRD1 |
| 中性粒细胞 | FCGR3B、CXCR2 | S100A8、S100A9 |
| 肥大细胞 | KIT、TPSAB1 | CPA3、HDC |

Immune Checkpoint Reference

免疫检查点参考

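| Checkpoint | Gene | Ligand | Therapeutic Antibody |
| --- | --- | --- | --- |
| PD-1/PD-L1 | PDCD1/CD274 | CD274, PDCD1LG2 | Pembrolizumab, Nivolumab, Atezolizumab |
| CTLA-4 | CTLA4 | CD80, CD86 | Ipilimumab |
| TIM-3 | HAVCR2 | LGALS9 | Sabatolimab |
| LAG-3 | LAG3 | HLA class II | Relatlimab |
| TIGIT | TIGIT | PVR, NECTIN2 (formerly PVRL2) | Tiragolumab |
| VISTA | VSIR | SELPLG (PSGL-1) | - |

| 检查点 | 基因 | 配体 | 治疗性抗体 |
| --- | --- | --- | --- |
| PD-1/PD-L1 | PDCD1/CD274 | CD274、PDCD1LG2 | Pembrolizumab、Nivolumab、Atezolizumab |
| CTLA-4 | CTLA4 | CD80、CD86 | Ipilimumab |
| TIM-3 | HAVCR2 | LGALS9 | Sabatolimab |
| LAG-3 | LAG3 | HLA II类分子 | Relatlimab |
| TIGIT | TIGIT | PVR、NECTIN2(旧称PVRL2) | Tiragolumab |
| VISTA | VSIR | SELPLG(PSGL-1) | - |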

Workflow

工作流程

  1. Identify immune-related SVGs from marker reference
  2. Classify immune cell types present per spatial domain
  3. Check immune checkpoint expression
  4. Assess immune infiltration patterns:
    • Hot (T cell infiltrated) vs Cold (immune desert) vs Excluded
  5. Identify potential immunotherapy targets
  6. Check for tertiary lymphoid structures (B cell + T cell clusters)

  1. 从标记参考中识别免疫相关SVG
  2. 分类每个空间区域中存在的免疫细胞类型
  3. 检查免疫检查点表达
  4. 评估免疫浸润模式:
    • 热区(T细胞浸润)、冷区(免疫荒漠)、排斥区
  5. 识别潜在的免疫治疗靶点
  6. 检查三级淋巴结构(B细胞+T细胞簇)
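
Steps 1 to 3 and 6 can be approximated from gene lists alone, as in the sketch below. The marker sets are the "Key Markers" column of the immune cell reference and the checkpoint genes come from the checkpoint reference; the function name, output layout, and the rule "any T cell call plus a B cell call flags a possible tertiary lymphoid structure" are assumptions for illustration. Co-occurrence in a gene list is only a hint; confirming a TLS needs spatial or histological evidence.

```python
from typing import Dict, List, Set

# "Key Markers" column of the immune cell marker reference.
IMMUNE_KEY_MARKERS: Dict[str, Set[str]] = {
    "CD8+ T cell": {"CD8A", "CD8B"},
    "CD4+ T cell": {"CD4"},
    "Regulatory T cell": {"FOXP3", "IL2RA"},
    "B cell": {"CD19", "MS4A1", "CD79A"},
    "Plasma cell": {"SDC1", "XBP1"},
    "M1 Macrophage": {"CD68", "NOS2", "TNF"},
    "M2 Macrophage": {"CD68", "CD163", "MRC1"},
    "Dendritic cell": {"ITGAX", "HLA-DRA"},
    "NK cell": {"NCAM1", "NKG7"},
    "Neutrophil": {"FCGR3B", "CXCR2"},
    "Mast cell": {"KIT", "TPSAB1"},
}
# Gene column of the immune checkpoint reference.
CHECKPOINT_GENES: Set[str] = {"PDCD1", "CD274", "CTLA4", "HAVCR2", "LAG3", "TIGIT", "VSIR"}


def summarize_immune(domain_genes: Dict[str, List[str]]) -> Dict[str, Dict]:
    """Per domain: immune cell types hinted by markers, checkpoints, TLS flag."""
    summary = {}
    for domain, genes in domain_genes.items():
        gene_set = {g.upper() for g in genes}
        cell_types = sorted(ct for ct, markers in IMMUNE_KEY_MARKERS.items()
                            if gene_set & markers)
        has_t = any("T cell" in ct for ct in cell_types)
        has_b = "B cell" in cell_types
        summary[domain] = {
            "cell_types": cell_types,
            "checkpoints": sorted(gene_set & CHECKPOINT_GENES),
            "possible_tls": has_t and has_b,  # hint only, see lead-in
        }
    return summary


if __name__ == "__main__":
    print(summarize_immune({"margin": ["CD8A", "MS4A1", "PDCD1"], "core": ["EPCAM"]}))
```

Because CD68 is shared by the M1 and M2 sets, a domain with CD68 alone is reported under both; the extended markers are needed to separate them.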

Phase 8: Literature & Validation Context

阶段8:文献与验证背景

Objective: Provide literature evidence for spatial findings and suggest validation experiments.
目标: 为空间发现提供文献证据,建议实验验证方法。

Tools Used

使用的工具

PubMed_search_articles (literature search):
  • Input: query (string), max_results (int)
  • Output: List of [{pmid, title, authors, journal, pub_date, doi}]
  • Use: Find published evidence for spatial patterns
openalex_literature_search (broader literature):
  • Input: query (string), per_page (int)
  • Output: List of works with titles, DOIs, abstracts
  • Use: Complement PubMed with preprints and broader coverage
PubMed_search_articles(文献检索):
  • 输入: query (字符串), max_results (整数)
  • 输出: [{pmid, title, authors, journal, pub_date, doi}] 列表
  • 用途: 查找空间模式的已发表证据
openalex_literature_search(更广泛的文献):
  • 输入: query (字符串), per_page (整数)
  • 输出: 包含标题、DOI、摘要的文献列表
  • 用途: 用预印本和更广泛的覆盖范围补充PubMed

Literature Search Strategy

文献检索策略

  1. Tissue + spatial: "{tissue} spatial transcriptomics" - e.g., "liver spatial transcriptomics"
  2. Disease + spatial: "{disease} spatial omics" - e.g., "breast cancer spatial transcriptomics"
  3. Gene + tissue: "{top_gene} {tissue} expression" for key SVGs
  4. Zonation (if relevant): "{tissue} zonation gene expression"
  5. Technology: "{technology} {tissue}" - e.g., "Visium breast cancer"
  1. 组织+空间: "{tissue} spatial transcriptomics" - 例如"liver spatial transcriptomics"
  2. 疾病+空间: "{disease} spatial omics" - 例如"breast cancer spatial transcriptomics"
  3. 基因+组织: 针对关键SVG使用 "{top_gene} {tissue} expression"
  4. 分区(如适用): "{tissue} zonation gene expression"
  5. 技术: "{technology} {tissue}" - 例如"Visium breast cancer"
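
The five templates can be expanded mechanically before being passed as the query argument of the literature tools. The helper below is a hypothetical sketch (its name and parameters are not part of ToolUniverse); queries stay in English, in line with the English-first principle.

```python
from typing import List, Optional, Sequence


def build_queries(tissue: str, disease: Optional[str] = None,
                  top_genes: Sequence[str] = (), technology: Optional[str] = None,
                  zonation: bool = False) -> List[str]:
    """Expand the search templates in the order they are listed."""
    queries = [f"{tissue} spatial transcriptomics"]
    if disease:
        queries.append(f"{disease} spatial omics")
    queries.extend(f"{gene} {tissue} expression" for gene in top_genes)
    if zonation:
        queries.append(f"{tissue} zonation gene expression")
    if technology:
        queries.append(f"{technology} {tissue}")
    return queries


if __name__ == "__main__":
    for query in build_queries("liver", top_genes=["CYP3A4"],
                               technology="Visium", zonation=True):
        print(query)
```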

Validation Recommendations Template

验证建议模板

| Priority | Target | Method | Rationale | Feasibility |
| --- | --- | --- | --- | --- |
| High | Key SVG | smFISH / RNAscope | Validate spatial pattern at single-molecule level | Medium |
| High | Druggable target | IHC on serial sections | Confirm protein expression in spatial domain | High |
| High | Ligand-receptor pair | Proximity ligation assay (PLA) | Confirm physical interaction at tissue level | Medium |
| Medium | Domain markers | Multiplexed IF (CODEX/IBEX) | Validate multiple markers simultaneously | Low-Medium |
| Medium | Pathway | Spatial metabolomics (MALDI/DESI) | Confirm metabolic pathway activity | Low |
| Low | Novel interaction | Co-culture + conditioned media | Functional validation of predicted interaction | Medium |

| 优先级 | 靶点 | 方法 | 理由 | 可行性 |
| --- | --- | --- | --- | --- |
| 高 | 关键SVG | smFISH / RNAscope | 在单分子水平验证空间模式 | 中等 |
| 高 | 可成药靶点 | 连续切片IHC | 确认蛋白质在空间区域中的表达 | 高 |
| 高 | 配体-受体对 | 邻近连接实验(PLA) | 在组织水平确认物理互作 | 中等 |
| 中 | 区域标记 | 多重免疫荧光(CODEX/IBEX) | 同时验证多个标记 | 低-中等 |
| 中 | 通路 | 空间代谢组学(MALDI/DESI) | 确认代谢通路活性 | 低 |
| 低 | 新型互作 | 共培养+条件培养基 | 功能验证预测的互作 | 中等 |

Workflow

工作流程

  1. Search PubMed for tissue + disease + spatial transcriptomics
  2. Search for known spatial patterns in the tissue type
  3. Cross-reference findings with published spatial atlas data
  4. Generate validation recommendations based on:
    • Novelty of finding (novel patterns need more validation)
    • Clinical relevance (druggable targets prioritized)
    • Technical feasibility
  5. Cite relevant methodology papers for each validation approach

  1. 检索组织+疾病+空间转录组相关的PubMed文献
  2. 检索该组织类型的已知空间模式
  3. 将发现与已发表的空间图谱数据交叉参考
  4. 根据以下因素生成验证建议:
    • 发现的新颖性(新型模式需要更多验证)
    • 临床相关性(可成药靶点优先)
    • 技术可行性
  5. 为每种验证方法引用相关的方法学文献

Tool Parameter Reference (CRITICAL)

工具参数参考(至关重要)

Verified Parameter Names

已验证的参数名称

| Tool | Parameter | CORRECT | Common MISTAKE | Notes |
| --- | --- | --- | --- | --- |
| MyGene_query_genes | query | query | q | Filter results by symbol field |
| STRING_functional_enrichment | identifiers | protein_ids (array) | identifiers | Also needs species=9606 |
| STRING_get_interaction_partners | identifiers | protein_ids (array) | identifiers | limit, confidence_score optional |
| ReactomeAnalysis_pathway_enrichment | genes | identifiers (string) | Array | SPACE-SEPARATED string, NOT array |
| HPA_get_subcellular_location | gene | gene_name | ensembl_id | Uses gene symbol |
| HPA_get_cancer_prognostics_by_gene | gene | ensembl_id | gene_name | Uses Ensembl ID, NOT symbol |
| HPA_get_rna_expression_by_source | params | gene_name, source_type, source_name | - | ALL 3 required |
| HPA_get_rna_expression_in_specific_tissues | gene | ensembl_id | gene_name | Uses Ensembl ID |
| OpenTargets_get_target_tractability_by_ensemblID | target | ensemblId | ensemblID | camelCase |
| OpenTargets_get_associated_drugs_by_target_ensemblID | target | ensemblId, size | - | Both REQUIRED |
| OpenTargets_get_associated_targets_by_disease_efoId | disease | efoId | diseaseId | Returns {data: {disease: {associatedTargets}}} |
| DGIdb_get_gene_druggability | genes | genes (array) | gene_name | Array of strings |
| DGIdb_get_drug_gene_interactions | genes | genes (array) | gene_name | Array of strings |
| clinical_trials_search | action | action='search_studies' | Missing action | action is REQUIRED |
| ensembl_lookup_gene | species | species='homo_sapiens' | No species | REQUIRED parameter |
| GTEx tools | operation | operation (SOAP) | Missing | All GTEx tools need operation parameter |
| HPA_get_comprehensive_gene_details_by_ensembl_id | all params | ALL 5 required: ensembl_id, include_isoforms, include_images, include_antibodies, include_expression | Missing booleans | Set booleans to False except expression |
| GTEx tools | gencode | gencode_id (array) | gene_id | Requires versioned GENCODE ID |
| 工具 | 参数 | 正确名称 | 常见错误 | 说明 |
| --- | --- | --- | --- | --- |
| MyGene_query_genes | query | query | q | 按 symbol 字段过滤结果 |
| STRING_functional_enrichment | identifiers | protein_ids (数组) | identifiers | 还需要 species=9606 |
| STRING_get_interaction_partners | identifiers | protein_ids (数组) | identifiers | limit、confidence_score 为可选参数 |
| ReactomeAnalysis_pathway_enrichment | genes | identifiers (字符串) | 数组 | 空格分隔的字符串,而非数组 |
| HPA_get_subcellular_location | gene | gene_name | ensembl_id | 使用基因符号 |
| HPA_get_cancer_prognostics_by_gene | gene | ensembl_id | gene_name | 使用Ensembl ID,而非符号 |
| HPA_get_rna_expression_by_source | params | gene_name, source_type, source_name | - | 所有3个参数必填 |
| HPA_get_rna_expression_in_specific_tissues | gene | ensembl_id | gene_name | 使用Ensembl ID |
| OpenTargets_get_target_tractability_by_ensemblID | target | ensemblId | ensemblID | 小驼峰命名 |
| OpenTargets_get_associated_drugs_by_target_ensemblID | target | ensemblId, size | - | 两者均为必填项 |
| OpenTargets_get_associated_targets_by_disease_efoId | disease | efoId | diseaseId | 返回 {data: {disease: {associatedTargets}}} |
| DGIdb_get_gene_druggability | genes | genes (数组) | gene_name | 字符串数组 |
| DGIdb_get_drug_gene_interactions | genes | genes (数组) | gene_name | 字符串数组 |
| clinical_trials_search | action | action='search_studies' | 缺少action | action 为必填项 |
| ensembl_lookup_gene | species | species='homo_sapiens' | 无species参数 | 必填参数 |
| GTEx工具 | operation | operation (SOAP) | 缺少 | 所有GTEx工具都需要 operation 参数 |
| HPA_get_comprehensive_gene_details_by_ensembl_id | 所有参数 | 5个参数均必填: ensembl_id, include_isoforms, include_images, include_antibodies, include_expression | 缺少布尔值参数 | 除expression外,其他布尔值设为False |
| GTEx工具 | gencode | gencode_id (数组) | gene_id | 需要带版本的GENCODE ID |
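
To make the most error-prone rows of the table concrete, the sketch below builds argument dictionaries with the verified parameter names. The helper function names are hypothetical; only the dictionary keys and value shapes come from the table. The dictionaries are meant to be passed as the arguments of the corresponding tool call.

```python
from typing import Dict, Optional, Sequence


def string_enrichment_args(genes: Sequence[str], species: int = 9606) -> Dict:
    """STRING tools take protein_ids as an array, plus species."""
    return {"protein_ids": list(genes), "species": species}


def reactome_enrichment_args(genes: Sequence[str]) -> Dict:
    """Reactome enrichment takes identifiers as ONE space-separated string."""
    return {"identifiers": " ".join(genes)}


def clinical_trials_args(condition: str, intervention: Optional[str] = None,
                         limit: int = 10) -> Dict:
    """clinical_trials_search always needs action='search_studies'."""
    args = {"action": "search_studies", "condition": condition, "limit": limit}
    if intervention:
        args["intervention"] = intervention
    return args


def hpa_comprehensive_args(ensembl_id: str) -> Dict:
    """All five parameters are required; only expression is switched on."""
    return {
        "ensembl_id": ensembl_id,
        "include_isoforms": False,
        "include_images": False,
        "include_antibodies": False,
        "include_expression": True,
    }


if __name__ == "__main__":
    genes = ["EGFR", "ERBB2", "TP53"]
    print(string_enrichment_args(genes))
    print(reactome_enrichment_args(genes))
    print(clinical_trials_args("breast cancer", intervention="trastuzumab"))
```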

Response Format Reference

响应格式参考

| Tool | Response Format | Key Fields |
| --- | --- | --- |
| STRING_functional_enrichment | {status, data: [{category, term, description, p_value, fdr, inputGenes}]} | Filter by FDR < 0.05 |
| ReactomeAnalysis_pathway_enrichment | {data: {pathways: [{pathway_id, name, p_value, fdr, entities_found, entities_total}]}} | Top 20 returned |
| STRING_get_interaction_partners | {status, data: [{preferredName_A, preferredName_B, score}]} | Score > 0.7 for high confidence |
| MyGene_query_genes | {hits: [{_id, symbol, name, ensembl: {gene}, entrezgene}]} | Filter by exact symbol match |
| HPA_get_subcellular_location | {gene_name, main_locations: [], additional_locations: [], location_summary} | Direct dict response |
| OpenTargets_get_target_tractability_by_ensemblID | {data: {target: {id, tractability: [{label, modality, value}]}}} | Check value=true |
| DGIdb_get_gene_druggability | {data: {genes: {nodes: [{name, geneCategories: [{name}]}]}}} | GraphQL response |
| PubMed_search_articles | Plain list of [{pmid, title, authors, journal, pub_date}] | No data wrapper |
| clinical_trials_search | {total_count, studies: [{nctId, title, status, conditions}]} | total_count can be None |

| 工具 | 响应格式 | 关键字段 |
| --- | --- | --- |
| STRING_functional_enrichment | {status, data: [{category, term, description, p_value, fdr, inputGenes}]} | 按FDR < 0.05过滤 |
| ReactomeAnalysis_pathway_enrichment | {data: {pathways: [{pathway_id, name, p_value, fdr, entities_found, entities_total}]}} | 返回排名前20的结果 |
| STRING_get_interaction_partners | {status, data: [{preferredName_A, preferredName_B, score}]} | 评分>0.7为高置信度 |
| MyGene_query_genes | {hits: [{_id, symbol, name, ensembl: {gene}, entrezgene}]} | 按精确符号匹配过滤 |
| HPA_get_subcellular_location | {gene_name, main_locations: [], additional_locations: [], location_summary} | 直接字典响应 |
| OpenTargets_get_target_tractability_by_ensemblID | {data: {target: {id, tractability: [{label, modality, value}]}}} | 检查value=true |
| DGIdb_get_gene_druggability | {data: {genes: {nodes: [{name, geneCategories: [{name}]}]}}} | GraphQL响应 |
| PubMed_search_articles | [{pmid, title, authors, journal, pub_date}] 纯列表 | 无数据包装 |
| clinical_trials_search | {total_count, studies: [{nctId, title, status, conditions}]} | total_count可为空 |
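
Because some tools wrap rows in a dict and others return a plain list, a small set of defensive accessors helps. The sketch below assumes only the response shapes listed in the table; the function names are hypothetical, and the modality codes in the usage example are placeholders rather than confirmed Open Targets values.

```python
from typing import Any, Dict, List


def rows_from(response: Any, key: str = "data") -> List[Dict]:
    """Return the row list whether the tool wrapped it in a dict or not."""
    if isinstance(response, list):
        return response
    if isinstance(response, dict) and isinstance(response.get(key), list):
        return response[key]
    return []


def significant_terms(response: Any, max_fdr: float = 0.05) -> List[Dict]:
    """Keep enrichment rows with FDR below the threshold."""
    return [row for row in rows_from(response)
            if row.get("fdr") is not None and float(row["fdr"]) < max_fdr]


def tractable_modalities(response: Dict) -> List[str]:
    """Modalities with at least one tractability entry whose value is true."""
    target = ((response or {}).get("data") or {}).get("target") or {}
    entries = target.get("tractability") or []
    return sorted({e["modality"] for e in entries if e.get("value") is True})


def trial_count(response: Dict) -> int:
    """total_count can be None, so fall back to counting returned studies."""
    total = response.get("total_count")
    return total if total is not None else len(response.get("studies") or [])


if __name__ == "__main__":
    # Hand-written responses for illustration only.
    enrichment = {"status": "success", "data": [
        {"term": "GO:0006955", "description": "immune response", "fdr": 0.001},
        {"term": "GO:0008150", "description": "biological_process", "fdr": 0.4},
    ]}
    print(significant_terms(enrichment))
    print(trial_count({"total_count": None, "studies": [{"nctId": "NCT00000000"}]}))
```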

Fallback Strategies

备选策略

Pathway Enrichment

通路富集

  • Primary: STRING_functional_enrichment (most comprehensive, one call)
  • Fallback: ReactomeAnalysis_pathway_enrichment (Reactome-specific)
  • Default: Individual gene GO annotations (GO_get_annotations_for_gene)
  • 首选: STRING_functional_enrichment(最全面,一次调用)
  • 备选: ReactomeAnalysis_pathway_enrichment(Reactome特异性)
  • 默认: 单个基因GO注释(GO_get_annotations_for_gene)

Tissue Expression

组织表达

  • Primary: HPA_get_rna_expression_by_source
  • Fallback: HPA_get_comprehensive_gene_details_by_ensembl_id
  • Default: Note "tissue expression data unavailable"
  • 首选: HPA_get_rna_expression_by_source
  • 备选: HPA_get_comprehensive_gene_details_by_ensembl_id
  • 默认: 标注“组织表达数据不可用”

Disease Association

疾病关联

  • Primary: OpenTargets_get_associated_targets_by_disease_efoId
  • Fallback: OpenTargets_target_disease_evidence (per gene)
  • Default: Skip disease section if no disease context
  • 首选: OpenTargets_get_associated_targets_by_disease_efoId
  • 备选: OpenTargets_target_disease_evidence(每个基因)
  • 默认: 若无疾病背景,跳过疾病章节

Drug Information

药物信息

  • Primary: OpenTargets_get_associated_drugs_by_target_ensemblID
  • Fallback: DGIdb_get_drug_gene_interactions
  • Default: Note "no approved drugs identified"
  • 首选: OpenTargets_get_associated_drugs_by_target_ensemblID
  • 备选: DGIdb_get_drug_gene_interactions
  • 默认: 标注“未识别到已获批药物”

Literature

文献

  • Primary: PubMed_search_articles
  • Fallback: openalex_literature_search
  • Default: Note "no spatial-specific literature found"

  • 首选: PubMed_search_articles
  • 备选: openalex_literature_search
  • 默认: 标注“未找到空间特异性文献”
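
Each Primary / Fallback / Default chain above follows the same pattern, which can be sketched generically. The helper below is hypothetical: each step is a zero-argument callable wrapping whatever tool call is appropriate, and treating an exception or an empty result as "try the next step" is an assumption about how failures surface.

```python
from typing import Any, Callable, List, Tuple


def first_available(steps: List[Tuple[str, Callable[[], Any]]],
                    default: Any) -> Tuple[str, Any]:
    """Try each (name, fetch) step in order; return the first non-empty result.

    The returned name records which source produced the data, so the report
    can cite it.
    """
    for name, fetch in steps:
        try:
            result = fetch()
        except Exception:
            continue  # tool error or timeout: move to the next tier
        if result:
            return name, result
    return "default", default


if __name__ == "__main__":
    def primary_stub():
        # Stand-in for the primary tool call; simulates an outage.
        raise TimeoutError("simulated outage")

    def fallback_stub():
        # Stand-in for the fallback tool call; returns placeholder rows.
        return [{"drug": "placeholder"}]

    source, rows = first_available(
        [("primary", primary_stub), ("fallback", fallback_stub)],
        default="no approved drugs identified",
    )
    print(source, rows)
```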

Common Use Cases

常见用例

Use Case 1: Cancer Spatial Heterogeneity

用例1:癌症空间异质性

Input: Visium data from breast cancer with 5 spatial domains (tumor core, tumor margin, stroma, immune infiltrate, normal tissue) and 200 SVGs.
Analysis focus:
  • Tumor-specific pathways (proliferation, DNA repair)
  • Immune infiltration patterns (hot vs cold)
  • Tumor-stroma interactions (CAF signaling)
  • Druggable targets in tumor core
  • Immune checkpoint expression patterns
  • Prognostic genes per domain
输入: 乳腺癌Visium数据,包含5个空间区域(肿瘤核心、肿瘤边缘、基质、免疫浸润区、正常组织)和200个SVG。
分析重点:
  • 肿瘤特异性通路(增殖、DNA修复)
  • 免疫浸润模式(热区vs冷区)
  • 肿瘤-基质互作(CAF信号)
  • 肿瘤核心区的可成药靶点
  • 免疫检查点表达模式
  • 各区域的预后基因

Use Case 2: Brain Tissue Zonation

用例2:脑组织分区

Input: MERFISH data from hippocampus with cell-type specific genes and neuronal subtype markers.
Analysis focus:
  • Neuronal subtype characterization
  • Synaptic signaling pathways
  • Neurotransmitter receptor distribution
  • Known hippocampal zonation patterns (CA1, CA3, DG)
  • Neurodegenerative disease gene overlap
输入: 海马体MERFISH数据,包含细胞类型特异性基因和神经元亚型标记。
分析重点:
  • 神经元亚型表征
  • 突触信号通路
  • 神经递质受体分布
  • 已知海马体分区模式(CA1、CA3、DG)
  • 神经退行性疾病基因交集

Use Case 3: Liver Metabolic Zonation

用例3:肝脏代谢分区

Input: Spatial transcriptomics of liver with periportal vs pericentral gene gradients.
Analysis focus:
  • Metabolic enzyme distribution (CYP450, gluconeogenesis, lipogenesis)
  • Wnt signaling gradient (known zonation regulator)
  • Oxygen gradient-responsive genes
  • Drug metabolism enzyme spatial patterns
  • Liver disease gene overlap
输入: 肝脏空间转录组数据,包含门脉周与中央静脉周基因梯度。
分析重点:
  • 代谢酶分布(CYP450、糖异生、脂肪生成)
  • Wnt信号梯度(已知分区调控因子)
  • 氧梯度响应基因
  • 药物代谢酶空间模式
  • 肝脏疾病基因交集

Use Case 4: Tumor-Immune Interface

用例4:肿瘤-免疫界面

Input: DBiTplus data from melanoma with spatial protein + RNA data showing tumor-immune boundary.
Analysis focus:
  • Immune cell composition at boundary
  • Checkpoint ligand-receptor pairs
  • Immune exclusion mechanisms
  • Immunotherapy target identification
  • Multi-modal (RNA + protein) concordance
输入: 黑色素瘤DBiTplus数据,包含显示肿瘤-免疫边界的空间蛋白质+RNA数据。
分析重点:
  • 边界处的免疫细胞组成
  • 检查点配体-受体对
  • 免疫排斥机制
  • 免疫治疗靶点识别
  • 多模态(RNA+蛋白质)一致性

Use Case 5: Developmental Spatial Patterns

用例5:发育空间模式

Input: Spatial transcriptomics of embryonic tissue with developmental patterning genes.
Analysis focus:
  • Morphogen gradients (Wnt, BMP, FGF, SHH)
  • Transcription factor spatial patterns
  • Cell fate determination genes
  • Developmental signaling pathways
  • Comparison to adult tissue patterns
输入: 胚胎组织空间转录组数据,包含发育模式基因。
分析重点:
  • 形态发生素梯度(Wnt、BMP、FGF、SHH)
  • 转录因子空间模式
  • 细胞命运决定基因
  • 发育信号通路
  • 与成年组织模式的比较

Use Case 6: Disease Progression Mapping

用例6:疾病进展图谱

Input: Spatial data from neurodegenerative tissue showing disease gradient from affected to unaffected regions.
Analysis focus:
  • Disease gene expression gradient
  • Inflammatory response spatial pattern
  • Neuronal loss markers
  • Glial activation patterns
  • Therapeutic window identification

输入: 神经退行性组织空间数据,显示从受影响区域到未受影响区域的疾病梯度。
分析重点:
  • 疾病基因表达梯度
  • 炎症反应空间模式
  • 神经元丢失标记
  • 胶质细胞激活模式
  • 治疗窗口识别

Limitations & Known Issues

局限性与已知问题

Database-Specific

数据库特异性

  • Enrichment: enrichr_gene_enrichment_analysis returns a connectivity graph (107MB), NOT standard enrichment. Use STRING_functional_enrichment instead
  • GTEx: SOAP-style tools requiring the operation parameter; need versioned GENCODE IDs (e.g., ENSG00000141510.16)
  • HPA: Some tools use gene_name, others use ensembl_id - check parameter reference
  • OpenTargets: Disease IDs use underscore format (MONDO_0007254), not colon
  • cBioPortal_get_cancer_studies: BROKEN - has literal {limit} in URL causing 400 error
  • 富集: enrichr_gene_enrichment_analysis 返回连接图(107MB),而非标准富集结果。请使用 STRING_functional_enrichment 替代
  • GTEx: 需要 operation 参数的SOAP风格工具;需要带版本的GENCODE ID(例如 ENSG00000141510.16)
  • HPA: 部分工具使用 gene_name,其他使用 ensembl_id - 请查阅参数参考
  • OpenTargets: 疾病ID使用下划线格式(MONDO_0007254),而非冒号
  • cBioPortal_get_cancer_studies: 已损坏 - URL中包含字面量 {limit} 导致400错误
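
Two of the identifier pitfalls above are easy to guard against before calling a tool. The helpers below are hypothetical; the regular expression assumes human Ensembl gene IDs of the form ENSG plus 11 digits followed by a version suffix, as in the example ENSG00000141510.16.

```python
import re

# Human Ensembl gene ID with a GENCODE version suffix, e.g. ENSG00000141510.16
_VERSIONED_GENE_ID = re.compile(r"^ENSG\d{11}\.\d+$")


def to_opentargets_disease_id(disease_id: str) -> str:
    """Convert colon-style IDs (MONDO:0007254) to underscore style."""
    return disease_id.strip().replace(":", "_")


def is_versioned_gencode_id(gene_id: str) -> bool:
    """True only if the ID carries a version suffix, as GTEx tools require."""
    return bool(_VERSIONED_GENE_ID.match(gene_id))


if __name__ == "__main__":
    print(to_opentargets_disease_id("MONDO:0007254"))
    print(is_versioned_gencode_id("ENSG00000141510.16"))
    print(is_versioned_gencode_id("ENSG00000141510"))
```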

Conceptual

概念性

  • No raw spatial data processing: This skill analyzes gene LISTS, not raw spatial matrices (Seurat/Scanpy/squidpy handle raw data)
  • No spatial statistics: Cannot perform Moran's I, spatial autocorrelation, or variogram analysis
  • No image analysis: Cannot process H&E or fluorescence images
  • No deconvolution: Cannot perform cell type deconvolution (use BayesSpace, cell2location, RCTD externally)
  • Ligand-receptor inference: Based on gene co-expression + known pairs, not spatial proximity statistics (use CellChat, NicheNet, COMMOT externally)
  • 无原始空间数据处理: 该技能分析基因列表,而非原始空间矩阵(Seurat/Scanpy/squidpy处理原始数据)
  • 无空间统计: 无法执行Moran's I、空间自相关或变异函数分析
  • 无图像分析: 无法处理H&E或荧光图像
  • 无去卷积: 无法执行细胞类型去卷积(请外部使用BayesSpace、cell2location、RCTD)
  • 配体-受体推断: 基于基因共表达+已知对,而非空间邻近统计(请外部使用CellChat、NicheNet、COMMOT)

Technical

技术性

  • Large gene lists: >200 genes may slow STRING queries; batch or sample
  • Response format variability: Always check both dict and list response types
  • Rate limits: STRING and OpenTargets may throttle frequent requests

  • 大基因列表: >200个基因可能减慢STRING查询;请批量处理或抽样
  • 响应格式多变: 始终检查字典和列表两种响应类型
  • 速率限制: STRING和OpenTargets可能对频繁请求进行限流
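
For the large-gene-list and rate-limit points, one option is to deduplicate, split into batches, and pause between calls. This is a sketch under stated assumptions: the batch size of 200 mirrors the threshold above, the one-second pause is arbitrary, and `call` stands in for whatever function wraps the tool call and returns a list of rows. Batching changes what an enrichment or network query sees (interactions across batches are missed), so results from batches are not equivalent to one full query.

```python
import time
from typing import Callable, Iterable, Iterator, List


def batches(genes: Iterable[str], size: int = 200) -> Iterator[List[str]]:
    """Yield de-duplicated genes in order, `size` at a time."""
    unique = list(dict.fromkeys(genes))  # preserves first-seen order
    for start in range(0, len(unique), size):
        yield unique[start:start + size]


def run_batched(genes: Iterable[str], call: Callable[[List[str]], list],
                size: int = 200, pause_s: float = 1.0) -> list:
    """Call `call(batch)` per batch, sleeping between calls to limit throttling."""
    results: list = []
    for index, batch in enumerate(batches(genes, size)):
        if index > 0:
            time.sleep(pause_s)
        results.extend(call(batch))
    return results


if __name__ == "__main__":
    genes = [f"GENE{i}" for i in range(450)]
    # Stand-in call that just reports each batch size.
    print(run_batched(genes, lambda batch: [len(batch)], pause_s=0))
```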

Summary

总结

Spatial Multi-Omics Analysis skill provides:
  1. Gene characterization (ID resolution, function, localization, tissue expression)
  2. Pathway & functional enrichment (STRING, Reactome, GO, KEGG)
  3. Spatial domain characterization (per-domain and cross-domain comparison)
  4. Cell-cell interaction inference (PPI, ligand-receptor, signaling pathways)
  5. Disease & therapeutic context (disease genes, druggable targets, clinical trials)
  6. Multi-modal integration (RNA-protein concordance, metabolic pathways)
  7. Immune microenvironment characterization (cell types, checkpoints, immunotherapy)
  8. Literature context & validation recommendations
Outputs: Comprehensive markdown report with Spatial Omics Integration Score (0-100)
Best for: Biological interpretation of spatial omics experiments (post-processing after spatial data analysis tools)
Uses: 70+ ToolUniverse tools across 9 analysis phases
Time: ~10-20 minutes depending on gene list size and analysis scope
空间多组学分析技能提供:
  1. 基因表征(ID解析、功能、定位、组织表达)
  2. 通路与功能富集(STRING、Reactome、GO、KEGG)
  3. 空间区域表征(单区域与跨区域比较)
  4. 细胞间互作推断(PPI、配体-受体、信号通路)
  5. 疾病与治疗背景(疾病基因、可成药靶点、临床试验)
  6. 多模态整合(RNA-蛋白质一致性、代谢通路)
  7. 免疫微环境表征(细胞类型、检查点、免疫治疗)
  8. 文献背景与验证建议
输出: 包含空间多组学整合评分(0-100)的全面Markdown报告
最佳适用场景: 空间多组学实验的生物学解读(空间数据分析工具后处理)
使用工具: 9个分析阶段中使用70余种ToolUniverse工具
耗时: 约10-20分钟,取决于基因列表大小与分析范围