tooluniverse-metabolomics

Metabolomics Research

Comprehensive metabolomics research skill that identifies metabolites, analyzes studies, and searches metabolomics databases. Generates structured research reports with annotated metabolite information, study details, and database statistics.

Use Case

Use this skill when asked to:
  • Identify or annotate metabolites (HMDB IDs, chemical properties, pathways)
  • Retrieve metabolomics study information from MetaboLights or Metabolomics Workbench
  • Search for metabolomics studies by keywords or disease
  • Analyze metabolite profiles or datasets
  • Generate comprehensive metabolomics research reports
Example queries:
  • "What is the HMDB ID and pathway information for glucose?"
  • "Get study details for MTBLS1"
  • "Find metabolomics studies related to diabetes"
  • "Analyze these metabolites: glucose, lactate, pyruvate"

Databases Covered

Primary metabolite databases:
  • HMDB (Human Metabolome Database): 220,000+ metabolites with structures, pathways, and biological roles
  • MetaboLights: Public metabolomics repository with thousands of studies
  • Metabolomics Workbench: NIH Common Fund metabolomics data repository
  • PubChem: Chemical properties and bioactivity data (fallback)

Research Workflow

The skill executes a 4-phase analysis pipeline:

Phase 1: Metabolite Identification & Annotation

For each metabolite in the input list:
  1. Search HMDB by metabolite name
  2. Retrieve HMDB ID, chemical formula, molecular weight
  3. Get detailed metabolite information (description, pathways)
  4. Fall back to PubChem for the CID and chemical properties if HMDB is unavailable or returns no match
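
The steps above can be sketched as a single lookup function. This is a minimal sketch under stated assumptions, not the skill's actual code: `call_tool` is a hypothetical callable standing in for however tools are invoked (Python SDK or MCP), and the `hmdb_id` field inside a search hit is an assumed name. Only the tool names, their parameters, and the `operation` values come from this document.

```python
from typing import Any, Callable, Dict

# Hypothetical signature: (tool_name, arguments) -> response
ToolCaller = Callable[[str, Dict[str, Any]], Any]


def annotate_metabolite(name: str, call_tool: ToolCaller) -> Dict[str, Any]:
    """Try HMDB first, then fall back to PubChem for basic chemistry."""
    record: Dict[str, Any] = {"name": name, "source": None}
    try:
        hits = call_tool("HMDB_search", {"operation": "search", "query": name})
        # Assumed shape: {"status": ..., "data": [...]} or a bare list
        results = hits.get("data", []) if isinstance(hits, dict) else hits
        if results:
            hmdb_id = results[0].get("hmdb_id")  # field name is an assumption
            details = call_tool(
                "HMDB_get_metabolite",
                {"operation": "get_metabolite", "hmdb_id": hmdb_id},
            )
            record.update({"source": "HMDB", "hmdb_id": hmdb_id, "details": details})
            return record
    except Exception as exc:  # record the failure; the fallback below still runs
        record["hmdb_error"] = str(exc)

    # Fallback: PubChem CID and chemical properties
    try:
        cid_resp = call_tool("PubChem_get_CID_by_compound_name", {"compound_name": name})
        cid = cid_resp["data"]["cid"]  # shape {status, data: {cid}} per the parameter reference
        props = call_tool("PubChem_get_compound_properties_by_CID", {"cid": cid})
        record.update({"source": "PubChem", "cid": cid, "details": props})
    except Exception as exc:
        record["pubchem_error"] = str(exc)
    return record
```

The `source` key records which database answered, so a report can state when the PubChem fallback was used.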

Phase 2: Study Details Retrieval

For provided study IDs:
  1. Detect database type (MTBLS = MetaboLights, ST = Metabolomics Workbench)
  2. Retrieve study metadata (title, description, organism, status)
  3. Extract experimental design and data availability
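
Step 1 (prefix detection) is simple enough to show directly. The sketch below is illustrative only: the function name `detect_database` is hypothetical, and accepting lowercase or padded IDs is an assumption rather than documented behavior.

```python
def detect_database(study_id: str) -> str:
    """Map a study ID prefix to its source repository."""
    normalized = study_id.strip().upper()  # tolerant matching is an assumption
    if normalized.startswith("MTBLS"):
        return "MetaboLights"
    if normalized.startswith("ST"):
        return "Metabolomics Workbench"
    raise ValueError(f"Unrecognized study ID prefix: {study_id!r}")
```

For example, `detect_database("MTBLS1")` selects MetaboLights and `detect_database("ST000001")` selects Metabolomics Workbench; any other prefix raises `ValueError` so the caller can report it.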

Phase 3: Study Search

For keyword searches:
  1. Search MetaboLights studies by query term
  2. Return matching study IDs with preview information
  3. Report total number of results

Phase 4: Database Overview

Always included in reports:
  1. Sample recent studies from MetaboLights
  2. Database statistics and availability
  3. Integration information for all databases

Usage Patterns

Pattern 1: Metabolite Identification

Input:
  • Metabolite list: ["glucose", "lactate", "pyruvate"]
Output report includes:
  • HMDB IDs for each metabolite
  • Chemical formulas and molecular weights
  • Biological pathways
  • PubChem CIDs
  • SMILES representations

Pattern 2: Study Retrieval

Input:
  • Study ID: "MTBLS1" or "ST000001"
Output report includes:
  • Study title and description
  • Organism information
  • Study status and release date
  • Data availability

Pattern 3: Study Search

Input:
  • Search query: "diabetes"
  • Optional organism filter
Output report includes:
  • Matching study IDs
  • Study titles and previews
  • Total result count

Pattern 4: Comprehensive Analysis

Input:
  • Metabolite list: ["glucose", "pyruvate"]
  • Study ID: "MTBLS1"
  • Search query: "diabetes"
Output report includes:
  • All phases combined (identification, study details, search results, overview)
  • Cross-referenced information
  • Complete metabolomics research summary

Input Parameters

metabolite_list (optional)

List of metabolite names to identify and annotate.
  • Format: List of strings
  • Examples: ["glucose"], ["lactate", "pyruvate", "acetate"]
  • Note: Common names accepted; HMDB will find standard identifiers

study_id (optional)

MetaboLights or Metabolomics Workbench study identifier.
  • Format: String starting with "MTBLS" or "ST"
  • Examples: "MTBLS1", "ST000001"
  • Note: Database auto-detected from prefix

search_query (optional)

Keyword to search metabolomics studies.
  • Format: String (disease, compound, organism, method)
  • Examples: "diabetes", "glucose metabolism", "LC-MS"

organism (optional)

Target organism for study filtering.
  • Format: String (scientific name)
  • Default: "Homo sapiens"
  • Examples: "Mus musculus", "Saccharomyces cerevisiae"

output_file (optional)

Path for the generated markdown report.
  • Format: String (filename with .md extension)
  • Default: Auto-generated timestamp-based filename
  • Examples: "my_analysis.md", "metabolomics_report.md"

Output Format

All analyses generate a structured markdown report with:
Header section:
  • Report title and generation timestamp
  • Input parameters summary (metabolites, study ID, search query, organism)
Phase sections:
  • Clear section headers (## 1. Metabolite Identification, ## 2. Study Details, etc.)
  • Subsections for each metabolite or result
  • Consistent formatting (bold labels, tables for results)
Database overview:
  • Available databases and statistics
  • Recent studies sample
  • Integration information
Error handling:
  • Graceful error messages for unavailable data
  • Fallback strategies documented in output
  • "N/A" for missing fields (not blank)
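
The "N/A" rule for missing fields can be illustrated with a small formatter. This is a sketch only: `format_field` is a hypothetical helper, and the exact label layout of the real report is an assumption beyond the "bold labels" mentioned above.

```python
from typing import Any, Mapping


def format_field(label: str, record: Mapping[str, Any], key: str) -> str:
    """Render one bold-labelled report line, using "N/A" for missing values."""
    value = record.get(key)
    # Treat None and empty or whitespace-only strings as missing
    if value is None or (isinstance(value, str) and not value.strip()):
        value = "N/A"
    return f"**{label}:** {value}"
```

Routing every field through one helper keeps blank values out of the report, whichever phase produced the record.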

Implementation Notes

SOAP Tool Handling

HMDB tools are SOAP-based and require special parameter handling:
  • HMDB_search: Requires the operation="search" parameter
  • HMDB_get_metabolite: Requires the operation="get_metabolite" parameter
  • Do not use endpoint or method parameters (not applicable to SOAP)
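
One way to enforce these rules is to build HMDB arguments through a single helper. The helper name `hmdb_arguments` and the validation behavior are hypothetical; only the `operation` values and the `query` / `hmdb_id` parameter names come from this document.

```python
from typing import Any, Dict

ALLOWED_OPERATIONS = {"search", "get_metabolite"}
FORBIDDEN_KEYS = {"endpoint", "method"}  # not applicable to SOAP tools


def hmdb_arguments(operation: str, **params: Any) -> Dict[str, Any]:
    """Build the argument dict for an HMDB tool call, always including `operation`."""
    if operation not in ALLOWED_OPERATIONS:
        raise ValueError(f"Unknown HMDB operation: {operation!r}")
    bad = FORBIDDEN_KEYS.intersection(params)
    if bad:
        raise ValueError(f"Parameters not valid for SOAP tools: {sorted(bad)}")
    return {"operation": operation, **params}
```

For instance, `hmdb_arguments("search", query="glucose")` yields `{"operation": "search", "query": "glucose"}`, while passing `endpoint=...` fails early instead of producing a malformed call.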

Response Format Variations

Tools return different response formats; handle all three:
  1. Standard format: {status: "success", data: [...], metadata: {...}}
  2. Direct list: [...] (e.g., metabolights_list_studies)
  3. Direct dict: {field1: ..., field2: ...} (e.g., some detail endpoints)
Always check the response type with isinstance() before accessing fields.
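
A small normalizer shows how the three shapes can be reduced to one payload. This is a sketch, not the skill's code: `extract_data` is a hypothetical name, and treating any dict with both `status` and `data` keys as the standard envelope is an assumption (a direct dict that happens to contain both keys would be misread).

```python
from typing import Any, Dict, List, Optional, Union

Payload = Union[List[Any], Dict[str, Any]]


def extract_data(response: Any) -> Optional[Payload]:
    """Return the payload from any of the three response formats, else None."""
    if isinstance(response, list):
        return response  # format 2: direct list
    if isinstance(response, dict):
        if "status" in response and "data" in response:
            return response["data"]  # format 1: standard envelope
        return response  # format 3: direct dict
    return None  # unexpected type (e.g. plain text); the caller decides what to do
```

Callers can then check the returned payload once, rather than repeating type checks at every access site.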

Fallback Strategy

Follow this hierarchy for robustness:
  1. Primary source: Try main database first (HMDB for metabolites, MetaboLights for studies)
  2. Fallback source: Use alternative database if primary fails (PubChem for chemical properties)
  3. Default behavior: Show error message with context, continue with remaining phases
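
The "continue with remaining phases" behavior in step 3 could look like the following. It is a minimal sketch assuming each phase is a zero-argument callable that returns its report section as text; `run_phases` and that calling convention are hypothetical.

```python
from typing import Callable, Dict, List, Tuple


def run_phases(phases: List[Tuple[str, Callable[[], str]]]) -> Dict[str, str]:
    """Run each phase in order; a failing phase yields an error note, not a crash."""
    sections: Dict[str, str] = {}
    for title, phase in phases:
        try:
            sections[title] = phase()
        except Exception as exc:
            # Keep context (which phase, what went wrong) and move on
            sections[title] = f"Error in {title}: {exc}"
    return sections
```

Because dicts preserve insertion order, the sections come back in pipeline order and can be written to the report directly.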

Progressive Report Writing

Write the report incrementally to avoid memory issues:
  1. Create the output file early in the pipeline
  2. Append sections as each phase completes
  3. Flush to disk regularly during long analyses
  4. Return the file path for user access
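
These four steps map onto ordinary file handling. The sketch below assumes sections arrive as `(heading, body)` pairs from an iterable, which may be a generator so each phase runs just before its section is written. `write_report` is a hypothetical name, and the heading levels follow the "## 1. ..." style described under Output Format.

```python
import os
from typing import Iterable, Tuple


def write_report(path: str, title: str, sections: Iterable[Tuple[str, str]]) -> str:
    """Create the report file first, then append and flush one section at a time."""
    with open(path, "w", encoding="utf-8") as fh:
        fh.write(f"# {title}\n\n")  # step 1: file exists before any phase runs
        fh.flush()
        for heading, body in sections:  # step 2: one section per completed phase
            fh.write(f"## {heading}\n\n{body}\n\n")
            fh.flush()  # step 3: push buffered text to the OS
            os.fsync(fh.fileno())  # and ask the OS to commit it to disk
    return path  # step 4: hand the path back to the caller
```

Only one section is held in memory at a time, and a partial report survives on disk if a later phase fails.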

Tool Discovery

The skill automatically discovers and uses these tools from ToolUniverse:
HMDB Tools:
  • HMDB_search: Search metabolites by name
  • HMDB_get_metabolite: Get detailed metabolite information
MetaboLights Tools:
  • metabolights_list_studies: List available studies
  • metabolights_search_studies: Search studies by keyword
  • metabolights_get_study: Get study details by ID
Metabolomics Workbench Tools:
  • MetabolomicsWorkbench_get_study: Get study information
  • MetabolomicsWorkbench_search_compound_by_name: Search compounds
PubChem Tools:
  • PubChem_get_CID_by_compound_name: Get PubChem CID
  • PubChem_get_compound_properties_by_CID: Get chemical properties
No manual tool configuration is required; all tools are loaded automatically.

Common Issues

Issue: HMDB returns "Error querying HMDB: 0"

Cause: The HMDB search returned empty results, or an index error occurred while accessing the first result.
Solution: This is expected for uncommon metabolites; the PubChem fallback will be attempted.
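
A guarded accessor avoids this class of error. As an aside, and as an inference only: in Python a message consisting solely of `0` is what `str(KeyError(0))` produces, which would be consistent with indexing a dict-shaped response with `[0]`; the source does not confirm this. The helper `first_result` below is hypothetical.

```python
from typing import Any, Optional


def first_result(data: Any) -> Optional[Any]:
    """Return the first search hit, or None when there is nothing usable."""
    if isinstance(data, list):
        return data[0] if data else None  # empty list: no hit, no IndexError
    if isinstance(data, dict):
        return data or None  # a single dict result; an empty dict counts as no hit
    return None
```

When `first_result` returns `None`, the caller can go straight to the PubChem fallback without raising.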

Issue: Study details show "N/A" for all fields

Cause: The study ID was not found, or the API is unavailable.
Solution: Verify the study ID format (MTBLS* or ST*) and check whether the study is public.

Issue: Tool not found errors

Cause: Missing API keys for some databases.
Solution: Check .env.template and add the required API keys to the .env file (most metabolomics tools work without keys).

Issue: Large metabolite lists cause slow execution

Cause: The pipeline queries each metabolite individually.
Solution: Reports are limited to the first 10 metabolites; consider batching lists of more than 20 metabolites.
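
Batching can be done on the caller's side before invoking the skill. This is an illustrative sketch: `batched` is a hypothetical helper, and the default batch size of 10 simply mirrors the report limit mentioned above.

```python
from typing import Iterator, List, Sequence


def batched(names: Sequence[str], size: int = 10) -> Iterator[List[str]]:
    """Yield consecutive chunks of at most `size` metabolite names."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(names), size):
        yield list(names[start:start + size])
```

Each chunk can then be passed as its own metabolite_list, producing one report per batch.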

Tool Parameter Reference

HMDB Tools (SOAP)

| Tool | Required Parameters | Optional Parameters | Response Format | Notes |
|---|---|---|---|---|
| HMDB_search | operation="search", query | - | {status, data: []} | SOAP tool - operation required |
| HMDB_get_metabolite | operation="get_metabolite", hmdb_id | - | {status, data: {}} | SOAP tool - operation required |

MetaboLights Tools (REST)

| Tool | Required Parameters | Optional Parameters | Response Format | Notes |
|---|---|---|---|---|
| metabolights_list_studies | - | size (default: 10) | {status, data: []} or [...] | May return direct list |
| metabolights_search_studies | query | - | {status, data: []} | Returns study IDs |
| metabolights_get_study | study_id | - | {status, data: {}} | Full study metadata |

Metabolomics Workbench Tools (REST)

| Tool | Required Parameters | Optional Parameters | Response Format | Notes |
|---|---|---|---|---|
| MetabolomicsWorkbench_get_study | study_id | output_item (default: "summary") | {status, data: {}} | Data may be text |
| MetabolomicsWorkbench_search_compound_by_name | compound_name | - | {status, data: {}} | Compound information |

PubChem Tools (REST)

| Tool | Required Parameters | Optional Parameters | Response Format | Notes |
|---|---|---|---|---|
| PubChem_get_CID_by_compound_name | compound_name | - | {status, data: {cid}} | Returns CID |
| PubChem_get_compound_properties_by_CID | cid | - | {status, data: {}} | Chemical properties |

Important: All parameter names and requirements apply to both Python SDK and MCP implementations.

Summary

The Metabolomics Research skill provides comprehensive metabolomics analysis through a 4-phase pipeline that:
  1. Identifies metabolites using HMDB (primary) and PubChem (fallback) databases
  2. Retrieves study details from MetaboLights and Metabolomics Workbench repositories
  3. Searches studies by keywords across metabolomics databases
  4. Generates structured reports with all findings in readable markdown format
Key Features:
  • ✅ 100% test coverage with working pipeline
  • ✅ Handles SOAP tools correctly (HMDB requires the operation parameter)
  • ✅ Implements fallback strategies (HMDB → PubChem)
  • ✅ Graceful error handling (continues if one phase fails)
  • ✅ Progressive report writing (memory-efficient)
  • ✅ Implementation-agnostic documentation (works with Python SDK and MCP)
Best for:
  • Metabolite annotation and pathway analysis
  • Study discovery and data retrieval
  • Comprehensive metabolomics research reports
  • Multi-database metabolomics queries
Limitations:
  • HMDB may not have all metabolites (fallback to PubChem)
  • Some studies require authentication or are not public
  • Large metabolite lists (>10) auto-limited in reports
  • API rate limits may affect large-scale queries

Quick Start

See QUICK_START.md for:
  • Python SDK implementation with code examples
  • MCP integration instructions
  • Step-by-step tutorial for common workflows
  • Advanced usage patterns