tooluniverse-small-molecule-discovery
Small Molecule Discovery Skill
Systematic small molecule identification, characterization, and sourcing using PubChem, ChEMBL, BindingDB, ADMET-AI, SwissADME, eMolecules, and Enamine. Covers the full pipeline from compound name to structure, activity, ADMET properties, and commercial procurement.
Domain Reasoning
Drug-likeness is not a binary property. Lipinski's Rule of 5 was derived from orally administered, passively absorbed drugs and has many well-known exceptions: natural products, macrocycles, PROTACs, and many approved drugs violate one or more rules. The relevant question is not "does this pass Ro5?" but "does this compound's physicochemical profile match the requirements of the target, the intended route of administration, and the therapeutic context?" Focus on the specific requirements, not rigid rules.
LOOK UP, DON'T GUESS
- Compound identity (CID, ChEMBL ID, SMILES): call `PubChem_get_CID_by_compound_name` and `ChEMBL_search_molecules`; do not assume IDs from memory.
- ADMET properties: run `SwissADME_calculate_adme` or `ADMETAI_predict_*` on the actual SMILES; do not estimate logP, TPSA, or bioavailability.
- Binding affinities against a target: query `ChEMBL_search_activities` or `BindingDB_get_ligands_by_uniprot`; never cite IC50 values from memory.
- Commercial availability: check `eMolecules_search` or `Enamine_search_catalog`; do not assume availability.
KEY PRINCIPLES:
- Resolve identity first - Always get CID and ChEMBL ID before research
- SMILES required for property prediction - Extract canonical SMILES from PubChem early
- English names in tools - Use IUPAC or common English names; avoid abbreviations in tool calls
- BindingDB is often unavailable - Fall back to ChEMBL activities when BindingDB times out
- eMolecules/Enamine return URLs - These tools generate search URLs, not direct data; note this to user
COMPUTE, DON'T DESCRIBE
When analysis requires computation (statistics, data processing, scoring, enrichment), write and run Python code via Bash. Don't describe what you would do — execute it and report actual results. Use ToolUniverse tools to retrieve data, then Python (pandas, scipy, statsmodels, matplotlib) to analyze it.
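As a minimal sketch of what "execute it" can look like, the snippet below summarizes pChEMBL values per target using only the standard library. The records are hypothetical placeholders standing in for rows retrieved with `ChEMBL_search_activities`. The field names `target_name` and `pchembl_value` follow the return fields listed in this document, but the exact response shape is an assumption.

```python
from collections import defaultdict
from statistics import median


def summarize_by_target(records):
    """Group activity records by target and report count and median pChEMBL."""
    grouped = defaultdict(list)
    for rec in records:
        value = rec.get("pchembl_value")
        if value is None:  # skip rows without a standardized potency value
            continue
        grouped[rec["target_name"]].append(float(value))
    return {
        target: {"n": len(values), "median_pchembl": median(values)}
        for target, values in grouped.items()
    }


# Hypothetical placeholder rows, not real assay data.
example_records = [
    {"target_name": "Target A", "pchembl_value": "7.2"},
    {"target_name": "Target A", "pchembl_value": "6.8"},
    {"target_name": "Target B", "pchembl_value": None},
    {"target_name": "Target B", "pchembl_value": "5.5"},
]

summary = summarize_by_target(example_records)
for target, stats in sorted(summary.items()):
    print(target, stats)
```

The same function works unchanged on real rows as long as they expose those two fields; for larger result sets, pandas `groupby` would be the natural replacement.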
When to Use
- "Find information about compound X"
- "What is the drug-likeness of this SMILES?"
- "Show binding affinities for EGFR inhibitors"
- "Search for compounds similar to imatinib"
- "Is this compound commercially available?"
- "What are the ADMET properties of this molecule?"
- "Find ChEMBL activities for target Y"
- "Predict targets for this small molecule"
Key Tools
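| Tool | Purpose | Key Params |
|---|---|---|
| `PubChem_get_CID_by_compound_name` | Name to CID lookup | `compound_name` |
| | SMILES to CID lookup | |
| `PubChem_get_compound_properties_by_CID` | MW, formula, SMILES, InChIKey | `cid`, `properties` |
| `PubChem_search_compounds_by_similarity` | Find structurally similar compounds | `smiles`, `threshold` |
| `PubChem_search_compounds_by_substructure` | Substructure search | `smiles` |
| `PubChem_get_compound_synonyms_by_CID` | All names/synonyms | `cid` |
| `ChEMBL_search_molecules` | Search ChEMBL by name or ID | `query` |
| | Full ChEMBL molecule record | |
| `ChEMBL_search_similar_molecules` | Similarity search in ChEMBL | `query` |
| `ChEMBL_search_activities` | Binding affinities and assay data | `molecule_chembl_id`, `pchembl_value__gte`, `limit` |
| `ChEMBL_get_drug_mechanisms` | MOA for approved drugs | |
| `ChEMBL_search_targets` | Find targets by name | `query`, `organism` |
| | All ligands for a target | |
| `SwissADME_calculate_adme` | Physicochemical + ADMET properties | `operation`, `smiles` |
| `SwissADME_check_druglikeness` | Lipinski, Veber, Egan rules | `operation`, `smiles` |
| `ADMETAI_predict_physicochemical_properties` | MW, logP, TPSA, HBD/HBA | `smiles` (list) |
| `ADMETAI_predict_bioavailability` | Oral bioavailability prediction | `smiles` (list) |
| `ADMETAI_predict_BBB_penetrance` | Blood-brain barrier permeability | `smiles` (list) |
| `ADMETAI_predict_toxicity` | hERG, DILI, mutagenicity | `smiles` (list) |
| `ADMETAI_predict_CYP_interactions` | CYP450 inhibition/substrate | `smiles` (list) |
| `SwissTargetPrediction_predict` | Predict protein targets for compound | `operation`, `smiles` |
| `eMolecules_search` | Find commercially available compounds | `query` |
| `eMolecules_search_smiles` | Structure-based commercial search | `smiles` |
| | Find vendors for a specific compound | |
| `Enamine_search_catalog` | Search Enamine screening library | `query` |
| `Enamine_search_smiles` | Search Enamine by structure | `smiles` |
| `Enamine_get_libraries` | List Enamine compound libraries | (none required) |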
Workflow
Phase 1: Compound Identification
Step 1: Name -> CID (PubChem canonical identity)
PubChem_get_CID_by_compound_name(compound_name="imatinib")
-> CID: 5291

Step 2: Get SMILES and properties (needed for all downstream tools)
PubChem_get_compound_properties_by_CID(
    cid="5291",
    properties="MolecularFormula,MolecularWeight,CanonicalSMILES,InChIKey,IUPACName"
)
-> canonical SMILES, InChIKey (global identifier)

Step 3: Get ChEMBL ID (for activity data)
ChEMBL_search_molecules(query="imatinib")
-> ChEMBL ID (e.g., "CHEMBL941")

Step 4: Get all synonyms (brand names, INN, etc.)
PubChem_get_compound_synonyms_by_CID(cid="5291")

**ID resolution priority**:
1. Start with PubChem CID (most universal)
2. Get ChEMBL ID (for bioactivity data)
3. Use canonical SMILES for structure-based searches and ADMET
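The resolution order can be sketched as a small helper that follows Steps 1 to 3. This is an illustration only: `lookup_cid`, `lookup_smiles`, and `lookup_chembl_id` are hypothetical callables standing in for the three tool calls, and the values they return are assumptions rather than the documented ToolUniverse response shapes.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class CompoundIdentity:
    name: str
    cid: Optional[str] = None
    smiles: Optional[str] = None
    chembl_id: Optional[str] = None


def resolve_identity(
    name: str,
    lookup_cid: Callable[[str], Optional[str]],
    lookup_smiles: Callable[[str], Optional[str]],
    lookup_chembl_id: Callable[[str], Optional[str]],
) -> CompoundIdentity:
    """Resolve CID first, then SMILES from the CID, then the ChEMBL ID."""
    identity = CompoundIdentity(name=name)
    identity.cid = lookup_cid(name)
    if identity.cid is not None:
        # SMILES is only fetched once a CID is known.
        identity.smiles = lookup_smiles(identity.cid)
    identity.chembl_id = lookup_chembl_id(name)
    return identity


# Stub lookups so the sketch runs offline; real code would wrap the tool calls.
stub = resolve_identity(
    "imatinib",
    lookup_cid=lambda name: "5291",
    lookup_smiles=lambda cid: "CANONICAL_SMILES",
    lookup_chembl_id=lambda name: "CHEMBL941",
)
print(stub)
```

Keeping the lookups injectable makes it easy to see which identifiers are missing before any downstream call is attempted.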
Phase 2: Structure-Based Search
**Similarity search** (find analogs):
PubChem_search_compounds_by_similarity(
    smiles="CANONICAL_SMILES",
    threshold=85  # Tanimoto threshold 0-100; 85 = highly similar
)
Returns: list of CIDs of similar compounds

ChEMBL_search_similar_molecules(query="CHEMBL941")  # Or SMILES
Returns: ChEMBL entries sorted by similarity

**Substructure search** (find compounds containing a scaffold):
PubChem_search_compounds_by_substructure(smiles="SCAFFOLD_SMILES")
Returns: CIDs of compounds containing the scaffold
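To make the similarity threshold concrete, here is a minimal sketch of the Tanimoto coefficient on fingerprint bit sets. The bit sets are toy values chosen for illustration. How PubChem computes its fingerprints internally is not covered here, and the 0-100 scaling simply mirrors the `threshold` parameter used in the similarity search.

```python
def tanimoto(bits_a: set, bits_b: set) -> float:
    """Tanimoto coefficient: shared on-bits divided by total distinct on-bits."""
    if not bits_a and not bits_b:
        return 0.0  # convention chosen here for two empty fingerprints
    shared = len(bits_a & bits_b)
    total = len(bits_a | bits_b)
    return shared / total


def passes_threshold(bits_a: set, bits_b: set, threshold: int = 85) -> bool:
    """Compare on the same 0-100 scale used by the similarity search."""
    return tanimoto(bits_a, bits_b) * 100 >= threshold


# Toy fingerprints: positions of bits that are set.
reference = {1, 4, 7, 9, 12, 15, 18, 21, 24, 27}
analog = {1, 4, 7, 9, 12, 15, 18, 21, 24, 30}

score = tanimoto(reference, analog)
print(f"Tanimoto = {score:.3f}, passes 85: {passes_threshold(reference, analog)}")
```

In this toy case the two fingerprints share 9 of 11 distinct bits, so a single differing bit pair is already enough to fall below a threshold of 85.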
Phase 3: Bioactivity and Binding Affinity
**Get all activities for a compound** (across all targets):
ChEMBL_search_activities(
    molecule_chembl_id="CHEMBL941",
    pchembl_value__gte=6,  # pIC50/pKi >= 6 corresponds to IC50/Ki <= 1 µM
    limit=50
)
Returns: assay_type, target_name, pchembl_value, units

**Get all ligands for a target**:
First find target ChEMBL ID
ChEMBL_search_targets(query="EGFR", organism="Homo sapiens")
-> target_chembl_id, e.g., "CHEMBL203"
ChEMBL_get_target_activities(
    target_chembl_id="CHEMBL203"
)
Returns: all compounds with binding data against this target

**BindingDB** (when available; often times out):
BindingDB_get_ligands_by_uniprot(uniprot_id="P00533")  # EGFR
Returns: Ki, IC50, Kd data with literature references
Note: BindingDB REST API is frequently unavailable; fall back to ChEMBL

**pChEMBL Value interpretation**:
| pChEMBL | IC50 / Ki | Affinity |
|---|---|---|
| >= 9 | <= 1 nM | Very potent |
| >= 7 | <= 100 nM | Potent |
| >= 6 | <= 1 µM | Moderate |
| >= 5 | <= 10 µM | Weak |
| < 5 | > 10 µM | Inactive |
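The interpretation table can be applied programmatically. The sketch below converts a potency given in nM to a pChEMBL-style value using the definition pChEMBL = -log10(molar concentration), then maps it onto the bins in the table. The function names are illustrative and not part of any tool.

```python
import math


def pchembl_from_nm(value_nm: float) -> float:
    """-log10 of a molar concentration given in nM (1 nM = 1e-9 M)."""
    if value_nm <= 0:
        raise ValueError("concentration must be positive")
    return 9.0 - math.log10(value_nm)


def affinity_class(pchembl: float) -> str:
    """Map a pChEMBL value onto the bins from the interpretation table."""
    if pchembl >= 9:
        return "Very potent"
    if pchembl >= 7:
        return "Potent"
    if pchembl >= 6:
        return "Moderate"
    if pchembl >= 5:
        return "Weak"
    return "Inactive"


for nm in (0.5, 50, 500, 5000, 50000):
    p = pchembl_from_nm(nm)
    print(f"{nm:>8} nM -> pChEMBL {p:.2f} -> {affinity_class(p)}")
```

Checking the bins from the top down means a value between 7 and 9 is reported as "Potent", which matches how the table's ">=" rows are meant to be read.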
Phase 4: Drug-likeness and ADMET
**SwissADME** (comprehensive; requires SMILES as a string, not a list):
SwissADME_calculate_adme(
    operation="calculate_adme",
    smiles="CANONICAL_SMILES"
)
Returns: physicochemical, lipophilicity, water solubility, pharmacokinetics, drug-likeness scores (Lipinski, Veber, Egan, Muegge), PAINS alerts

SwissADME_check_druglikeness(
    operation="check_druglikeness",
    smiles="CANONICAL_SMILES"
)
Returns: Lipinski/Veber/Egan pass/fail + lead-likeness

**ADMET-AI** (ML-based; requires SMILES as a list; install tooluniverse[ml]):
ADMETAI_predict_physicochemical_properties(smiles=["CANONICAL_SMILES"])
ADMETAI_predict_bioavailability(smiles=["CANONICAL_SMILES"])
ADMETAI_predict_BBB_penetrance(smiles=["CANONICAL_SMILES"])
ADMETAI_predict_toxicity(smiles=["CANONICAL_SMILES"])
ADMETAI_predict_CYP_interactions(smiles=["CANONICAL_SMILES"])

**Note**: ADMET-AI requires `pip install tooluniverse[ml]`. If unavailable, use SwissADME as fallback.

**Key drug-likeness rules**:
- **Lipinski Ro5**: MW <= 500, logP <= 5, HBD <= 5, HBA <= 10 (oral drugs)
- **Veber**: TPSA <= 140 Ų, rotatable bonds <= 10 (oral bioavailability)
- **Lead-like**: MW <= 350, logP <= 3, HBD <= 3, HBA <= 6 (fragment/lead)
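Consistent with treating drug-likeness as a profile rather than a pass/fail flag, the sketch below reports which individual thresholds a compound exceeds. The property values must come from a tool call such as SwissADME or ADMET-AI. The dictionary keys and the example numbers here are hypothetical placeholders, not predictions for any real compound.

```python
# Thresholds copied from the rules listed above: (property key, upper limit).
RULES = {
    "Lipinski Ro5": [("mw", 500), ("logp", 5), ("hbd", 5), ("hba", 10)],
    "Veber": [("tpsa", 140), ("rotatable_bonds", 10)],
    "Lead-like": [("mw", 350), ("logp", 3), ("hbd", 3), ("hba", 6)],
}


def rule_violations(props: dict) -> dict:
    """Return, per rule set, the properties whose value exceeds the limit."""
    report = {}
    for rule_name, limits in RULES.items():
        report[rule_name] = [
            f"{key}={props[key]} > {limit}"
            for key, limit in limits
            if props[key] > limit
        ]
    return report


# Hypothetical property values for illustration only.
example_props = {
    "mw": 520.0, "logp": 4.1, "hbd": 2, "hba": 8,
    "tpsa": 95.0, "rotatable_bonds": 7,
}

report = rule_violations(example_props)
for rule_name, violations in report.items():
    print(f"{rule_name}: {len(violations)} violation(s) {violations}")
```

Returning the violated properties, rather than a boolean, leaves room to judge them against the route of administration and target class.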
Phase 5: Target Prediction
When you have a novel compound and want to predict targets:
SwissTargetPrediction_predict(
    operation="predict",
    smiles="CANONICAL_SMILES"
)
Returns: predicted protein targets with probability scores
Note: SwissTargetPrediction uses structural similarity to known drug-target pairs and may time out for complex molecules
Phase 6: Commercial Availability
**eMolecules** (aggregates 200+ suppliers; returns a search URL, not direct data):
eMolecules_search(query="compound_name")
-> Returns search_url to visit on eMolecules.com
eMolecules_search_smiles(smiles="CANONICAL_SMILES")
-> Returns URL for exact/similar structure search

**Enamine** (37B+ make-on-demand compounds; returns a URL when the API is unavailable):
Enamine_search_catalog(query="compound_name")
-> If API available: returns catalog entries with catalog_id, price
-> If API unavailable: returns search_url for manual search
Enamine_search_smiles(smiles="CANONICAL_SMILES")
-> Exact or similarity structure search
Enamine_get_libraries()
-> Lists available Enamine screening collections

**Note**: eMolecules and Enamine APIs frequently return search URLs rather than live data. Present these to the user as "search here" links.
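One way to present these results is sketched below. It assumes each tool result is a dictionary that may carry a `search_url` key, as described in this section. The `source` label and the `entries` key are hypothetical names introduced for the example.

```python
def format_availability(source: str, result: dict) -> str:
    """Render a sourcing result as live entries or a 'search here' link."""
    entries = result.get("entries")  # hypothetical key for live catalog rows
    if entries:
        return f"{source}: {len(entries)} catalog entries returned"
    url = result.get("search_url")
    if url:
        # No live data: hand the user a link to run the search themselves.
        return f"{source}: no live data, [search here]({url})"
    return f"{source}: no result"


# Placeholder URL and ID for illustration; real values come from the tool response.
print(format_availability("eMolecules", {"search_url": "https://example.org/search?q=compound_name"}))
print(format_availability("Enamine", {"entries": [{"catalog_id": "ID-PLACEHOLDER"}]}))
```

Stating "no live data" explicitly keeps a returned URL from being mistaken for confirmed availability.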
---
Tool Parameter Reference
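| Tool | Required Params | Notes |
|---|---|---|
| `PubChem_get_CID_by_compound_name` | | Returns list of CIDs; take first or most relevant |
| | | Use canonical SMILES |
| `PubChem_get_compound_properties_by_CID` | | |
| `PubChem_search_compounds_by_similarity` | | |
| `PubChem_search_compounds_by_substructure` | | Returns CIDs matching scaffold |
| `ChEMBL_search_molecules` | | Name, ChEMBL ID, or InChIKey |
| | | Full format: "CHEMBL941" not "941" |
| `ChEMBL_search_similar_molecules` | | SMILES or ChEMBL ID |
| `ChEMBL_search_activities` | | Use |
| `ChEMBL_get_drug_mechanisms` | | For approved drugs only |
| `ChEMBL_search_targets` | | Add |
| `ChEMBL_get_target_activities` | | Returns all ligands for target |
| `SwissADME_calculate_adme` | | SMILES as string (not list) |
| `SwissADME_check_druglikeness` | | SMILES as string |
| `ADMETAI_predict_*` | | Must be a list: `["SMILES"]` |
| `SwissTargetPrediction_predict` | | May time out |
| `eMolecules_search` | | Returns search URL (no live data) |
| `eMolecules_search_smiles` | | Canonical SMILES |
| | | eMolecules internal ID |
| `Enamine_search_catalog` | | Returns URL when API unavailable |
| `Enamine_search_smiles` | | |
| | | Enamine-specific catalog ID |
| `BindingDB_get_ligands_by_uniprot` | | Frequently unavailable; use ChEMBL as fallback |
| | | SMILES-based target lookup |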
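Several notes in this table concern argument shape. The helpers below are a small sketch of how inputs could be normalized before a call. They are hypothetical utilities, not part of ToolUniverse.

```python
import re


def normalize_chembl_id(raw: str) -> str:
    """Return the full 'CHEMBL<digits>' form, accepting '941' or 'chembl941'."""
    match = re.fullmatch(r"(?:CHEMBL)?(\d+)", raw.strip(), flags=re.IGNORECASE)
    if match is None:
        raise ValueError(f"not a ChEMBL ID: {raw!r}")
    return f"CHEMBL{match.group(1)}"


def smiles_as_list(smiles) -> list:
    """ADMET-AI style argument: always a list of SMILES strings."""
    return [smiles] if isinstance(smiles, str) else list(smiles)


def smiles_as_string(smiles) -> str:
    """SwissADME style argument: exactly one SMILES string."""
    if isinstance(smiles, str):
        return smiles
    items = list(smiles)
    if len(items) != 1:
        raise ValueError("expected exactly one SMILES")
    return items[0]


print(normalize_chembl_id("941"))
print(smiles_as_list("CANONICAL_SMILES"))
print(smiles_as_string(["CANONICAL_SMILES"]))
```

Normalizing at the call boundary avoids the two most common shape errors: passing a bare number as a ChEMBL ID, and passing a string where a list is expected.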
Common Patterns
Pattern 1: Full Compound Profile
Input: Compound name (e.g., "imatinib")
Flow:
1. PubChem_get_CID_by_compound_name -> CID
2. ChEMBL_search_molecules -> ChEMBL ID
3. PubChem_get_compound_properties_by_CID -> canonical SMILES + physicochemical props
4. SwissADME_calculate_adme / ADMETAI_predict_* -> ADMET profile
5. ChEMBL_search_activities(molecule_chembl_id) -> binding data
6. ChEMBL_get_drug_mechanisms -> MOA (if approved drug)
Output: Complete compound profile with identity, ADMET, and activity data
Pattern 2: Analog Discovery
Input: Reference compound SMILES
Flow:
1. PubChem_search_compounds_by_similarity(smiles, threshold=85) -> similar CIDs
2. ChEMBL_search_similar_molecules(query=smiles) -> ChEMBL analogs
3. For each hit: PubChem_get_compound_properties_by_CID -> properties
4. SwissADME_check_druglikeness -> filter by drug-likeness
Output: Ranked list of analogs with activity data and drug-likeness scores
Pattern 3: Target-Based Compound Search
Input: Target name (e.g., "EGFR")
Flow:
1. ChEMBL_search_targets(query="EGFR", organism="Homo sapiens") -> target_chembl_id
2. ChEMBL_get_target_activities(target_chembl_id) -> all ligands with Ki/IC50
3. Filter by pchembl_value >= 7 (potent compounds)
4. For top hits: SwissADME_check_druglikeness -> assess drug-likeness
5. eMolecules_search(query=compound_name) -> check commercial availability
Output: Prioritized list of potent, drug-like, commercially available compounds
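Steps 2 and 3 of this flow can be sketched as follows. The record fields `molecule_chembl_id` and `pchembl_value` are assumed to be present in each activity row. Potency values may arrive as strings or be missing, which the code tolerates. The example rows use placeholder IDs.

```python
def best_potent_compounds(activities, min_pchembl=7.0):
    """Keep the highest pChEMBL per compound, then filter and rank."""
    best = {}
    for row in activities:
        raw = row.get("pchembl_value")
        if raw is None:
            continue  # no standardized potency for this measurement
        value = float(raw)
        mol_id = row["molecule_chembl_id"]
        if value > best.get(mol_id, float("-inf")):
            best[mol_id] = value
    potent = [(mol_id, v) for mol_id, v in best.items() if v >= min_pchembl]
    return sorted(potent, key=lambda item: item[1], reverse=True)


# Placeholder rows for illustration; not real ChEMBL records.
rows = [
    {"molecule_chembl_id": "CHEMBL_A", "pchembl_value": "8.1"},
    {"molecule_chembl_id": "CHEMBL_A", "pchembl_value": "7.4"},
    {"molecule_chembl_id": "CHEMBL_B", "pchembl_value": "6.2"},
    {"molecule_chembl_id": "CHEMBL_C", "pchembl_value": None},
    {"molecule_chembl_id": "CHEMBL_D", "pchembl_value": 7.0},
]

ranked = best_potent_compounds(rows)
print(ranked)
```

Taking the maximum per compound is one design choice among several; a median across assays would be more conservative when a compound has many measurements.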
Pattern 4: ADMET Risk Assessment
Input: Novel compound SMILES
Flow:
1. SwissADME_calculate_adme(operation="calculate_adme", smiles=smiles) -> full ADMET
2. ADMETAI_predict_toxicity(smiles=[smiles]) -> hERG, DILI, mutagenicity
3. ADMETAI_predict_CYP_interactions(smiles=[smiles]) -> drug-drug interaction risk
4. ADMETAI_predict_BBB_penetrance(smiles=[smiles]) -> CNS penetration
Output: ADMET risk profile with flagged liabilities
Fallback Chains
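| Primary | Fallback | When |
|---|---|---|
| `BindingDB_get_ligands_by_uniprot` | `ChEMBL_search_activities` | BindingDB API unavailable |
| `ADMETAI_predict_*` | `SwissADME_calculate_adme` | ml dependencies not installed |
| `Enamine_search_catalog` | Returns URL only | API returns HTTP 500 (common) |
| `SwissTargetPrediction_predict` | | Prediction times out |
| `PubChem_get_CID_by_compound_name` | | Name not in PubChem |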
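The fallback idea in this table can be sketched as a generic helper that tries each option in order. The callables are stand-ins. Which exceptions the real tools raise on a timeout or HTTP 500 is not specified here, so the sketch catches `Exception` broadly, and that is an assumption you may want to narrow.

```python
def call_with_fallback(steps):
    """Try (label, callable) pairs in order; return the first success.

    Returns (label, result). Raises RuntimeError if every step fails.
    """
    errors = []
    for label, func in steps:
        try:
            return label, func()
        except Exception as exc:  # broad on purpose; narrow for real tools
            errors.append(f"{label}: {exc}")
    raise RuntimeError("all options failed: " + "; ".join(errors))


def unavailable_primary():
    # Stand-in for a call that times out.
    raise TimeoutError("service unavailable")


def working_fallback():
    # Stand-in for the fallback call.
    return {"source": "fallback", "rows": []}


label, result = call_with_fallback([
    ("primary", unavailable_primary),
    ("fallback", working_fallback),
])
print(label, result)
```

Returning the label alongside the result lets the final report state which source actually supplied the data.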
Limitations
- BindingDB: REST API frequently times out; ChEMBL is the reliable alternative for binding data
- Enamine API: Returns HTTP 500 often; tool provides search URL as fallback
- eMolecules: No public API; tool generates search URLs only
- ADMET-AI: Requires `pip install tooluniverse[ml]`; not always available in base install
- SwissTargetPrediction: Web scraping-based; may time out for complex molecules
- SMILES format: ADMET-AI requires a list `["SMILES"]`; SwissADME requires a string `"SMILES"`
- ChEMBL IDs: Always use full format `"CHEMBL941"`, never just `"941"`