tooluniverse-cancer-classification
Cancer Classification via OncoTree
Standardize cancer type nomenclature using the OncoTree ontology. Resolves free-text tumor
descriptions to structured codes with UMLS/NCI cross-references, enabling downstream use in
OncoKB variant annotation and GDC cohort selection.
When to Use
Apply when a researcher asks about:
- "What is the OncoTree code for [tumor description]?"
- "Find all subtypes of [cancer type]"
- "What cancers originate in [tissue]?"
- "I need the tumor type code for OncoKB annotation"
- "What is the TCGA/COSMIC code for [cancer]?"
- "List all CNS/Brain cancer subtypes"
- "What NCI code corresponds to glioblastoma?"
Key Tools
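| Tool | Purpose | Key Params |
|---|---|---|
| `OncoTree_search` | Free-text search for cancer types | `query` |
| `OncoTree_get_type` | Full details for a known OncoTree code | `code` |
| `OncoTree_list_tissues` | List all 32 tissue categories | (no params) |
| `OncoKB_annotate_variant` | Variant annotation using OncoTree code | `gene`, `variant`, `tumor_type` |
| | Pan-cancer mutation frequency (TCGA) | |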
Workflow
Phase 1: Cancer Type Discovery
Start with free-text search to find matching OncoTree codes:
OncoTree_search(query="breast cancer")
-> Returns list: code, name, main_type, tissue, parent, level, external_references
Key response fields:
- `code`: OncoTree code (e.g., "BRCA", "IBC"); use this in OncoKB calls
- `level`: hierarchy depth (1=tissue, 2=main type, 3-5=subtypes)
- `parent`: parent node code for navigating the hierarchy
- `external_references.UMLS`: UMLS CUI list
- `external_references.NCI`: NCI thesaurus code list
Search tips:
- Broad terms ("lung cancer") return many results; narrow by tissue or level
- Use tissue-specific terms ("invasive breast carcinoma") for precise matching
- Acronyms work: query="GBM" finds glioblastoma, query="AML" finds leukemia types
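
The `parent` field is enough to walk the hierarchy on the client side, which helps with "find all subtypes" questions. The following is a minimal sketch, assuming search results arrive as a flat list of dicts carrying the fields listed above. The `SAMPLE` records are hand-written for illustration (not tool output), and `descendants` is a hypothetical helper name.

```python
from collections import defaultdict

# Hand-written illustrative records, NOT real tool output.
SAMPLE = [
    {"code": "LUNG", "parent": "TISSUE", "level": 1},
    {"code": "NSCLC", "parent": "LUNG", "level": 2},
    {"code": "LUAD", "parent": "NSCLC", "level": 3},
    {"code": "LUSC", "parent": "NSCLC", "level": 3},
]


def descendants(records, root):
    """Return the sorted codes of every node below `root`, following `parent` links."""
    children = defaultdict(list)
    for rec in records:
        children[rec["parent"]].append(rec["code"])

    found, stack = [], [root]
    while stack:
        node = stack.pop()
        for child in children[node]:
            found.append(child)
            stack.append(child)
    return sorted(found)


print(descendants(SAMPLE, "LUNG"))
```

Because the walk only relies on `code` and `parent`, it works on any subset of records, but it can only find subtypes that the search actually returned.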
Phase 2: Code Validation and Detail Retrieval
Once you have a candidate code, retrieve full details:
OncoTree_get_type(code="LUAD")
-> Returns: name, main_type, tissue, color, parent, level, history, external_references
Note: Not all codes are valid. "GBM" returns 404; the correct code is "GB" (Glioblastoma, IDH-Wildtype).
Always validate via `OncoTree_get_type` before using a code in downstream tools.
Phase 3: Tissue-Level Exploration
When the user wants all cancers in a tissue category:
OncoTree_list_tissues()
-> Returns 32 tissue names: "Breast", "CNS/Brain", "Lung", "Myeloid", ...
OncoTree_search(query="CNS/Brain")
-> All cancer types with tissue="CNS/Brain"
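
User input rarely matches the tissue strings exactly, so it can help to normalize it against the tissue list before searching. This is a minimal sketch under two assumptions: the tissue list is a plain list of strings, and each search record carries a `tissue` field as described in Phase 1. `match_tissue` and `filter_by_tissue` are hypothetical helper names, and `TISSUES` holds only the four names quoted above.

```python
# Only the four names quoted above; the full list has 32 entries.
TISSUES = ["Breast", "CNS/Brain", "Lung", "Myeloid"]


def match_tissue(user_text, tissues):
    """Return the canonical tissue string matching `user_text` (case-insensitive), else None."""
    wanted = user_text.strip().lower()
    for tissue in tissues:
        if tissue.lower() == wanted:
            return tissue
    return None


def filter_by_tissue(records, tissue):
    """Keep only the records whose `tissue` field equals the canonical name."""
    return [rec for rec in records if rec.get("tissue") == tissue]


print(match_tissue(" cns/brain ", TISSUES))
```

Returning `None` for an unknown tissue keeps the caller from silently searching with a string that matches nothing.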
Phase 4: Downstream Use in Variant Annotation
Pass the validated OncoTree code to OncoKB for cancer-type-specific therapeutic levels:
OncoKB_annotate_variant(gene="EGFR", variant="L858R", tumor_type="LUAD")
-> highestSensitiveLevel: "1" (FDA-approved therapy for this tumor+variant)
Without `tumor_type`, OncoKB returns pan-cancer levels, which may be less specific.
Tool Parameter Reference
| Tool | Required | Optional | Notes |
|---|---|---|---|
| `OncoTree_search` | `query` | — | Free text; returns list sorted by relevance |
| `OncoTree_get_type` | `code` | — | Case-sensitive; "BRCA" not "brca". Returns 404 for invalid codes |
| `OncoTree_list_tissues` | — | — | No params; returns list of 32 tissue strings |
| `OncoKB_annotate_variant` | `gene`, `variant` | `tumor_type` | |
| | | — | Pan-cancer TCGA only; no per-subtype breakdown |
Common OncoTree Codes (verified working)
| Code | Name | Tissue |
|---|---|---|
| `BRCA` | Invasive Breast Carcinoma | Breast |
| `LUAD` | Lung Adenocarcinoma | Lung |
| `LUSC` | Lung Squamous Cell Carcinoma | Lung |
| `MEL` | Melanoma | Skin |
| | Colorectal Cancer | Bowel |
| `PAAD` | Pancreatic Adenocarcinoma | Pancreas |
| `GBM` | (invalid; use `GB`) | CNS/Brain |
| `GB` | Glioblastoma, IDH-Wildtype | CNS/Brain |
| `AML` | Acute Myeloid Leukemia | Myeloid |
| `PRAD` | Prostate Adenocarcinoma | Prostate |
Common Patterns
```python
# Pattern: Resolve free-text to OncoTree code
results = OncoTree_search(query="pancreatic ductal adenocarcinoma")
# Take the top-ranked result (the list is sorted by relevance); a higher
# level number means a more specific subtype
code = results["data"][0]["code"]  # e.g., "PAAD"

# Pattern: Get all subtypes within a main type
results = OncoTree_search(query="Glioma")
subtypes = [r for r in results["data"] if r["main_type"] == "Glioma"]

# Pattern: Validate code before OncoKB call
detail = OncoTree_get_type(code="GB")
if detail["status"] == "success":
    OncoKB_annotate_variant(gene="IDH1", variant="R132H", tumor_type="GB")
```
Tumor Classification Reasoning (CRITICAL)
LOOK UP, DON'T GUESS: tumor classification determines treatment. Always verify codes and biomarker interpretation via tools rather than relying on memory.
Histological vs Molecular Classification
Tumors are classified on TWO axes, and both matter for treatment selection:
- Histological (what it looks like under a microscope): adenocarcinoma, squamous, small cell, etc. This determines where the tumor sits at OncoTree hierarchy level 3 and deeper.
- Molecular (what mutations/alterations drive it): EGFR-mutant, HER2-amplified, MSI-high, etc. This determines OncoKB therapeutic levels.
A tumor can be histologically identical to another but molecularly different, requiring different treatment. Example: two lung adenocarcinomas (both LUAD), where one is EGFR-mutant (targeted therapy) and the other is KRAS-mutant (different targeted therapy). Always check both axes.
Biomarker Interpretation Strategy
When interpreting cancer biomarkers, use OncoKB for actionability:
- HER2: Positive = IHC 3+ or FISH-amplified. Use `OncoKB_annotate_variant(gene="ERBB2", variant="Amplification", tumor_type="BRCA")` for therapeutic level
- ER/PR: Positive = hormone-receptor positive breast cancer. Changes treatment class (endocrine therapy)
- Ki67: Proliferation index. High (>20%) suggests aggressive biology; used in breast cancer grading (Luminal A vs B)
- TMB (Tumor Mutational Burden): High TMB (>10 mut/Mb) predicts immunotherapy response across tumor types. Use `OncoKB_annotate_variant(gene="Other Biomarkers", variant="TMB-H")`
- MSI (Microsatellite Instability): MSI-High is an FDA-approved biomarker for pembrolizumab pan-cancer. Use `OncoKB_annotate_variant(gene="Other Biomarkers", variant="MSI-H")`
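
The numeric cutoffs above can be turned into the keyword arguments for the corresponding OncoKB call. This is only a sketch: the thresholds are the ones stated in the list (TMB > 10 mut/Mb, Ki67 > 20%), while `tmb_query` and `ki67_is_high` are hypothetical helper names and the strict ">" comparison is an assumption about how borderline values should be handled.

```python
TMB_HIGH_CUTOFF = 10.0   # mut/Mb, from the list above
KI67_HIGH_CUTOFF = 20.0  # percent, from the list above


def tmb_query(tmb_per_mb):
    """Return OncoKB_annotate_variant kwargs for a TMB-high tumor, or None below the cutoff."""
    if tmb_per_mb > TMB_HIGH_CUTOFF:
        return {"gene": "Other Biomarkers", "variant": "TMB-H"}
    return None


def ki67_is_high(percent):
    """True when the Ki67 proliferation index exceeds the cutoff."""
    return percent > KI67_HIGH_CUTOFF


print(tmb_query(12.5))
```

The returned dict is meant to be unpacked into the tool call, for example `OncoKB_annotate_variant(**kwargs)`, once the tumor type has been validated.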
Staging vs Grading -- Different Concepts
- Stage (TNM): How far has it spread? T=tumor size, N=lymph nodes, M=metastasis. Stage I-IV. Determines prognosis and surgery eligibility.
- Grade: How abnormal do the cells look? Grade 1 (well-differentiated, slow) to Grade 3 (poorly-differentiated, aggressive). Determines aggressiveness.
- A Stage I, Grade 3 tumor (small but aggressive) has different implications than Stage III, Grade 1 (spread but slow-growing).
Actionability Assessment
After classifying the tumor, assess whether findings are clinically actionable:
- Level 1 (FDA-approved, specific tumor type): Immediate treatment implication. Example: EGFR L858R in LUAD
- Level 2 (Standard care): Strong evidence but context-dependent
- Level 3 (Compelling evidence): Clinical trial candidates
- Level 4 (Biological evidence): Research-stage only
- Always provide the OncoTree code to OncoKB -- without it, you get pan-cancer levels which may understate or overstate actionability for the specific tumor type
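
For reporting, the level string returned by OncoKB can be mapped to the tiers listed above. A minimal sketch, assuming the level arrives as a string such as "1" or "3B" and that sub-levels collapse onto their leading digit; that collapsing rule and the name `describe_level` are illustrative assumptions, not OncoKB behavior.

```python
# Tier descriptions copied from the list above.
LEVELS = {
    "1": "FDA-approved, specific tumor type",
    "2": "Standard care",
    "3": "Compelling evidence",
    "4": "Biological evidence",
}


def describe_level(level):
    """Return the tier description for a level string such as "1" or "3B"."""
    if not level:
        return "No level returned"
    # Assumption: "3A"/"3B" are reported under the tier of their first character.
    return LEVELS.get(level[0], "Unrecognized level")


print(describe_level("3B"))
```

Comparing `describe_level` output for a tumor-specific call against a pan-cancer call makes it obvious when the tumor type changed the tier.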
Reasoning Framework for Result Interpretation
Evidence Grading
| Grade | Criteria | Example |
|---|---|---|
| Confirmed | Exact OncoTree code validated via `OncoTree_get_type` | LUAD: validated, UMLS C0152013, NCI C3512 |
| Probable | OncoTree search returns match, but code not yet validated or missing cross-refs | Search for "cholangiocarcinoma" returns CHOL with partial external refs |
| Ambiguous | Multiple OncoTree codes match the description at different hierarchy levels | "Breast cancer" matches BRCA (invasive), BREAST (tissue), IBC (inflammatory) |
| Unresolved | No OncoTree match; tumor type too rare or novel for the ontology | Ultra-rare sarcoma subtype not in OncoTree |
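
The grading table can be read as a small decision procedure. The sketch below is one interpretation, not a definitive rule set: it assumes ambiguity is checked before validation status, and that "Confirmed" requires both a validated code and cross-references. `grade_evidence` is a hypothetical helper name.

```python
def grade_evidence(matches, validated=False, has_cross_refs=False):
    """Grade a lookup result using the four tiers from the table above.

    matches: list of candidate OncoTree codes returned by search.
    validated: whether the single candidate passed the detail lookup.
    has_cross_refs: whether UMLS/NCI references were present.
    """
    if not matches:
        return "Unresolved"
    if len(matches) > 1:
        return "Ambiguous"
    if validated and has_cross_refs:
        return "Confirmed"
    return "Probable"


print(grade_evidence(["BRCA", "BREAST", "IBC"]))
```

An "Ambiguous" result is a prompt to narrow the candidates (for example by hierarchy level) and grade again, rather than a final answer.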
Interpretation Guidance
- OncoTree code confidence: Always validate candidate codes with `OncoTree_get_type` before downstream use. Some common acronyms (e.g., "GBM") are NOT valid OncoTree codes (correct code is "GB"). A validated code with UMLS and NCI cross-references is highest confidence.
- UMLS/NCI cross-reference priority: For standardized reporting, NCI Thesaurus codes are preferred for cancer-specific contexts (used by caDSR, GDC). UMLS CUIs are broader (cross-disease) and useful for literature mining. When both are available, report both; when only one exists, NCI is preferred for oncology workflows.
- Tissue hierarchy interpretation: OncoTree levels represent specificity: Level 1 = tissue of origin (e.g., "Lung"), Level 2 = main cancer type (e.g., "Non-Small Cell Lung Cancer"), Level 3+ = histological subtypes (e.g., "Lung Adenocarcinoma"). For OncoKB variant annotation, use the most specific (deepest) level that accurately describes the tumor. For cohort-level analysis (e.g., TCGA), the Level 2-3 code is typically appropriate.
- OncoKB tumor type impact: Providing a tumor type code to OncoKB can change the therapeutic level (e.g., EGFR L858R is Level 1 in LUAD but Level 3B pan-cancer). Always use the validated OncoTree code for the patient's specific tumor type.
- Deprecated or renamed codes: OncoTree evolves across versions. The `history` field in the `OncoTree_get_type` response shows prior names. Always use the current code.
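
The cross-reference priority rule above (report both when available, NCI first) can be sketched as follows. It assumes `external_references` is a dict with optional "NCI" and "UMLS" keys, each holding a list of identifiers, as suggested by the response fields in Phase 1; `references_to_report` is a hypothetical helper name.

```python
def references_to_report(external_references):
    """Return (system, ids) pairs in reporting order: NCI first, then UMLS."""
    report = []
    for system in ("NCI", "UMLS"):
        ids = external_references.get(system) or []
        if ids:
            report.append((system, list(ids)))
    return report


# LUAD identifiers as quoted in the evidence grading table.
print(references_to_report({"UMLS": ["C0152013"], "NCI": ["C3512"]}))
```

An empty list signals that the code has no cross-references, which in the grading scheme keeps it at "Probable" rather than "Confirmed".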
Synthesis Questions
- Does the chosen OncoTree code represent the most specific histological subtype, or could a more precise code provide better therapeutic annotation in OncoKB?
- When the free-text tumor description maps to multiple OncoTree codes, which hierarchy level best balances specificity and coverage for the analysis goal (variant annotation vs cohort selection)?
- Are the UMLS/NCI cross-references consistent with external classifications (WHO, ICD-O), or are there discrepancies that need resolution?
Fallback Chains
| Primary | Fallback | When |
|---|---|---|
| | | 404 for common aliases |
| | | Very rare/novel tumor types |
| OncoTree code for OncoKB | Omit `tumor_type` | Code not recognized by OncoKB |
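
One plausible reading of the alias fallback is: try the detail lookup first, and if it fails, run a free-text search on the same string. The sketch below encodes that reading and should be treated as an assumption, since the tool cells for that row are missing here. The tools are passed in as callables, `fake_get_type` and `fake_search` are stand-ins written for this example, and the `status`/`data` response shape follows the Common Patterns section.

```python
def resolve_code(candidate, get_type, search):
    """Return a usable OncoTree code for `candidate`, or None if nothing matches."""
    detail = get_type(code=candidate)
    if detail.get("status") == "success":
        return candidate
    # Assumed fallback: free-text search on the alias, taking the top hit.
    hits = search(query=candidate).get("data", [])
    return hits[0]["code"] if hits else None


# Stand-in tools for demonstration only; real calls go through the tool layer.
def fake_get_type(code):
    return {"status": "success"} if code in {"GB", "LUAD"} else {"status": "error"}


def fake_search(query):
    aliases = {"GBM": [{"code": "GB"}]}
    return {"data": aliases.get(query, [])}


print(resolve_code("GBM", fake_get_type, fake_search))
```

A code recovered through search should still go through the detail lookup before it is passed to OncoKB, and a `None` result corresponds to the "Unresolved" grade.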