boltzgen
BoltzGen All-Atom Design
Prerequisites
| Requirement | Minimum | Recommended |
|---|---|---|
| Python | 3.10+ | 3.11 |
| CUDA | 12.0+ | 12.1+ |
| GPU VRAM | 24GB | 48GB (L40S) |
| RAM | 32GB | 64GB |
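The minimums in the table can be checked before launching a run. The sketch below is illustrative only: `check_prerequisites` is a hypothetical helper, not part of BoltzGen, and the VRAM and RAM figures are passed in by hand rather than detected.

```python
import sys

# Minimums copied from the prerequisites table.
MIN_PYTHON = (3, 10)
MIN_VRAM_GB = 24
MIN_RAM_GB = 32


def check_prerequisites(python_version, vram_gb, ram_gb):
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    if tuple(python_version[:2]) < MIN_PYTHON:
        problems.append(
            f"Python {python_version[0]}.{python_version[1]} is older than 3.10"
        )
    if vram_gb < MIN_VRAM_GB:
        problems.append(f"GPU VRAM {vram_gb}GB is below {MIN_VRAM_GB}GB")
    if ram_gb < MIN_RAM_GB:
        problems.append(f"RAM {ram_gb}GB is below {MIN_RAM_GB}GB")
    return problems


if __name__ == "__main__":
    # Example: the current interpreter with an L40S (48GB) and 64GB of RAM.
    print(check_prerequisites(sys.version_info, vram_gb=48, ram_gb=64))
```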
How to run
First time? See Installation Guide to set up Modal and biomodals.
Option 1: Modal (recommended)
```bash
# Clone biomodals
git clone https://github.com/hgbrian/biomodals && cd biomodals

# Run BoltzGen (requires YAML config file)
modal run modal_boltzgen.py \
  --input-yaml binder_config.yaml \
  --protocol protein-anything \
  --num-designs 50

# With custom GPU
GPU=L40S modal run modal_boltzgen.py \
  --input-yaml binder_config.yaml \
  --protocol protein-anything \
  --num-designs 100
```

**GPU**: L40S (48GB) recommended | **Timeout**: 120min default
**Available protocols**: `protein-anything`, `peptide-anything`, `protein-small_molecule`, `nanobody-anything`, `antibody-anything`
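When the Modal command is launched from a script, it can help to validate the protocol name before spending GPU time. The following is a minimal sketch that only uses the script name, flags, and protocol names listed above; `build_modal_command` is a hypothetical helper, not part of biomodals.

```python
import shlex

# Protocol names as listed above.
PROTOCOLS = {
    "protein-anything",
    "peptide-anything",
    "protein-small_molecule",
    "nanobody-anything",
    "antibody-anything",
}


def build_modal_command(input_yaml, protocol, num_designs):
    """Build the argv list for the Modal invocation shown above."""
    if protocol not in PROTOCOLS:
        raise ValueError(f"unknown protocol: {protocol!r}")
    if num_designs < 1:
        raise ValueError("num_designs must be at least 1")
    return [
        "modal", "run", "modal_boltzgen.py",
        "--input-yaml", str(input_yaml),
        "--protocol", protocol,
        "--num-designs", str(num_designs),
    ]


cmd = build_modal_command("binder_config.yaml", "protein-anything", 50)
print(shlex.join(cmd))
```

The list can then be passed to `subprocess.run(cmd, check=True)` from inside the biomodals checkout; the `GPU` environment variable shown above can be supplied through the `env` argument.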
Option 2: Local installation
```bash
git clone https://github.com/HannesStark/boltzgen.git
cd boltzgen
pip install -e .
python sample.py config=config.yaml
```
Option 3: Python API
```python
from boltzgen import BoltzGen

model = BoltzGen.load_pretrained()
designs = model.sample(
    target_pdb="target.pdb",
    num_samples=50,
    binder_length=80,
)
```

GPU: L40S (48GB) | Time: ~30-60s per design
Key parameters (CLI)
| Parameter | Default | Description |
|---|---|---|
| `--input-yaml` | required | Path to YAML design specification |
| `--protocol` | | Design protocol |
| `--num-designs` | 10 | Number of designs to generate |
| | all | Pipeline steps to run |
YAML configuration
BoltzGen uses an entity-based YAML format where you specify designed proteins and target structures as entities.
Important notes:
- Residue indices use `label_seq_id` (1-indexed), not author residue numbers
- File paths are relative to the YAML file location
- Target files should be in CIF format (PDB also works but CIF preferred)
- Run `boltzgen check config.yaml` to verify your specification before running
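Because binding-site residues are given as `label_seq_id` rather than author numbers, positions taken from a paper or a PDB viewer usually need translating first. The sketch below builds that lookup from the per-atom columns of an mmCIF `_atom_site` loop. `auth_to_label_map` is a hypothetical helper, the columns are toy data, and keying chains by `label_asym_id` is an assumption about which chain identifier your config refers to.

```python
def auth_to_label_map(chain_ids, auth_seq_ids, label_seq_ids, chain):
    """Map author residue numbers to label_seq_id for one chain.

    The three lists are parallel per-atom columns of strings, the way an
    mmCIF _atom_site loop stores them.
    """
    mapping = {}
    for chain_id, auth, label in zip(chain_ids, auth_seq_ids, label_seq_ids):
        # Non-polymer atoms (waters, ligands) carry "." as label_seq_id.
        if chain_id != chain or label in (".", "?"):
            continue
        # Keep the first label_seq_id seen for each author number.
        mapping.setdefault(auth, int(label))
    return mapping


# Toy columns: chain A starts at author residue 43 but label_seq_id 1.
chains = ["A", "A", "A", "A", "B"]
auth = ["43", "43", "44", "45", "1"]
label = ["1", "1", "2", "3", "1"]

print(auth_to_label_map(chains, auth, label, "A"))
```

With Biopython installed, real columns can be read with `MMCIF2Dict("target.cif")` from `Bio.PDB.MMCIF2Dict`, using the keys `_atom_site.label_asym_id`, `_atom_site.auth_seq_id`, and `_atom_site.label_seq_id`.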
Basic Binder Config
```yaml
entities:
  # Designed protein (variable length 80-140 residues)
  - protein:
      id: B
      sequence: 80..140

  # Target from structure file
  - file:
      path: target.cif
      include:
        - chain:
            id: A
      # Specify binding site residues (optional but recommended)
      binding_types:
        - chain:
            id: A
            binding: 45,67,89
```
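For campaigns that vary only the length range or the binding-site residues, the config can be generated instead of edited by hand. The sketch below renders the same fields as the basic binder config using plain string formatting. `render_binder_yaml` is a hypothetical helper, and the indentation it emits is an assumption about the intended nesting, so check the result with `boltzgen check` before use.

```python
def render_binder_yaml(binder_id, min_len, max_len, target_path,
                       target_chain, binding_residues):
    """Render the basic binder layout as YAML text."""
    if min_len > max_len:
        raise ValueError("min_len must not exceed max_len")
    residues = ",".join(str(r) for r in binding_residues)
    lines = [
        "entities:",
        "  - protein:",
        f"      id: {binder_id}",
        f"      sequence: {min_len}..{max_len}",
        "  - file:",
        f"      path: {target_path}",
        "      include:",
        "        - chain:",
        f"            id: {target_chain}",
        "      binding_types:",
        "        - chain:",
        f"            id: {target_chain}",
        f"            binding: {residues}",
    ]
    return "\n".join(lines) + "\n"


print(render_binder_yaml("B", 80, 140, "target.cif", "A", [45, 67, 89]))
```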
Binder with Specific Binding Site
```yaml
entities:
  - protein:
      id: G
      sequence: 60..100

  - file:
      path: 5cqg.cif
      include:
        - chain:
            id: A
      binding_types:
        - chain:
            id: A
            binding: 343,344,251
      structure_groups: "all"
```
Peptide Design (Cyclic)
```yaml
entities:
  - protein:
      id: S
      sequence: 10..14C6C3  # With cysteines for disulfide

  - file:
      path: target.cif
      include:
        - chain:
            id: A

constraints:
  - bond:
      atom1: [S, 11, SG]
      atom2: [S, 18, SG]  # Disulfide bond
```
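The residue numbers in the bond constraint have to line up with where the cysteines land in the sequence spec. The sketch below works that out under an assumed reading of the spec: `a..b` is a designed segment whose length is chosen in that range, a bare number is a designed segment of that length, and letters are fixed residues. `fixed_residue_positions` is a hypothetical helper, and it applies one chosen length to every range in the spec.

```python
import re


def fixed_residue_positions(spec, designed_len, residue="C"):
    """Return 1-indexed positions of `residue` in a sequence spec."""
    tokens = re.findall(r"\d+\.\.\d+|\d+|[A-Za-z]", spec)
    if "".join(tokens) != spec:
        raise ValueError(f"cannot parse spec: {spec!r}")
    positions, pos = [], 0
    for tok in tokens:
        if ".." in tok:
            # Variable-length designed segment: use the chosen length.
            low, high = map(int, tok.split(".."))
            if not low <= designed_len <= high:
                raise ValueError("designed_len outside the allowed range")
            pos += designed_len
        elif tok.isdigit():
            # Fixed-length designed segment.
            pos += int(tok)
        else:
            # A single fixed residue.
            pos += 1
            if tok.upper() == residue:
                positions.append(pos)
    return positions


print(fixed_residue_positions("10..14C6C3", 10))  # [11, 18]
print(fixed_residue_positions("10..14C6C3", 14))  # [15, 22]
```

Under this reading, the indices 11 and 18 in the constraint match the case where the leading segment has 10 residues; longer leading segments shift both cysteines.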
Design protocols
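| Protocol | Use Case |
|---|---|
| `protein-anything` | Design proteins to bind proteins or peptides |
| `peptide-anything` | Design cyclic peptides to bind proteins |
| `protein-small_molecule` | Design proteins to bind small molecules |
| `nanobody-anything` | Design nanobody CDRs |
| `antibody-anything` | Design antibody CDRs |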
Output format
```
output/
├── sample_0/
│   ├── design.cif       # All-atom structure (CIF format)
│   ├── metrics.json     # Confidence scores
│   └── sequence.fasta   # Sequence
├── sample_1/
│   └── ...
└── summary.csv
```

Note: BoltzGen outputs CIF format. Convert to PDB if needed:

```python
from Bio.PDB import MMCIFParser, PDBIO

# Parse the CIF design and write it back out in PDB format
parser = MMCIFParser()
structure = parser.get_structure("design", "design.cif")
io = PDBIO()
io.set_structure(structure)
io.save("design.pdb")
```
Sample output
Successful run
```
$ modal run modal_boltzgen.py --input-yaml binder.yaml --protocol protein-anything --num-designs 10
Running: boltzgen run binder.yaml --output /tmp/out --protocol protein-anything --num_designs 10
[INFO] Loading BoltzGen model...
[INFO] Generating designs...
[INFO] Running inverse folding...
[INFO] Running structure prediction...
[INFO] Filtering and ranking...
[INFO] Pipeline complete
Results saved to: ./out/boltzgen/2501161234/
```

Output directory structure:

```
out/boltzgen/2501161234/
├── intermediate_designs/                 # Raw diffusion outputs
│   ├── design_0.cif
│   └── design_0.npz
├── intermediate_designs_inverse_folded/
│   ├── refold_cif/                       # Refolded complexes
│   └── aggregate_metrics_analyze.csv
└── final_ranked_designs/
    ├── final_10_designs/                 # Top designs
    └── results_overview.pdf              # Summary plots
```

What good output looks like:
- Refolding RMSD < 2.0 Å (design folds as predicted)
- ipTM > 0.5 (confident interface)
- All designs complete pipeline without errors
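The two numeric thresholds above can be applied to a metrics table in a few lines. This is a sketch under stated assumptions: the column names `id`, `refold_rmsd`, and `iptm` are hypothetical (check the header of your own CSV), and the rows are made-up values used only to exercise the filter.

```python
import csv
import io

# Thresholds from the list above.
MAX_RMSD = 2.0  # Å
MIN_IPTM = 0.5


def passing_designs(rows, rmsd_key="refold_rmsd", iptm_key="iptm"):
    """Return the ids of designs that meet both thresholds."""
    keep = []
    for row in rows:
        if float(row[rmsd_key]) < MAX_RMSD and float(row[iptm_key]) > MIN_IPTM:
            keep.append(row["id"])
    return keep


# Stand-in for a metrics CSV; column names and values are illustrative.
sample = io.StringIO(
    "id,refold_rmsd,iptm\n"
    "design_0,1.2,0.71\n"
    "design_1,3.4,0.80\n"
    "design_2,1.8,0.42\n"
)
print(passing_designs(csv.DictReader(sample)))  # ['design_0']
```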
Decision tree
```
Should I use BoltzGen?
│
├─ What type of design?
│  ├─ All-atom precision needed → BoltzGen ✓
│  ├─ Ligand binding pocket → BoltzGen ✓
│  └─ Standard miniprotein → RFdiffusion (faster)
│
├─ What matters most?
│  ├─ Side-chain packing → BoltzGen ✓
│  ├─ Speed / diversity → RFdiffusion
│  ├─ Highest success rate → BindCraft
│  └─ AF2 optimization → ColabDesign
│
└─ Compute resources?
   ├─ Have L40S/A100 (48GB+) → BoltzGen ✓
   └─ Only A10G (24GB) → Consider RFdiffusion
```
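The tree can also be written as a small lookup, which is handy when triaging many targets. The sketch below is one possible encoding, not an official rule set: `recommend_tool` is hypothetical, and combining the branches (falling back to RFdiffusion when less than 48GB of VRAM is available) is an assumption, since the tree lists the questions independently and does not cover cards between 24GB and 48GB.

```python
# Leaves of the first two branches of the tree, keyed by lower-case label.
BY_PRIORITY = {
    "all-atom precision": "BoltzGen",
    "ligand binding pocket": "BoltzGen",
    "side-chain packing": "BoltzGen",
    "standard miniprotein": "RFdiffusion",
    "speed": "RFdiffusion",
    "diversity": "RFdiffusion",
    "highest success rate": "BindCraft",
    "af2 optimization": "ColabDesign",
}


def recommend_tool(priority, vram_gb):
    """Return a tool name for a design priority and available GPU memory."""
    choice = BY_PRIORITY[priority.lower()]
    # Assumption: below 48GB, follow the "Consider RFdiffusion" branch.
    if choice == "BoltzGen" and vram_gb < 48:
        return "RFdiffusion"
    return choice


print(recommend_tool("Side-chain packing", 48))  # BoltzGen
print(recommend_tool("Side-chain packing", 24))  # RFdiffusion
```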
Typical performance
| Campaign Size | Time (L40S) | Cost (Modal) | Notes |
|---|---|---|---|
| 50 designs | 30-45 min | ~$8 | Quick exploration |
| 100 designs | 1-1.5h | ~$15 | Standard campaign |
| 500 designs | 5-8h | ~$70 | Large campaign |
Per-design: ~30-60s for typical binder.
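The times in the table are roughly what the per-design figure predicts if designs are generated one after another. A minimal sketch of that arithmetic, assuming sequential generation and ignoring startup and model-loading time (`estimate_minutes` is a hypothetical helper):

```python
def estimate_minutes(num_designs, sec_per_design=(30, 60)):
    """Return (low, high) wall-clock minutes for a campaign."""
    low, high = sec_per_design
    return (num_designs * low / 60, num_designs * high / 60)


for n in (50, 100, 500):
    low, high = estimate_minutes(n)
    print(f"{n} designs: {low:.0f}-{high:.0f} min")
```

For 50 designs this gives 25-50 minutes, in line with the 30-45 minutes quoted above.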
Verify
```bash
find output -name "*.cif" | wc -l  # Should match num_samples
```
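The same check can be done from Python, for example inside a post-processing script. This is a small sketch using only the standard library; `count_cif_files` is a hypothetical helper and `output` is the directory name used in the command above.

```python
from pathlib import Path


def count_cif_files(root):
    """Count *.cif files under root, recursively (like find | wc -l)."""
    return sum(1 for p in Path(root).rglob("*.cif") if p.is_file())


if __name__ == "__main__":
    out_dir = Path("output")
    if out_dir.is_dir():
        print(count_cif_files(out_dir))
```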
Troubleshooting
Verify config first: Always run `boltzgen check config.yaml` before running the full pipeline
Slow generation: Use fewer designs for initial testing, then scale up
OOM errors: Use A100-80GB or reduce `--num-designs`
Wrong binding site: Residue indices use `label_seq_id` (1-indexed), check in Molstar viewer
Error interpretation
| Error | Cause | Fix |
|---|---|---|
| | Large design or long protein | Use A100-80GB or reduce designs |
| | Target file not found | File paths are relative to YAML location |
| | Chain not in target | Verify chain IDs with Molstar or PyMOL |
| | Modal CLI not installed | Install the Modal CLI (see Installation Guide) |
Next: Validate with `boltz` or `chai` → `protein-qc` for filtering.