Kimodo Motion Diffusion
Skill by ara.so — Daily 2026 Skills collection.
Kimodo is a kinematic motion diffusion model trained on 700 hours of commercially-friendly optical mocap data. It generates high-quality 3D human and humanoid robot motions controlled through text prompts and kinematic constraints (full-body keyframes, end-effector positions/rotations, 2D paths, 2D waypoints).
Clone the repository, then install with pip (this creates the `kimodo_gen` and `kimodo_demo` CLI commands). Alternatively, use Docker (recommended for Windows or clean environments):

```bash
docker build -t kimodo .
docker run --gpus all -p 7860:7860 kimodo
```
**Requirements:**
- ~17GB VRAM (GPU: RTX 3090/4090, A100 recommended)
- Linux (Windows supported via Docker)
- Models download automatically on first use from Hugging Face
| Model (Skeleton) | Dataset | Use Case |
|---|---|---|
| SOMA (human) | Bones Rigplay 1 (700h) | General human motion |
| Unitree G1 (robot) | Bones Rigplay 1 (700h) | Humanoid robot motion |
| SOMA | BONES-SEED (288h) | Benchmarking |
| Unitree G1 | BONES-SEED (288h) | Benchmarking |
| SMPL-X | Bones Rigplay 1 (700h) | Retargeting/AMASS export |
Basic Text-to-Motion
```bash
# Generate a single motion from a text prompt (uses the SOMA model by default)
kimodo_gen "a person walks forward at a moderate pace"

# Specify duration and number of samples
kimodo_gen "a person jogs in a circle" --duration 5.0 --num_samples 3

# Use the G1 robot model
kimodo_gen "a robot walks forward" --model Kimodo-G1-RP-v1 --duration 4.0

# Use the SMPL-X model (for AMASS-compatible export)
kimodo_gen "a person waves their right hand" --model Kimodo-SMPLX-RP-v1

# Set a seed for reproducibility
kimodo_gen "a person sits down slowly" --seed 42

# Control diffusion steps (more steps = slower but higher quality)
kimodo_gen "a person does a jumping jack" --diffusion_steps 50

# Default: saves an NPZ file compatible with the web demo
kimodo_gen "a person walks" --output ./outputs/walk.npz

# G1 robot: save a MuJoCo qpos CSV
kimodo_gen "robot walks forward" --model Kimodo-G1-RP-v1 --output ./outputs/walk.csv

# SMPL-X: also saves an AMASS-compatible NPZ (stem_amass.npz)
kimodo_gen "a person waves" --model Kimodo-SMPLX-RP-v1 --output ./outputs/wave.npz
# Also writes: ./outputs/wave_amass.npz

# Disable post-processing (foot skate correction, constraint cleanup)
kimodo_gen "a person walks" --no-postprocess
```
Multi-Prompt Sequences
```bash
# Sequence of text prompts for transitions
kimodo_gen "a person stands still" "a person walks forward" "a person stops and turns"

# With overall timing and sample-count control
kimodo_gen "a person jogs" "a person slows to a walk" "a person stops" \
    --duration 8.0 --num_samples 2
```
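When several prompts share a single `--duration`, the total is divided across segments. As an illustrative sketch (even splitting is an assumption here — Kimodo's actual segment allocation may differ), at the model's 30 fps:

```python
def segment_durations(num_prompts, total_duration, fps=30):
    """Evenly split a total duration into per-prompt frame counts at a given fps."""
    frames = round(total_duration * fps)
    base = frames // num_prompts
    # Distribute any remainder one frame at a time to the earliest segments
    return [base + (1 if i < frames % num_prompts else 0) for i in range(num_prompts)]

print(segment_durations(3, 8.0))  # [80, 80, 80]
```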
Constraint-Based Generation
```bash
# Load constraints saved from the interactive demo
kimodo_gen "a person walks to a table and picks something up" \
    --constraints ./my_constraints.json

# Combine text and constraints
kimodo_gen "a person performs a complex motion" \
    --constraints ./keyframe_constraints.json \
    --model Kimodo-SOMA-RP-v1 \
    --num_samples 5
```
```bash
# Access the demo remotely (server setup)
kimodo_demo --server-name 0.0.0.0 --server-port 7860
```
The demo provides:
- Timeline editor for text prompts and constraints
- Full-body keyframe constraints
- 2D root path/waypoint editor
- End-effector position/rotation control
- Real-time 3D visualization with skeleton and skinned mesh
- Export of constraints as JSON and motions as NPZ
Low-Level Python API
Basic Model Inference
```python
from kimodo.model import Kimodo

# Initialize the model (downloads automatically)
model = Kimodo(model_name="Kimodo-SOMA-RP-v1")

# Simple text-to-motion generation
result = model(
    prompts=["a person walks forward at a moderate pace"],
    duration=4.0,
    num_samples=1,
    seed=42,
)

# Result contains posed joints, rotation matrices, and foot contacts
print(result["posed_joints"].shape)     # [T, J, 3]
print(result["global_rot_mats"].shape)  # [T, J, 3, 3]
print(result["local_rot_mats"].shape)   # [T, J, 3, 3]
print(result["foot_contacts"].shape)    # [T, 4]
print(result["root_positions"].shape)   # [T, 3]
```
Advanced API with Guidance and Constraints
```python
from kimodo.model import Kimodo

model = Kimodo(model_name="Kimodo-SOMA-RP-v1")

# Multi-prompt generation with classifier-free guidance control
result = model(
    prompts=["a person stands", "a person walks forward", "a person sits"],
    duration=9.0,
    num_samples=3,
    diffusion_steps=50,
    guidance_scale=7.5,  # classifier-free guidance weight
    seed=0,
)

# Access per-sample results
for i in range(3):
    joints = result["posed_joints"][i]  # [T, J, 3]
    print(f"Sample {i}: {joints.shape}")
```
Working with Constraints Programmatically
```python
from kimodo.model import Kimodo
from kimodo.constraints import ConstraintSet, FullBodyKeyframe, EndEffectorConstraint
import numpy as np

model = Kimodo(model_name="Kimodo-SOMA-RP-v1")

# Create a constraint set
constraints = ConstraintSet()

# Add a full-body keyframe at frame 30 (1 second at 30 fps)
# keyframe_pose: [J, 3] joint positions
keyframe_pose = np.zeros((model.num_joints, 3))  # replace with an actual pose
constraints.add_full_body_keyframe(frame=30, joint_positions=keyframe_pose)

# Add end-effector constraints for the right hand
constraints.add_end_effector(
    joint_name="right_hand",
    frame_start=45,
    frame_end=60,
    position=np.array([0.5, 1.2, 0.3]),  # [x, y, z] in meters
    rotation=None,  # optional rotation matrix [3, 3]
)

# Add 2D waypoints for the root path
constraints.add_root_waypoints(
    waypoints=np.array([[0, 0], [1, 0], [1, 1], [0, 1]]),  # [N, 2] in meters
)

# Generate with constraints
result = model(
    prompts=["a person walks in a square"],
    duration=6.0,
    constraints=constraints,
    num_samples=2,
)
```
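The `[N, 2]` waypoint format above is sparse. For previewing or debugging a root path before generation, the waypoints can be densified by linear interpolation. A minimal sketch — the `densify_waypoints` helper is illustrative, not part of the Kimodo API:

```python
import numpy as np

def densify_waypoints(waypoints, points_per_segment=10):
    """Linearly interpolate between consecutive 2D waypoints ([N, 2] -> [M, 2])."""
    waypoints = np.asarray(waypoints, dtype=float)
    segments = []
    for a, b in zip(waypoints[:-1], waypoints[1:]):
        # Sample each segment without its endpoint to avoid duplicating corners
        t = np.linspace(0.0, 1.0, points_per_segment, endpoint=False)[:, None]
        segments.append(a + t * (b - a))
    segments.append(waypoints[-1:])  # keep the final waypoint
    return np.concatenate(segments, axis=0)

square = np.array([[0, 0], [1, 0], [1, 1], [0, 1]])
dense = densify_waypoints(square)
print(dense.shape)  # (31, 2)
```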
Loading and Using Saved Constraints
```python
from kimodo.model import Kimodo
from kimodo.constraints import ConstraintSet
import json

model = Kimodo(model_name="Kimodo-SOMA-RP-v1")

# Load constraints saved from the web demo
with open("constraints.json") as f:
    constraint_data = json.load(f)
constraints = ConstraintSet.from_dict(constraint_data)

result = model(
    prompts=["a person performs a choreographed sequence"],
    duration=8.0,
    constraints=constraints,
)
```
Saving and Loading Generated Motions
```python
import numpy as np

result = model(prompts=["a person walks"], duration=4.0)
np.savez("walk_motion.npz", **result)

# Load and inspect the saved motion
data = np.load("walk_motion.npz")
posed_joints = data["posed_joints"]                # [T, J, 3] global joint positions
global_rot_mats = data["global_rot_mats"]          # [T, J, 3, 3]
local_rot_mats = data["local_rot_mats"]            # [T, J, 3, 3]
foot_contacts = data["foot_contacts"]              # [T, 4] [L-heel, L-toe, R-heel, R-toe]
root_positions = data["root_positions"]            # [T, 3] actual root joint trajectory
smooth_root_pos = data["smooth_root_pos"]          # [T, 3] smoothed root from the model
global_root_heading = data["global_root_heading"]  # [T, 2] heading direction
```
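The `foot_contacts` array is handy for quick sanity checks on a generated clip. A small illustrative helper (not part of Kimodo) that reports what fraction of frames each contact marker is down:

```python
import numpy as np

def contact_duty_cycle(foot_contacts, threshold=0.5):
    """Fraction of frames in contact, per marker, from a [T, 4] contact array."""
    labels = ["L-heel", "L-toe", "R-heel", "R-toe"]
    frac = (np.asarray(foot_contacts) > threshold).mean(axis=0)
    return dict(zip(labels, frac.tolist()))

# Synthetic example: left heel down for the first half of a 120-frame clip
contacts = np.zeros((120, 4))
contacts[:60, 0] = 1.0
print(contact_duty_cycle(contacts))  # {'L-heel': 0.5, 'L-toe': 0.0, 'R-heel': 0.0, 'R-toe': 0.0}
```

A walking clip where one foot is never in contact, or both feet are always in contact, usually indicates a bad sample worth regenerating.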
Robotics Integration
MuJoCo Visualization (G1 Robot)
```bash
# Generate G1 motion and save it as a MuJoCo qpos CSV
kimodo_gen "a robot walks forward and waves" \
    --model Kimodo-G1-RP-v1 \
    --output ./robot_walk.csv \
    --duration 5.0

# Visualize in MuJoCo (edit the script to point to your CSV)
python -m kimodo.scripts.mujoco_load
```
`mujoco_load.py` customization pattern:

```python
import mujoco
import mujoco.viewer
import numpy as np

# Edit these paths in the script
CSV_PATH = "./robot_walk.csv"
MJCF_PATH = "./assets/g1/g1.xml"  # path to the G1 MuJoCo model

# Load qpos data
qpos_data = np.loadtxt(CSV_PATH, delimiter=",")

# Standard MuJoCo playback loop
model = mujoco.MjModel.from_xml_path(MJCF_PATH)
data = mujoco.MjData(model)
with mujoco.viewer.launch_passive(model, data) as viewer:
    for frame_qpos in qpos_data:
        data.qpos[:] = frame_qpos
        mujoco.mj_forward(model, data)
        viewer.sync()
```
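The loop above replays frames as fast as MuJoCo can render them. For wall-clock playback at the model's 30 fps, a small stdlib pacing generator (illustrative, not part of Kimodo) can drive the loop:

```python
import time

def paced_frames(n_frames, fps=30.0):
    """Yield frame indices at a fixed wall-clock rate, sleeping as needed."""
    start = time.perf_counter()
    for i in range(n_frames):
        target = start + i / fps  # wall-clock time this frame is due
        delay = target - time.perf_counter()
        if delay > 0:
            time.sleep(delay)
        yield i

# Usage inside the playback loop:
# for i in paced_frames(len(qpos_data), fps=30.0):
#     data.qpos[:] = qpos_data[i]
#     mujoco.mj_forward(model, data)
#     viewer.sync()
```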
ProtoMotions Integration
```bash
# Generate motion with Kimodo
kimodo_gen "a person runs and jumps" --model Kimodo-SOMA-RP-v1 \
    --output ./run_jump.npz --duration 5.0
```

Then follow the ProtoMotions docs to import the motion.
GMR Retargeting (SMPL-X to Other Robots)
```bash
# Generate SMPL-X motion (saves stem_amass.npz automatically)
kimodo_gen "a person performs a cartwheel" \
    --model Kimodo-SMPLX-RP-v1 \
    --output ./cartwheel.npz

# Use cartwheel_amass.npz with GMR for retargeting
```
NPZ Output Format Reference
| Key | Shape | Description |
|---|---|---|
| `posed_joints` | [T, J, 3] | Global joint positions in meters |
| `global_rot_mats` | [T, J, 3, 3] | Global joint rotation matrices |
| `local_rot_mats` | [T, J, 3, 3] | Parent-relative joint rotation matrices |
| `foot_contacts` | [T, 4] | Contact labels: [L-heel, L-toe, R-heel, R-toe] |
| `smooth_root_pos` | [T, 3] | Smoothed root trajectory from model |
| `root_positions` | [T, 3] | Actual root joint (pelvis) trajectory |
| `global_root_heading` | [T, 2] | Heading direction (2D unit vector) |

`T` = number of frames (30fps), `J` = number of joints (skeleton-dependent)
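The layout above can be checked programmatically. A sketch that validates a motion dict against the documented shapes; the `validate_motion` helper and the 24-joint count in the demo are illustrative assumptions (`J` is skeleton-dependent):

```python
import numpy as np

def validate_motion(data):
    """Check that a Kimodo motion dict matches the documented NPZ layout."""
    T, J = data["posed_joints"].shape[:2]
    assert data["posed_joints"].shape == (T, J, 3)
    assert data["global_rot_mats"].shape == (T, J, 3, 3)
    assert data["local_rot_mats"].shape == (T, J, 3, 3)
    assert data["foot_contacts"].shape == (T, 4)
    assert data["root_positions"].shape == (T, 3)
    assert data["smooth_root_pos"].shape == (T, 3)
    assert data["global_root_heading"].shape == (T, 2)
    # Heading should be (approximately) a 2D unit vector per frame
    norms = np.linalg.norm(data["global_root_heading"], axis=-1)
    assert np.allclose(norms, 1.0, atol=1e-3)
    return T, J

# Synthetic example: 120 frames (4 s at 30 fps), 24 joints
T, J = 120, 24
heading = np.zeros((T, 2))
heading[:, 0] = 1.0
dummy = {
    "posed_joints": np.zeros((T, J, 3)),
    "global_rot_mats": np.tile(np.eye(3), (T, J, 1, 1)),
    "local_rot_mats": np.tile(np.eye(3), (T, J, 1, 1)),
    "foot_contacts": np.zeros((T, 4)),
    "root_positions": np.zeros((T, 3)),
    "smooth_root_pos": np.zeros((T, 3)),
    "global_root_heading": heading,
}
print(validate_motion(dummy))  # (120, 24)
```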
```bash
# Direct script execution (alternative to the CLI)
python scripts/generate.py "a person walks" --duration 4.0

# MuJoCo visualization for G1 outputs
python -m kimodo.scripts.mujoco_load
```

All `kimodo_gen` flags
Batch Generation Pipeline
```python
from kimodo.model import Kimodo
import numpy as np
from pathlib import Path

model = Kimodo(model_name="Kimodo-SOMA-RP-v1")

output_dir = Path("./batch_outputs")
output_dir.mkdir(exist_ok=True)

prompts = [
    "a person walks forward",
    "a person runs",
    "a person jumps in place",
    "a person sits down",
    "a person picks up an object from the floor",
]

for i, prompt in enumerate(prompts):
    result = model(
        prompts=[prompt],
        duration=4.0,
        num_samples=1,
        seed=i,
    )
    out_path = output_dir / f"motion_{i:03d}.npz"
    np.savez(str(out_path), **result)
    print(f"Saved: {out_path}")
```
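After a batch run, it helps to inventory the outputs. An illustrative helper (not part of Kimodo) that lists each saved NPZ and its frame count; the demo writes a synthetic file to a temp directory so the snippet is self-contained:

```python
import tempfile
from pathlib import Path

import numpy as np

def summarize_outputs(output_dir):
    """Return (filename, frame_count) for each motion NPZ in a directory."""
    rows = []
    for path in sorted(Path(output_dir).glob("*.npz")):
        with np.load(path) as data:
            rows.append((path.name, int(data["posed_joints"].shape[0])))
    return rows

# Demo with a synthetic 4-second clip (120 frames at 30 fps, 24 joints assumed)
tmp = tempfile.mkdtemp()
np.savez(Path(tmp) / "motion_000.npz", posed_joints=np.zeros((120, 24, 3)))
print(summarize_outputs(tmp))  # [('motion_000.npz', 120)]
```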
Comparing Model Variants
```python
from kimodo.model import Kimodo

prompt = "a person walks forward"
models = ["Kimodo-SOMA-RP-v1", "Kimodo-SOMA-SEED-v1"]
results = {}

for model_name in models:
    model = Kimodo(model_name=model_name)
    results[model_name] = model(
        prompts=[prompt],
        duration=4.0,
        seed=0,
    )
    print(f"{model_name}: joints shape = {results[model_name]['posed_joints'].shape}")
```
Troubleshooting

**Out of VRAM (~17GB required):**
```bash
# Check available VRAM
nvidia-smi

# Use fewer samples to reduce peak VRAM
kimodo_gen "a person walks" --num_samples 1

# Reduce diffusion steps to speed up (at some cost to quality)
kimodo_gen "a person walks" --diffusion_steps 20
```
**Model download issues:**
```bash
# Models download from Hugging Face automatically.
# If behind a proxy, set the appropriate proxy environment variables.
# Or manually specify the cache directory:
export HF_HOME=/path/to/your/cache
```
**Motion quality issues:**
- Be specific in prompts: "a person walks forward at a moderate pace" works better than "walking"
- For complex motions, use the interactive demo to add keyframe constraints
- Increase `--diffusion_steps` (default ~20-30, try 50 for higher quality)
- Generate multiple samples (`--num_samples 5`) and select the best
- Avoid prompts with extremely fast or physically impossible actions
- The model operates at 30fps; very short durations (<1s) may yield poor results
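When generating multiple samples, selecting the best one can be automated with a simple heuristic. This sketch ranks candidates by mean joint acceleration as a smoothness proxy — the heuristic is an illustration, not part of Kimodo:

```python
import numpy as np

def smoothest_sample(joint_batches):
    """Given a list of [T, J, 3] joint arrays, return the index of the smoothest.

    Uses mean second-difference magnitude (joint acceleration) as the score.
    """
    scores = []
    for joints in joint_batches:
        accel = np.diff(np.asarray(joints), n=2, axis=0)  # [T-2, J, 3]
        scores.append(float(np.linalg.norm(accel, axis=-1).mean()))
    return int(np.argmin(scores))

# Synthetic demo: constant-velocity motion vs. the same motion with jitter
smooth = np.cumsum(np.full((30, 24, 3), 0.01), axis=0)
jittery = smooth + np.random.default_rng(0).normal(0, 0.05, smooth.shape)
print(smoothest_sample([jittery, smooth]))  # 1
```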
**Foot skating artifacts:**
```bash
# Post-processing is enabled by default; only disable it for debugging
kimodo_gen "a person walks"                   # post-processing ON (default)
kimodo_gen "a person walks" --no-postprocess  # post-processing OFF
```
**Interactive demo not loading:**
```bash
# Ensure port 7860 is available, or launch on a different port
kimodo_demo --server-port 7861

# For remote server access
kimodo_demo --server-name 0.0.0.0 --server-port 7860
# Then use SSH port forwarding: ssh -L 7860:localhost:7860 user@server
```