HY-World 2.0: 3D World Model Skill
Skill by ara.so — Daily 2026 Skills collection.
HY-World 2.0 is a multi-modal world model by Tencent Hunyuan that reconstructs, generates, and simulates 3D worlds. It accepts text, single-view images, multi-view images, and videos as input and produces 3D representations (meshes, 3D Gaussian Splatting / 3DGS, point clouds). It has two core capabilities:
- World Reconstruction (multi-view images / video → 3D): Powered by WorldMirror 2.0, a feed-forward model with roughly 1.2B parameters that predicts depth, surface normals, camera parameters, 3D point clouds, and 3DGS attributes in a single forward pass.
- World Generation (text / single image → 3D world): A four-stage pipeline: Panorama Generation (HY-Pano 2.0) → Trajectory Planning (WorldNav) → World Expansion (WorldStereo 2.0) → World Composition (WorldMirror 2.0 + 3DGS).
Installation
Requirements
- Python 3.10
- CUDA 12.4 (recommended)
- PyTorch 2.4.0

```bash
# 1. Clone repository
git clone https://github.com/Tencent-Hunyuan/HY-World-2.0
cd HY-World-2.0

# 2. Create conda environment
conda create -n hyworld2 python=3.10
conda activate hyworld2

# 3. Install PyTorch with CUDA 12.4
pip install torch==2.4.0 torchvision==0.19.0 --index-url https://download.pytorch.org/whl/cu124

# 4. Install project dependencies
pip install -r requirements.txt

# 5a. Install FlashAttention-3 (recommended for performance;
#     the hopper/ build targets Hopper-class GPUs such as H100)
git clone https://github.com/Dao-AILab/flash-attention.git
cd flash-attention/hopper
python setup.py install
cd ../../
rm -rf flash-attention

# 5b. OR install FlashAttention-2 (simpler)
pip install flash-attn --no-build-isolation
```
Model Weights
Model weights are automatically downloaded from Hugging Face on first run. Alternatively, download manually:
| Model | HuggingFace |
|---|---|
| WorldMirror 2.0 | |
| WorldMirror 1.0 (legacy) | |
To pre-download:

```bash
# Set HuggingFace cache directory if needed
export HF_HOME=/path/to/cache
pip install huggingface_hub
python -c "from huggingface_hub import snapshot_download; snapshot_download('tencent/HY-World-2.0')"
```
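To see where the pre-downloaded weights should end up, the cache location can be derived from the environment. The sketch below mirrors the documented `huggingface_hub` convention that `HF_HUB_CACHE` takes precedence and otherwise the cache lives in `$HF_HOME/hub`; `resolve_hub_cache` is a hypothetical helper, and it deliberately ignores `XDG_CACHE_HOME`, which the real library also consults.

```python
import os
from pathlib import Path


def resolve_hub_cache(env=None):
    """Return the directory where huggingface_hub is expected to cache models.

    Simplified assumption: HF_HUB_CACHE wins, else $HF_HOME/hub,
    else ~/.cache/huggingface/hub. XDG_CACHE_HOME is not handled.
    """
    env = os.environ if env is None else env
    if env.get("HF_HUB_CACHE"):
        return Path(env["HF_HUB_CACHE"]).expanduser()
    hf_home = env.get("HF_HOME") or os.path.join("~", ".cache", "huggingface")
    return Path(hf_home).expanduser() / "hub"


# Example: with HF_HOME=/path/to/cache the models land under /path/to/cache/hub
print(resolve_hub_cache({"HF_HOME": "/path/to/cache"}))
```

Checking this path before the first run is a cheap way to confirm that the disk holding the cache has enough free space.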
---
Core API: WorldMirror 2.0 (World Reconstruction)
Basic Usage

```python
from hyworld2.worldrecon.pipeline import WorldMirrorPipeline

# Load pipeline; weights are auto-downloaded on first run
pipeline = WorldMirrorPipeline.from_pretrained('tencent/HY-World-2.0')

# Run reconstruction from a folder of images
result = pipeline('path/to/images')
```
With Prior Injection (Camera & Depth)
Provide known camera parameters or depth priors to improve accuracy:

```python
from hyworld2.worldrecon.pipeline import WorldMirrorPipeline

pipeline = WorldMirrorPipeline.from_pretrained('tencent/HY-World-2.0')
result = pipeline(
    'path/to/images',
    prior_cam_path='path/to/prior_camera.json',
    prior_depth_path='path/to/prior_depth.npy',  # optional
)
```
Camera JSON Format
The format expected by the pipeline, shown for `prior_camera.json`:

```json
[
  {
    "image": "frame_001.jpg",
    "fx": 800.0,
    "fy": 800.0,
    "cx": 640.0,
    "cy": 360.0,
    "width": 1280,
    "height": 720,
    "c2w": [
      [1.0, 0.0, 0.0, 0.0],
      [0.0, 1.0, 0.0, 0.0],
      [0.0, 0.0, 1.0, 0.0],
      [0.0, 0.0, 0.0, 1.0]
    ]
  }
]
```
Result Object
The pipeline returns a result object with the following attributes:

```python
result = pipeline('path/to/images')

# Access outputs
point_cloud = result.point_cloud  # 3D point cloud (numpy or torch)
depth_maps = result.depth_maps    # Per-image depth maps
normals = result.normals          # Surface normal maps
cameras = result.cameras          # Predicted camera parameters
gaussians = result.gaussians      # 3DGS attributes

# Save outputs
result.save('output_dir/')  # Saves all outputs to directory
```
---
Gradio App: WorldMirror 2.0
Launch an interactive web UI for 3D reconstruction:

```bash
# From project root
python -m hyworld2.worldrecon.app

# Or, if a dedicated script exists
python app.py --model tencent/HY-World-2.0
```

Access at `http://localhost:7860` by default.
---
Common Patterns
Pattern 1: Reconstruct from a Video
Extract frames from a video, then run reconstruction:

```python
import os

import cv2
from hyworld2.worldrecon.pipeline import WorldMirrorPipeline


def extract_frames(video_path, output_dir, fps=2):
    """Save roughly `fps` frames per second of the video as JPEGs."""
    os.makedirs(output_dir, exist_ok=True)
    cap = cv2.VideoCapture(video_path)
    video_fps = cap.get(cv2.CAP_PROP_FPS)
    # Clamp to 1 so an unknown FPS (0) or fps above the video's own rate
    # cannot produce a zero interval and a modulo-by-zero below.
    frame_interval = max(1, int(video_fps / fps))
    frame_idx = 0
    saved = 0
    while cap.isOpened():
        ret, frame = cap.read()
        if not ret:
            break
        if frame_idx % frame_interval == 0:
            cv2.imwrite(os.path.join(output_dir, f"frame_{saved:04d}.jpg"), frame)
            saved += 1
        frame_idx += 1
    cap.release()
    return output_dir


# Extract frames at 2 fps
frames_dir = extract_frames("scene.mp4", "frames/", fps=2)

# Run reconstruction
pipeline = WorldMirrorPipeline.from_pretrained('tencent/HY-World-2.0')
result = pipeline(frames_dir)
result.save("output_3d/")
```
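The frame extractor keeps every k-th frame, where k is the video frame rate divided by the requested rate and truncated to an integer. The arithmetic can be isolated and checked without OpenCV; `sampled_frame_indices` is a hypothetical helper written for illustration, with the interval clamped to at least 1.

```python
def sampled_frame_indices(total_frames, video_fps, target_fps):
    """Return the indices of the frames that every-k-th sampling would keep."""
    if target_fps <= 0:
        raise ValueError("target_fps must be positive")
    # Truncation mirrors int(video_fps / fps); the clamp avoids a zero interval.
    interval = max(1, int(video_fps / target_fps))
    return list(range(0, total_frames, interval))


# A 10-frame clip at 30 fps sampled at 10 fps keeps every third frame.
print(sampled_frame_indices(10, 30.0, 10))  # [0, 3, 6, 9]
```

Because the interval is truncated, the effective rate can be slightly higher than requested: a 29.97 fps video sampled at 2 fps uses an interval of 14, which is about 2.14 frames per second.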
Pattern 2: Flexible Resolution Inference
WorldMirror 2.0 supports inference resolutions from 50K to 500K pixels. Control the size with the `resolution` parameter:

```python
from hyworld2.worldrecon.pipeline import WorldMirrorPipeline

pipeline = WorldMirrorPipeline.from_pretrained('tencent/HY-World-2.0')

# Low resolution (fast, lower memory)
result_fast = pipeline(
    'path/to/images',
    resolution=512,  # resize shorter edge to 512
)

# High resolution (slower, more detail)
result_hq = pipeline(
    'path/to/images',
    resolution=1024,
)
```
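The stated 50K to 500K pixel range can be checked against a given `resolution` value with a little arithmetic. The sketch below assumes, as the comment in the example suggests, that `resolution` sets the shorter edge and the aspect ratio is preserved; `resized_pixel_count` and `within_budget` are hypothetical helpers, not part of the HY-World API.

```python
MIN_PIXELS = 50_000   # lower bound of the stated range
MAX_PIXELS = 500_000  # upper bound of the stated range


def resized_pixel_count(width, height, shorter_edge):
    """Return (new_width, new_height, pixels) after an aspect-preserving resize."""
    short_side = min(width, height)
    new_w = round(width * shorter_edge / short_side)
    new_h = round(height * shorter_edge / short_side)
    return new_w, new_h, new_w * new_h


def within_budget(pixels):
    """True if the pixel count falls inside the stated supported range."""
    return MIN_PIXELS <= pixels <= MAX_PIXELS


for edge in (512, 1024):
    w, h, pixels = resized_pixel_count(1280, 720, edge)
    print(edge, (w, h), pixels, within_budget(pixels))
```

For a 1280x720 input, a shorter edge of 512 gives 910x512 = 465,920 pixels, inside the range, whereas 1024 gives 1820x1024 = 1,863,680 pixels, well outside it. Whether the pipeline rescales such inputs internally is not stated here, so large values are worth verifying against the upstream documentation.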
Pattern 3: Batch Processing Multiple Scenes

```python
from pathlib import Path

from hyworld2.worldrecon.pipeline import WorldMirrorPipeline

pipeline = WorldMirrorPipeline.from_pretrained('tencent/HY-World-2.0')
scenes_root = Path("scenes/")
output_root = Path("outputs/")

for scene_dir in sorted(scenes_root.iterdir()):
    if not scene_dir.is_dir():
        continue
    out_dir = output_root / scene_dir.name
    out_dir.mkdir(parents=True, exist_ok=True)
    print(f"Processing: {scene_dir.name}")
    try:
        result = pipeline(str(scene_dir))
        result.save(str(out_dir))
        print(f"  Saved to {out_dir}")
    except Exception as e:
        # Log and continue so one bad scene does not abort the batch
        print(f"  Failed: {e}")
```
Pattern 4: Export to Common 3D Formats
After reconstruction, export to formats compatible with Blender / Unity / Unreal:

```python
from hyworld2.worldrecon.pipeline import WorldMirrorPipeline

pipeline = WorldMirrorPipeline.from_pretrained('tencent/HY-World-2.0')
result = pipeline('path/to/images')

# Save 3DGS (.ply format for tools like 3D Gaussian Splatting viewers)
result.save_gaussians("scene.ply")

# Save mesh (if mesh export is supported)
result.save_mesh("scene.obj")  # or scene.glb

# Save point cloud
result.save_pointcloud("scene_pointcloud.ply")
```
Pattern 5: GPU Memory Management
For large scenes or limited VRAM:

```python
import torch
from hyworld2.worldrecon.pipeline import WorldMirrorPipeline

# Load in fp16 to reduce memory
pipeline = WorldMirrorPipeline.from_pretrained(
    'tencent/HY-World-2.0',
    torch_dtype=torch.float16,
)
pipeline = pipeline.to('cuda')

# Run with lower resolution to fit in memory
result = pipeline('path/to/images', resolution=768)

# Free memory after use
del result
torch.cuda.empty_cache()
```
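The saving from fp16 loading can be estimated from the parameter count alone. This is a back-of-the-envelope sketch using the roughly 1.2B parameters quoted for WorldMirror 2.0; `weight_memory_gb` is a hypothetical helper, and the figure covers weights only, so activations (which grow with image count and resolution) come on top.

```python
# Bytes needed to store one parameter in each dtype.
BYTES_PER_PARAM = {"float32": 4, "float16": 2, "bfloat16": 2}


def weight_memory_gb(num_params, dtype="float16"):
    """Approximate memory for model weights in decimal gigabytes."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e9


num_params = 1.2e9  # approximate size quoted for WorldMirror 2.0
for dtype in ("float32", "float16"):
    print(dtype, round(weight_memory_gb(num_params, dtype), 2), "GB")
```

Halving the weight footprint from about 4.8 GB to about 2.4 GB leaves more headroom for activations on a card with limited VRAM.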
---
Project Structure

```
HY-World-2.0/
├── hyworld2/
│   ├── worldrecon/          # WorldMirror 2.0 reconstruction
│   │   ├── pipeline.py      # Main WorldMirrorPipeline class
│   │   ├── app.py           # Gradio web app
│   │   └── ...
│   ├── worldgen/            # World generation (coming soon)
│   │   ├── panorama/        # HY-Pano 2.0
│   │   ├── nav/             # WorldNav trajectory planning
│   │   └── stereo/          # WorldStereo 2.0
│   └── utils/
├── assets/                  # Demo assets
├── requirements.txt
└── README.md
```
Environment Variables

```bash
# HuggingFace model cache location
export HF_HOME=/path/to/hf/cache

# HuggingFace token (if accessing private/gated models);
# recent huggingface_hub releases also read HF_TOKEN
export HUGGING_FACE_HUB_TOKEN=your_token_here

# CUDA device selection
export CUDA_VISIBLE_DEVICES=0

# For multi-GPU setups
export CUDA_VISIBLE_DEVICES=0,1
```
---
Troubleshooting
FlashAttention installation fails

```bash
# Use FlashAttention-2 as fallback
pip install flash-attn --no-build-isolation

# If that fails, disable flash attention (slower but works)
# Set environment variable before running
export USE_FLASH_ATTENTION=0
```
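Before falling back, it can help to confirm whether the FlashAttention-2 package is importable in the active environment. The sketch below uses only the standard library; `has_module` and `pick_attention_backend` are hypothetical helpers, and `flash_attn` is the import name of the `flash-attn` pip package.

```python
import importlib.util


def has_module(name):
    """Return True if a top-level module can be found without importing it."""
    return importlib.util.find_spec(name) is not None


def pick_attention_backend():
    """Report which attention path the current environment can support."""
    return "flash_attn" if has_module("flash_attn") else "fallback"


print("attention backend:", pick_attention_backend())
```

If this prints `fallback` after a seemingly successful install, the package was probably installed into a different environment than the one running the pipeline.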
CUDA out of memory

```python
import os

import torch
from hyworld2.worldrecon.pipeline import WorldMirrorPipeline

# 1. Reduce resolution (assumes `pipeline` is already loaded)
result = pipeline('path/to/images', resolution=512)

# 2. Use fp16
pipeline = WorldMirrorPipeline.from_pretrained(
    'tencent/HY-World-2.0',
    torch_dtype=torch.float16,
)

# 3. Process fewer images at once: use a subset
images = sorted(os.listdir('path/to/images'))[:10]  # limit to 10 frames
# Place these files in their own folder and pass that folder to the pipeline.
```
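Taking the first ten files covers only the start of a capture. When the goal is to reduce the frame count while still spanning the whole scene, an evenly spaced subset is a reasonable alternative; `evenly_spaced_subset` below is a hypothetical helper, not something the pipeline provides.

```python
def evenly_spaced_subset(items, max_items):
    """Pick up to max_items entries spread evenly from first to last."""
    n = len(items)
    if max_items <= 0:
        return []
    if n <= max_items:
        return list(items)
    if max_items == 1:
        return [items[0]]
    # Step between picks so that the first and last items are always included.
    step = (n - 1) / (max_items - 1)
    return [items[round(i * step)] for i in range(max_items)]


frames = [f"frame_{i:04d}.jpg" for i in range(100)]
subset = evenly_spaced_subset(frames, 10)
print(subset[0], subset[-1], len(subset))
```

The selected files can then be copied or symlinked into a separate folder for reconstruction.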
Model download issues

```bash
# Use HF mirror if huggingface.co is blocked
export HF_ENDPOINT=https://hf-mirror.com
```

Or manually download the weights and point to the local path:

```python
pipeline = WorldMirrorPipeline.from_pretrained('/local/path/to/model')
```
Wrong PyTorch/CUDA version

```bash
# Verify versions match
python -c "import torch; print(torch.__version__, torch.version.cuda)"
# Should output something like: 2.4.0+cu124 12.4

# Reinstall if mismatch
pip install torch==2.4.0 torchvision==0.19.0 --index-url https://download.pytorch.org/whl/cu124
```
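The version check can also be automated, for example at the top of a launch script. PyTorch wheels usually carry a local build suffix such as `+cu124`, and CPU-only builds report no CUDA version at all, so a plain string comparison is not enough. The helpers below are hypothetical and take the version strings as arguments so they can be exercised without PyTorch installed.

```python
def base_version(version):
    """Strip a local build suffix such as '+cu124' from a version string."""
    return version.split("+", 1)[0]


def matches_expected(torch_version, cuda_version,
                     expected_torch="2.4.0", expected_cuda="12.4"):
    """True if the installed PyTorch and CUDA versions match the requirements."""
    if cuda_version is None:  # CPU-only builds have no CUDA version
        return False
    return base_version(torch_version) == expected_torch and cuda_version == expected_cuda


print(matches_expected("2.4.0+cu124", "12.4"))
```

In a real script the arguments would be `torch.__version__` and `torch.version.cuda`.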
Images not loading

```python
# Ensure images are valid and in supported formats (.jpg, .png)
import os

from PIL import Image

img_dir = 'path/to/images'
for f in sorted(os.listdir(img_dir)):
    try:
        # verify() checks file integrity without decoding the full image
        with Image.open(os.path.join(img_dir, f)) as img:
            img.verify()
    except Exception as e:
        print(f"Bad image {f}: {e}")
```
---
Related Projects
| Project | Use Case | Link |
|---|---|---|
| WorldStereo | Panorama → 3DGS (open-source preview of WorldStereo-2) | GitHub |
| HunyuanWorld 1.0 | Panorama generation (interim for HY-Pano 2.0) | GitHub |
| WorldMirror 1.0 | Legacy reconstruction model | HuggingFace |
Key Limitations (Current Release)
- World Generation pipeline (WorldNav, WorldStereo-2, HY-Pano-2) is not yet open-sourced — only WorldMirror 2.0 reconstruction is available.
- Panorama generation: Use HunyuanWorld 1.0 as interim.
- World Expansion: Use WorldStereo as interim.
- Requires CUDA GPU — CPU inference not officially supported.
- Minimum ~8GB VRAM recommended; 16GB+ for full-resolution inference.