hy-world-2-0-3d-world-model

HY-World 2.0 — 3D World Model Skill

Skill by ara.so — Daily 2026 Skills collection.

HY-World 2.0 is a multi-modal world model by Tencent Hunyuan that reconstructs, generates, and simulates 3D worlds. It accepts text, single-view images, multi-view images, and videos as input and produces 3D representations (meshes, 3D Gaussian Splatting (3DGS) scenes, point clouds). It has two core capabilities:

World Reconstruction (multi-view images / video → 3D): Powered by WorldMirror 2.0, a ~1.2B feed-forward model predicting depth, surface normals, camera parameters, 3D point clouds, and 3DGS attributes in a single forward pass.
World Generation (text / single image → 3D world): Four-stage pipeline — Panorama Generation (HY-Pano 2.0) → Trajectory Planning (WorldNav) → World Expansion (WorldStereo 2.0) → World Composition (WorldMirror 2.0 + 3DGS).

Installation

Requirements

Python 3.10
CUDA 12.4 (recommended)
PyTorch 2.4.0

bash

# 1. Clone repository
git clone https://github.com/Tencent-Hunyuan/HY-World-2.0
cd HY-World-2.0

# 2. Create conda environment
conda create -n hyworld2 python=3.10
conda activate hyworld2

# 3. Install PyTorch with CUDA 12.4
pip install torch==2.4.0 torchvision==0.19.0 --index-url https://download.pytorch.org/whl/cu124

# 4. Install project dependencies
pip install -r requirements.txt

# 5a. Install FlashAttention-3 (recommended for performance)
git clone https://github.com/Dao-AILab/flash-attention.git
cd flash-attention/hopper
python setup.py install
cd ../../
rm -rf flash-attention

# 5b. OR install FlashAttention-2 (simpler)
pip install flash-attn --no-build-isolation

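After the steps above, a quick check can confirm which packages are importable before the first run. The following is a minimal sketch and not part of HY-World 2.0: `missing_packages` is a hypothetical helper, and the module names are assumed to be the import names of the packages installed above (`flash_attn` is the FlashAttention-2 import name; a FlashAttention-3 build may expose a different module).

```python
from importlib.util import find_spec


def missing_packages(names):
    """Return the top-level module names that cannot be found (hypothetical helper)."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    # Import names assumed for the packages installed above.
    required = ["torch", "torchvision"]
    optional = ["flash_attn"]  # FlashAttention is an optional speed-up

    for name in missing_packages(required):
        print(f"Missing required package: {name}")
    for name in missing_packages(optional):
        print(f"Optional package not installed: {name}")
```

`find_spec` only locates top-level modules without importing them, so the check stays fast and does not initialize CUDA.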

Model Weights

Model weights are automatically downloaded from Hugging Face on first run. Alternatively, download manually:

| Model | HuggingFace |
| --- | --- |
| WorldMirror 2.0 | `tencent/HY-World-2.0` → `HY-WorldMirror-2.0` |
| WorldMirror 1.0 (legacy) | `tencent/HunyuanWorld-Mirror` |

To pre-download:

bash

# Set HuggingFace cache directory if needed
export HF_HOME=/path/to/cache

pip install huggingface_hub
python -c "from huggingface_hub import snapshot_download; snapshot_download('tencent/HY-World-2.0')"

---

Core API — WorldMirror 2.0 (World Reconstruction)

Basic Usage

python

from hyworld2.worldrecon.pipeline import WorldMirrorPipeline

# Load pipeline; weights are downloaded automatically on first run
pipeline = WorldMirrorPipeline.from_pretrained('tencent/HY-World-2.0')

# Run reconstruction from a folder of images
result = pipeline('path/to/images')

With Prior Injection (Camera & Depth)

Provide known camera parameters or depth priors to improve accuracy:

python

from hyworld2.worldrecon.pipeline import WorldMirrorPipeline

pipeline = WorldMirrorPipeline.from_pretrained('tencent/HY-World-2.0')

result = pipeline(
    'path/to/images',
    prior_cam_path='path/to/prior_camera.json',
    prior_depth_path='path/to/prior_depth.npy',  # optional
)

Camera JSON Format

The `prior_camera.json` format expected by the pipeline:

json

[
  {
    "image": "frame_001.jpg",
    "fx": 800.0,
    "fy": 800.0,
    "cx": 640.0,
    "cy": 360.0,
    "width": 1280,
    "height": 720,
    "c2w": [
      [1.0, 0.0, 0.0, 0.0],
      [0.0, 1.0, 0.0, 0.0],
      [0.0, 0.0, 1.0, 0.0],
      [0.0, 0.0, 0.0, 1.0]
    ]
  }
]

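As an illustration, entries in this shape can be generated programmatically. The sketch below uses only the standard library; `make_camera_entry` is a hypothetical helper and not part of HY-World 2.0. The field names come from the example above, while reading `c2w` as a 4x4 camera-to-world matrix and the intrinsics as pixel units is an assumption based on those names.

```python
import json


def make_camera_entry(image, fx, fy, cx, cy, width, height, c2w=None):
    """Build one entry in the layout shown above (hypothetical helper)."""
    if c2w is None:
        # Identity pose: the camera frame coincides with the world frame.
        c2w = [[1.0 if r == c else 0.0 for c in range(4)] for r in range(4)]
    if len(c2w) != 4 or any(len(row) != 4 for row in c2w):
        raise ValueError("c2w must be a 4x4 matrix")
    return {
        "image": image,
        "fx": float(fx),
        "fy": float(fy),
        "cx": float(cx),
        "cy": float(cy),
        "width": int(width),
        "height": int(height),
        "c2w": [[float(v) for v in row] for row in c2w],
    }


if __name__ == "__main__":
    # Three frames sharing the same intrinsics and an identity pose.
    entries = [
        make_camera_entry(f"frame_{i:03d}.jpg", 800, 800, 640, 360, 1280, 720)
        for i in range(1, 4)
    ]
    print(json.dumps(entries, indent=2))
```

Validating the matrix shape at construction time surfaces malformed poses before they reach the pipeline.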

Result Object

The pipeline returns a result object with the following attributes:

python

result = pipeline('path/to/images')

# Access outputs
point_cloud = result.point_cloud  # 3D point cloud (numpy or torch)
depth_maps = result.depth_maps    # Per-image depth maps
normals = result.normals          # Surface normal maps
cameras = result.cameras          # Predicted camera parameters
gaussians = result.gaussians      # 3DGS attributes

# Save outputs
result.save('output_dir/')  # Saves all outputs to directory

---

Gradio App — WorldMirror 2.0

Launch an interactive web UI for 3D reconstruction:

bash

# From project root
python -m hyworld2.worldrecon.app

# Or, if a dedicated script exists
python app.py --model tencent/HY-World-2.0

Access at `http://localhost:7860` by default.

---

Common Patterns

Pattern 1: Reconstruct from a Video

Extract frames from a video, then run reconstruction:

python

import cv2
import os
from hyworld2.worldrecon.pipeline import WorldMirrorPipeline

def extract_frames(video_path, output_dir, fps=2):
    os.makedirs(output_dir, exist_ok=True)
    cap = cv2.VideoCapture(video_path)
    video_fps = cap.get(cv2.CAP_PROP_FPS)
    frame_interval = max(1, int(video_fps / fps))  # avoid a zero interval when video_fps < fps
    frame_idx = 0
    saved = 0
    while cap.isOpened():
        ret, frame = cap.read()
        if not ret:
            break
        if frame_idx % frame_interval == 0:
            cv2.imwrite(f"{output_dir}/frame_{saved:04d}.jpg", frame)
            saved += 1
        frame_idx += 1
    cap.release()
    return output_dir

# Extract frames at 2 fps
frames_dir = extract_frames("scene.mp4", "frames/", fps=2)

# Run reconstruction
pipeline = WorldMirrorPipeline.from_pretrained('tencent/HY-World-2.0')
result = pipeline(frames_dir)
result.save("output_3d/")

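The sampling rule in `extract_frames` keeps every `frame_interval`-th frame. The sketch below isolates that arithmetic in a pure function so it can be checked without a video file. `sampled_indices` is a hypothetical helper, and clamping the interval to at least 1 is an assumption about the desired behavior when the source frame rate is lower than the target.

```python
def sampled_indices(total_frames, video_fps, target_fps):
    """Indices of the frames kept when sampling a video at target_fps."""
    if target_fps <= 0:
        raise ValueError("target_fps must be positive")
    # Keep every `interval`-th frame; never let the interval drop to 0.
    interval = max(1, int(video_fps / target_fps))
    return list(range(0, total_frames, interval))


if __name__ == "__main__":
    # A 10-second clip at 30 fps sampled at 2 fps keeps every 15th frame.
    kept = sampled_indices(total_frames=300, video_fps=30.0, target_fps=2)
    print(len(kept), kept[:4])  # 20 [0, 15, 30, 45]
```

Checking the expected frame count this way before extraction helps keep the number of images passed to the pipeline within the available memory.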

Pattern 2: Flexible Resolution Inference

WorldMirror 2.0 supports input resolutions from roughly 50K to 500K pixels per image. Control the working resolution with the `resolution` parameter:

python

from hyworld2.worldrecon.pipeline import WorldMirrorPipeline

pipeline = WorldMirrorPipeline.from_pretrained('tencent/HY-World-2.0')

# Low resolution (fast, lower memory)
result_fast = pipeline(
    'path/to/images',
    resolution=512,  # resize shorter edge to 512
)

# High resolution (slower, more detail)
result_hq = pipeline(
    'path/to/images',
    resolution=1024,
)

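The quoted pixel range and the shorter-edge `resolution` parameter measure size differently, so it can help to check how many pixels a given setting implies. The sketch below is an illustration under stated assumptions: `resized_shape` and `within_budget` are hypothetical helpers, and the code assumes the image is scaled uniformly so that its shorter edge equals `resolution`. The pipeline's actual resize logic may differ.

```python
def resized_shape(width, height, resolution):
    """Scale (width, height) uniformly so the shorter edge equals `resolution`."""
    scale = resolution / min(width, height)
    return round(width * scale), round(height * scale)


def within_budget(width, height, min_pixels=50_000, max_pixels=500_000):
    """Check a frame size against the 50K to 500K pixel range quoted above."""
    return min_pixels <= width * height <= max_pixels


if __name__ == "__main__":
    # Compare several shorter-edge settings for a 1280x720 source frame.
    for resolution in (256, 512, 1024):
        w, h = resized_shape(1280, 720, resolution)
        print(resolution, (w, h), w * h, within_budget(w, h))
```

Under these assumptions, a 1280x720 frame at `resolution=512` comes to about 466K pixels, inside the quoted range, while `resolution=1024` comes to about 1.86M pixels. How the pipeline treats inputs above the range is not stated here.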

Pattern 3: Batch Processing Multiple Scenes

python

from pathlib import Path
from hyworld2.worldrecon.pipeline import WorldMirrorPipeline

pipeline = WorldMirrorPipeline.from_pretrained('tencent/HY-World-2.0')

scenes_root = Path("scenes/")
output_root = Path("outputs/")

for scene_dir in sorted(scenes_root.iterdir()):
    if not scene_dir.is_dir():
        continue
    out_dir = output_root / scene_dir.name
    out_dir.mkdir(parents=True, exist_ok=True)
    
    print(f"Processing: {scene_dir.name}")
    try:
        result = pipeline(str(scene_dir))
        result.save(str(out_dir))
        print(f"  Saved to {out_dir}")
    except Exception as e:
        print(f"  Failed: {e}")

Pattern 4: Export to Common 3D Formats

After reconstruction, export to formats compatible with Blender / Unity / Unreal:

python

from hyworld2.worldrecon.pipeline import WorldMirrorPipeline

pipeline = WorldMirrorPipeline.from_pretrained('tencent/HY-World-2.0')
result = pipeline('path/to/images')

# Save 3DGS (.ply format for tools like 3D Gaussian Splatting viewer)
result.save_gaussians("scene.ply")

# Save mesh (if mesh export is supported)
result.save_mesh("scene.obj")  # or scene.glb

# Save point cloud
result.save_pointcloud("scene_pointcloud.ply")

Pattern 5: GPU Memory Management

For large scenes or limited VRAM:

python

import torch
from hyworld2.worldrecon.pipeline import WorldMirrorPipeline

# Load in fp16 to reduce memory
pipeline = WorldMirrorPipeline.from_pretrained(
    'tencent/HY-World-2.0',
    torch_dtype=torch.float16,
)
pipeline = pipeline.to('cuda')

# Run with lower resolution to fit in memory
result = pipeline('path/to/images', resolution=768)

# Free memory after use
del result
torch.cuda.empty_cache()

---

Project Structure

HY-World-2.0/
├── hyworld2/
│   ├── worldrecon/          # WorldMirror 2.0 reconstruction
│   │   ├── pipeline.py      # Main WorldMirrorPipeline class
│   │   ├── app.py           # Gradio web app
│   │   └── ...
│   ├── worldgen/            # World generation (coming soon)
│   │   ├── panorama/        # HY-Pano 2.0
│   │   ├── nav/             # WorldNav trajectory planning
│   │   └── stereo/          # WorldStereo 2.0
│   └── utils/
├── assets/                  # Demo assets
├── requirements.txt
└── README.md

Environment Variables

bash

# HuggingFace model cache location
export HF_HOME=/path/to/hf/cache

# HuggingFace token (if accessing private/gated models)
# Recent huggingface_hub releases read HF_TOKEN; HUGGING_FACE_HUB_TOKEN is the older name
export HF_TOKEN=your_token_here

# CUDA device selection
export CUDA_VISIBLE_DEVICES=0

# For multi-GPU setups
export CUDA_VISIBLE_DEVICES=0,1

---

Troubleshooting

FlashAttention installation fails

bash

# Use FlashAttention-2 as fallback
pip install flash-attn --no-build-isolation

# If that fails, disable flash attention (slower but works)
# Set environment variable before running
export USE_FLASH_ATTENTION=0

CUDA out of memory

python

import os

import torch
from hyworld2.worldrecon.pipeline import WorldMirrorPipeline

# 1. Reduce resolution
result = pipeline('path/to/images', resolution=512)

# 2. Use fp16
pipeline = WorldMirrorPipeline.from_pretrained(
    'tencent/HY-World-2.0',
    torch_dtype=torch.float16,
)

# 3. Process fewer images at once: use a subset
images = sorted(os.listdir('path/to/images'))[:10]  # limit to 10 frames

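Because the pipeline is called with a folder path, limiting the frame count in practice means staging a smaller folder. The sketch below picks evenly spaced images rather than the first N, so the subset still spans the whole capture. `pick_evenly` and `stage_subset` are hypothetical helpers, and the extension filter follows the formats named in this document (.jpg, .png).

```python
import os
import shutil


def pick_evenly(items, limit):
    """Return at most `limit` items, evenly spaced across the sequence."""
    if limit <= 0:
        return []
    if len(items) <= limit:
        return list(items)
    step = len(items) / limit
    return [items[int(i * step)] for i in range(limit)]


def stage_subset(src_dir, dst_dir, limit=10, exts=(".jpg", ".png")):
    """Copy an evenly spaced subset of images from src_dir into dst_dir."""
    names = sorted(n for n in os.listdir(src_dir) if n.lower().endswith(exts))
    os.makedirs(dst_dir, exist_ok=True)
    chosen = pick_evenly(names, limit)
    for name in chosen:
        shutil.copy2(os.path.join(src_dir, name), os.path.join(dst_dir, name))
    return chosen
```

After staging, the subset folder can be passed to the pipeline in place of the original one, for example `pipeline('path/to/subset', resolution=512)`.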

Model download issues

bash

# Use HF mirror if huggingface.co is blocked
export HF_ENDPOINT=https://hf-mirror.com

python

# Or manually download and point to local path
pipeline = WorldMirrorPipeline.from_pretrained('/local/path/to/model')

Wrong PyTorch/CUDA version

bash

# Verify versions match
python -c "import torch; print(torch.__version__, torch.version.cuda)"
# Should output: 2.4.0+cu124 12.4

# Reinstall if mismatch
pip install torch==2.4.0 torchvision==0.19.0 --index-url https://download.pytorch.org/whl/cu124

Images not loading

python

# Ensure images are valid and in supported formats (.jpg, .png)
import os
from PIL import Image

img_dir = 'path/to/images'
for f in os.listdir(img_dir):
    try:
        # verify() checks file integrity without decoding the full image
        with Image.open(os.path.join(img_dir, f)) as img:
            img.verify()
    except Exception as e:
        print(f"Bad image {f}: {e}")

---

Related Projects

| Project | Use Case | Link |
| --- | --- | --- |
| WorldStereo | Panorama → 3DGS (open-source preview of WorldStereo-2) | GitHub |
| HunyuanWorld 1.0 | Panorama generation (interim for HY-Pano 2.0) | GitHub |
| WorldMirror 1.0 | Legacy reconstruction model | HuggingFace |

Key Limitations (Current Release)

World Generation pipeline (WorldNav, WorldStereo-2, HY-Pano-2) is not yet open-sourced — only WorldMirror 2.0 reconstruction is available.
Panorama generation: Use HunyuanWorld 1.0 as interim.
World Expansion: Use WorldStereo as interim.
Requires CUDA GPU — CPU inference not officially supported.
Minimum ~8GB VRAM recommended; 16GB+ for full-resolution inference.

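The VRAM guidance above can be turned into a simple pre-flight check. This is a sketch only: `vram_tier` is a hypothetical helper, and the 8 GB and 16 GB thresholds are the approximate figures quoted above rather than measured limits. Decimal gigabytes are used on the assumption that drivers typically report slightly less than a card's nominal capacity in GiB.

```python
GB = 10 ** 9  # decimal gigabytes, matching how card capacities are marketed


def vram_tier(total_bytes):
    """Map total GPU memory to the rough guidance in this document."""
    gb = total_bytes / GB
    if gb >= 16:
        return "full-resolution"
    if gb >= 8:
        return "reduced-resolution"
    return "below-recommended"


if __name__ == "__main__":
    # With PyTorch and a CUDA device available, the total could be read as:
    #   import torch
    #   total = torch.cuda.get_device_properties(0).total_memory
    total = 12 * GB  # example value standing in for a 12 GB card
    print(vram_tier(total))
```

A "reduced-resolution" result suggests starting with a lower `resolution` value and fp16 weights, as described under GPU memory management.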