kyleleey/WonderPlay

WonderPlay: Dynamic 3D Scene Generation from a Single Image and Actions


Demo

Abstract

WonderPlay is a novel framework that integrates physics simulation with video generation to produce action-conditioned dynamic 3D scenes from a single image. While prior works are restricted to rigid-body or simple elastic dynamics, WonderPlay features a hybrid generative simulator that synthesizes a wide range of 3D dynamics. The hybrid generative simulator first uses a physics solver to simulate coarse 3D dynamics, which then condition a video generator to produce a video with finer, more realistic motion. The generated video is in turn used to update the simulated dynamic 3D scene, closing the loop between the physics solver and the video generator. This approach combines intuitive user control with the accurate dynamics of physics-based simulators and the expressivity of diffusion-based video generators. Experimental results demonstrate that WonderPlay enables users to interact with diverse scenes containing cloth, sand, snow, liquid, smoke, elastic, and rigid bodies -- all from a single image input.

News

Please also check out our latest follow-up work, PerpetualWonder, which builds upon the same motivation of combining physical simulation with video generation, but achieves more consistent 4D quality and supports long-horizon actions.

Getting Started

Installation requires a CUDA-compatible GPU, and the full pipeline needs 80GB of GPU memory to run.

Tested Environment:

  • PyTorch: 2.7.1+cu126
  • CUDA: 12.6
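
Before installing, you can sanity-check the GPU setup. The following snippet is our own convenience check, not part of the repository:

```shell
# Report GPU name and total memory if an NVIDIA driver is present;
# the pipeline needs a CUDA-compatible GPU with 80GB of memory.
if command -v nvidia-smi >/dev/null 2>&1; then
  nvidia-smi --query-gpu=name,memory.total --format=csv,noheader
else
  echo "nvidia-smi not found: no CUDA-compatible GPU visible"
fi
```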

Installation

Create and activate the main environment:

conda create -n wp python=3.10
conda activate wp

Install Python dependencies:

pip install -r requirements.txt

Install submodules:

cd submodules/depth_diff_gaussian_rasterization_min
pip install -e . --no-build-isolation
cd ../diff-gaussian-rasterization
pip install -e . --no-build-isolation
cd ../..

Install Genesis (a specific version with manual modifications):

cd Genesis
pip install -e .
cd ..

Install simple_knn:

git submodule update --init --recursive submodules/simple_knn
cd submodules/simple_knn
pip install -e . --no-build-isolation
cd ../..

Install RepViT:

git submodule update --init --recursive submodules/RepViT
mkdir -p submodules/RepViT/checkpoints
wget https://github.com/THU-MIG/RepViT/releases/download/v1.0/repvit_sam.pt -P submodules/RepViT/checkpoints/
cd submodules/RepViT/sam
pip install -e .
cd ../../..

Install PyTorch3D:

pip install -U fvcore
pip install iopath
pip install --no-build-isolation "git+https://github.com/facebookresearch/pytorch3d.git@stable"

Install InstantMesh-related dependencies:

pip install --no-build-isolation git+https://github.com/NVlabs/nvdiffrast/
pip install xatlas

Run examples

We currently provide one example case: venice. For each case, the pipeline consists of two main stages:

Stage 1: Scene Reconstruction + Physical Simulation

In this stage, we perform compositional scene reconstruction and run the physical simulation:

conda activate wp
python WonderPlay_new/run_genesis.py --config examples/configs/venice.yaml --prefix example

Stage 2: Video model refinement

After completing scene reconstruction and simulation, we condition an existing flow-conditioned video model on the intermediate simulation results:

python WonderPlay_new/run_video_model.py --input_folder 3d_result/wonderplay/venice/example/simulation --output_folder 3d_result/wonderplay/venice/example/output_video --sdedit_strengths 0.85
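
The two stages can be chained in a small wrapper script. The script name, argument handling, and `set -e` guard below are our own additions; the commands and paths mirror those above:

```shell
# run_case.sh (hypothetical wrapper): run both stages for one case.
cat > run_case.sh <<'EOF'
#!/usr/bin/env bash
set -e
CASE=${1:-venice}     # case name, matching examples/configs/<case>.yaml
PREFIX=${2:-example}  # output prefix used by both stages

# Stage 1: scene reconstruction + physical simulation
python WonderPlay_new/run_genesis.py --config "examples/configs/${CASE}.yaml" --prefix "$PREFIX"

# Stage 2: video model refinement on the simulation output
python WonderPlay_new/run_video_model.py \
  --input_folder "3d_result/wonderplay/${CASE}/${PREFIX}/simulation" \
  --output_folder "3d_result/wonderplay/${CASE}/${PREFIX}/output_video" \
  --sdedit_strengths 0.85
EOF
bash -n run_case.sh  # check that the generated script parses
```

Usage: `bash run_case.sh venice example`.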

What's in one example?

Each example consists of two components (to add a new example, provide the same):

  1. Configuration file: under examples/configs/ we provide the config file for running scene reconstruction and simulation.

  2. Intermediate results: under examples/imgs/ we also provide the intermediate results generated in the scene reconstruction stage, including the inpainted image, object mask, inpainted sky image, Gaussians, etc.; the reconstruction pipeline will directly reuse these results if they exist.
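
Under these conventions, scaffolding a hypothetical new case (here called `mycase`; the name, the per-case folder under examples/imgs/, and the copy step are our own illustration) might look like:

```shell
CASE=mycase
# Create the folders for the config and intermediate results.
mkdir -p examples/configs "examples/imgs/${CASE}"
# Start from the provided venice config when it is available locally.
if [ -f examples/configs/venice.yaml ]; then
  cp examples/configs/venice.yaml "examples/configs/${CASE}.yaml"
fi
```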

Acknowledgement

Our code references and builds upon the following open-source projects:

  • HUGS: Optical flow rendering from gaussians
  • Genesis: Multi-physics simulation framework
  • simple_knn: For efficient CUDA initialization
  • Go-With-The-Flow: Optical-flow conditioned video generation model

We are grateful to the authors and contributors of these projects for their valuable work.

About

Code release of [ICCV2025 Highlight] WonderPlay: Dynamic 3D Scene Generation from a Single Image and Actions
