SoundVista: Novel-View Ambient Sound Synthesis via Visual-Acoustic Binding [CVPR 2025]

We introduce SoundVista, a neural network pipeline that generates the ambient sound of an arbitrary scene at novel viewpoints, without requiring any constraints on or prior knowledge of the sound source details.

Real Demo

Please watch with headphones or speakers that support binaural audio!

👉 Click here to watch the demo video

Dataset

SoundSpace-Ambient Matterport3D

data folder structure (mp3d)

sim_scenes: panoramic RGB-D pkl files, rendered with sound-spaces (see scripts/demo/mp3d_continuous_pano_render.py)
benchmark_pkl
metadata: from sound-spaces
sim_audios
sounds: from sound-spaces
- 1s_all
- semantic_splits
acoustic_params: echo and T60 values (npy)
binaural_rirs
ambisonic_rirs
benchmark index files: mp3d_mulv3_sparse_new.pkl (train)
budget number: ref_sampler_budget.pkl
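
Before training, it can help to confirm that the entries listed above are present. The following is a minimal sketch, not part of the repository: the helper name `missing_entries` is hypothetical, and it assumes that the listed folders sit directly under a single dataset root (taken here to be `data/mp3d`), which the list above does not state explicitly.

```python
from pathlib import Path

# Entry names copied from the list above. Treating them as direct children
# of one root folder is an assumption, not something the README states.
EXPECTED_ENTRIES = [
    "sim_scenes",
    "benchmark_pkl",
    "metadata",
    "sim_audios",
    "sounds",
    "acoustic_params",
    "binaural_rirs",
    "ambisonic_rirs",
]


def missing_entries(root, expected=EXPECTED_ENTRIES):
    """Return the expected entry names that do not exist under `root`."""
    root = Path(root)
    return [name for name in expected if not (root / name).exists()]


if __name__ == "__main__":
    # "data/mp3d" is an assumed dataset root; adjust it to your setup.
    for name in missing_entries("data/mp3d"):
        print(f"missing: data/mp3d/{name}")
```

An empty result means every listed entry was found under the chosen root; it says nothing about whether the contents are complete.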

Run the Code

Environment Setup

Compile sound-spaces first to render the SoundSpace-Ambient Matterport data.

Visual Acoustic Binding

Training on Soundspace-Matterport3D (mp3d)

CUDA_VISIBLE_DEVICES=0 python3 tools/train_vab.py --cfg configs/vab_mp3d.yaml

Reference Sampler (example on mp3d scenes)

CUDA_VISIBLE_DEVICES=0 python3 tools/ref_sample_mp3d.py --cfg configs/vab_mp3d.yaml --visualize-path output/ref_sampling/ --eval-metrics model.resume_path data/pretrained_weights/vab_pretrain.pth

SoundSpace-Ambient Experiments (mp3d)

Training

CUDA_VISIBLE_DEVICES=0 python3 tools/train_mp3d.py --cfg configs/soundvista_mp3d.yaml --visualize-path output/ref_sampling/ train.pretrained data/pretrained_weights/vab_pretrain.pth dataset.img_num_per_gpu 16 output_dir soundvista_mp3d
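
The commands in this README share one shape: a script, `--cfg`, optional `--flags`, and then trailing `key value` pairs such as `dataset.img_num_per_gpu 16`. The sketch below assembles such a command from Python so that runs with different overrides can be scripted. The helpers `build_command` and `gpu_env` are hypothetical and not part of the repository; how the tools parse the trailing pairs is inferred only from the commands shown here.

```python
import os
import shlex


def build_command(script, cfg, flags=None, overrides=None):
    """Assemble an argv list: script, --cfg, optional flags, then key/value pairs."""
    argv = ["python3", script, "--cfg", cfg]
    for name, value in (flags or {}).items():
        argv.append(name)
        if value is not None:  # None marks a bare switch such as --eval-metrics
            argv.append(str(value))
    for key, value in (overrides or {}).items():
        argv += [key, str(value)]  # trailing "key value" pairs, as in the README commands
    return argv


def gpu_env(device_ids):
    """Copy the current environment and set CUDA_VISIBLE_DEVICES."""
    env = dict(os.environ)
    env["CUDA_VISIBLE_DEVICES"] = ",".join(str(i) for i in device_ids)
    return env


train_cmd = build_command(
    "tools/train_mp3d.py",
    "configs/soundvista_mp3d.yaml",
    flags={"--visualize-path": "output/ref_sampling/"},
    overrides={
        "train.pretrained": "data/pretrained_weights/vab_pretrain.pth",
        "dataset.img_num_per_gpu": 16,
        "output_dir": "soundvista_mp3d",
    },
)
print(shlex.join(train_cmd))
# To launch it: subprocess.run(train_cmd, env=gpu_env([0]), check=True)
```

Passing the command as a list avoids shell quoting issues, and setting the GPU through `env` keeps the device choice out of the command string.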

Evaluation

CUDA_VISIBLE_DEVICES=0 python3 tools/eval_mp3d.py --cfg configs/soundvista_mp3d.yaml --visualize-path output/ref_sampling/ --eval-scenes unseen model.resume_path data/pretrained_weights/soundvista_mp3d.pth output_dir soundvista_mp3d_eval

Demo (an example on one scene of mp3d)

# step 1: render route and reference pano RGBD and audio
python3 scripts/demo/mp3d_demovis.py

# step 2: render continuous target pano RGBD and video
python3 scripts/demo/mp3d_continuous_pano_render.py   # pano RGB-D pkl file
python3 scripts/demo/mp3d_continuous_video_render.py  # video

# step 3: render demo audio with SoundVista
CUDA_VISIBLE_DEVICES=0 python3 tools/demo_mp3d.py --cfg configs/soundvista_mp3d.yaml --visualize-path output/ref_sampling/ model.resume_path data/pretrained_weights/soundvista_mp3d.pth

# step 4: combine audio and video into the final demo video (fps=18), e.g.:
ffmpeg -i 'demo_files/sT4fr6TAbpF_continuous_vis.mp4' -i 'demo_files/sT4fr6TAbpF.wav' -c:v copy -c:a aac 'demo_files/sT4fr6TAbpF_output.mp4'
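
Step 4 can also be scripted per scene. The sketch below rebuilds the ffmpeg call from a scene id and adds a small helper for the clip length implied by fps=18, which is handy for checking that the rendered audio and video have matching durations. The function names are hypothetical, and the file naming pattern is assumed from the single example above.

```python
import shlex


def mux_command(scene_id, folder="demo_files"):
    """Build the ffmpeg argv that copies the video stream and encodes the audio as AAC."""
    video = f"{folder}/{scene_id}_continuous_vis.mp4"
    audio = f"{folder}/{scene_id}.wav"
    output = f"{folder}/{scene_id}_output.mp4"
    return [
        "ffmpeg",
        "-i", video,      # rendered video from step 2
        "-i", audio,      # rendered audio from step 3
        "-c:v", "copy",   # keep the video stream as is
        "-c:a", "aac",    # encode the audio as AAC
        output,
    ]


def expected_duration_seconds(num_frames, fps=18):
    """Clip length implied by a frame count at the demo frame rate."""
    return num_frames / fps


cmd = mux_command("sT4fr6TAbpF")
print(shlex.join(cmd))
# To run it: subprocess.run(cmd, check=True)
```

If the audio length differs noticeably from `expected_duration_seconds(num_frames)`, the two streams will drift or one will be cut short in the muxed file.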

Citation

If you find this repository and dataset useful in your research, please consider giving it a star ⭐ and citing our paper using the following BibTeX entry.

@inproceedings{chen2025soundvista,
  title={SoundVista: Novel-View Ambient Sound Synthesis via Visual-Acoustic Binding},
  author={Chen, Mingfei and Gebru, Israel D and Ananthabhotla, Ishwarya and Richardt, Christian and Markovic, Dejan and Sandakly, Jake and Krenn, Steven and Keebler, Todd and Shlizerman, Eli and Richard, Alexander},
  booktitle={Proceedings of the Computer Vision and Pattern Recognition Conference},
  pages={8331--8341},
  year={2025}
}

License

The code and dataset are released under the CC-NC 4.0 International license.
