We gratefully acknowledge support from
the Simons Foundation and member institutions.
Full-text links:

Download:

Current browse context:

cs.SD

Change to browse by:

References & Citations

DBLP - CS Bibliography

Bookmark

(what is this?)
CiteULike logo BibSonomy logo Mendeley logo del.icio.us logo Digg logo Reddit logo

Computer Science > Sound

Title: SoundSpaces 2.0: A Simulation Platform for Visual-Acoustic Learning

Abstract: We introduce SoundSpaces 2.0, a platform for on-the-fly geometry-based audio rendering for 3D environments. Given a 3D mesh of a real-world environment, SoundSpaces can generate highly realistic acoustics for arbitrary sounds captured from arbitrary microphone locations. Together with existing 3D visual assets, it supports an array of audio-visual research tasks, such as audio-visual navigation, mapping, source localization and separation, and acoustic matching. Compared to existing resources, SoundSpaces 2.0 has the advantages of allowing continuous spatial sampling, generalization to novel environments, and configurable microphone and material properties. To our knowledge, this is the first geometry-based acoustic simulation that offers high fidelity and realism while also being fast enough to use for embodied learning. We showcase the simulator's properties and benchmark its performance against real-world audio measurements. In addition, we demonstrate two downstream tasks -- embodied navigation and far-field automatic speech recognition -- and highlight sim2real performance for the latter. SoundSpaces 2.0 is publicly available to facilitate wider research for perceptual systems that can both see and hear.
Comments: Camera-ready version. Website: this https URL Project page: this https URL
Subjects: Sound (cs.SD); Computer Vision and Pattern Recognition (cs.CV); Multimedia (cs.MM); Audio and Speech Processing (eess.AS)
Cite as: arXiv:2206.08312 [cs.SD]
  (or arXiv:2206.08312v2 [cs.SD] for this version)

Submission history

From: Changan Chen [view email]
[v1] Thu, 16 Jun 2022 17:17:44 GMT (1506kb,D)
[v2] Mon, 23 Jan 2023 18:49:47 GMT (1362kb,D)

Link back to: arXiv, form interface, contact.