3D-Audio Matting, Post-Editing and Re-rendering from Field Recordings

E. Gallo¹,², N. Tsingos¹ and G. Lemaitre¹

¹REVES-INRIA and ²CSTB

Left: We use multiple arbitrarily positioned microphones (circled in yellow) to simultaneously record real-life auditory environments. Middle: We analyze the recordings to extract the positions of various sound components through time. Right: This high-level representation allows for post-editing and re-rendering the acquired soundscape within generic 3D audio rendering architectures.

We present a novel approach to real-time spatial rendering of realistic auditory environments and sound sources recorded live, in the field.
Using a set of standard microphones distributed throughout a real-world environment, we record the sound field simultaneously from several locations. After spatial calibration, we segment this set of recordings into a number of auditory components, each with its location. We compare existing time-delay-of-arrival estimation techniques between pairs of widely spaced microphones and introduce a novel, efficient hierarchical localization algorithm. Using the high-level representation thus obtained, we can edit and re-render the acquired auditory scene over a variety of listening setups. In particular, we can move or alter the different sound sources and arbitrarily choose the listening position. We can also composite elements of different scenes together in a spatially consistent way. Our approach provides efficient rendering of complex soundscapes that would be challenging to model using discrete point sources and traditional virtual acoustics techniques. We demonstrate a wide range of possible applications for games, virtual and augmented reality, and audio-visual post-production.
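
To make the time-delay-of-arrival step concrete, here is a minimal Python sketch that estimates the delay between two microphone signals with a plain brute-force cross-correlation. It is only an illustration under stated assumptions: it is neither one of the estimation techniques compared in this work nor the hierarchical localization algorithm. The function names, the synthetic test signal and the 343 m/s speed of sound are all assumptions made for the example.

```python
import random

SPEED_OF_SOUND = 343.0  # m/s; assumed value for air at roughly 20 degrees C


def estimate_delay_samples(ref, other, max_lag):
    """Return the lag (in samples) at which `other` best matches `ref`.

    A positive result means the sound reaches `other` later than `ref`.
    Plain cross-correlation, evaluated for every lag in [-max_lag, max_lag].
    """
    best_lag, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for i in range(len(ref)):
            j = i + lag
            if 0 <= j < len(other):
                score += ref[i] * other[j]
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


def delay_to_path_difference(lag_samples, sample_rate):
    """Convert a delay in samples to a difference in path length (meters)."""
    return lag_samples / sample_rate * SPEED_OF_SOUND


# Synthetic check (hypothetical data): microphone B hears the same noise
# burst as microphone A, but 37 samples later.
rng = random.Random(0)
source = [rng.uniform(-1.0, 1.0) for _ in range(2000)]
true_delay = 37
mic_a = source
mic_b = [0.0] * true_delay + source[:-true_delay]

lag = estimate_delay_samples(mic_a, mic_b, max_lag=100)
path_difference = delay_to_path_difference(lag, sample_rate=44100)
```

In practice `max_lag` can be bounded by the microphone spacing divided by the speed of sound. Each estimated delay constrains the source to a hyperboloid around the microphone pair, so delays from several pairs are needed to locate a source.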

Download a video describing our technique and early results here (DivX format)

(NEW) Video comparing original monophonic recordings and our approach (DivX format)
(full FIR filtering using head-related transfer function (HRTF) data from the LISTEN HRTF database)

Additional example results

Explicit background/foreground separation and resulting re-renderings

Example 1: Outdoor scene with two moving speakers

One of the 8 original mono recordings

Separated and re-rendered background / foreground and both

Another re-rendering (using a software renderer and HRTFs from the LISTEN HRTF database)

Example 2: Seashore scene

(click on picture for a larger view of the image-based calibration using ImageModeler © RealViz)

One of the 8 original mono recordings

Separated and re-rendered background / foreground and both

Another re-rendering (using a software renderer and HRTFs from the LISTEN HRTF database)

Comparison of our warping algorithm (delay/distance compensation based on estimated source positions)
with direct blending between recordings

Example 1: Synthetic case with telephone + chopper mixture

Reference mono recording

Mono blend between the two microphones closest to the virtual listener

Mono warp and blend between the two microphones closest to the virtual listener

Example 2: Indoor recording with two speakers

Reference mono recording

Mono blend between the two microphones closest to the virtual listener

Mono warp and blend between the two microphones closest to the virtual listener
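
The difference between "blend" and "warp and blend" in the examples above can be sketched in a few lines of Python. This is a simplified illustration, not the implementation used for the demos. It assumes a single point source at a known position, free-field propagation with 1/distance attenuation, and integer-sample delays. All function names and the toy geometry are hypothetical.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s; assumed value


def warp_to_listener(signal, mic_pos, source_pos, listener_pos, sample_rate):
    """Re-time and re-scale a recording as if heard at `listener_pos`.

    Delay compensation: shift by the difference in propagation time.
    Distance compensation: scale by the ratio of source distances (1/r law).
    """
    d_mic = math.dist(source_pos, mic_pos)
    d_listener = math.dist(source_pos, listener_pos)
    shift = round((d_listener - d_mic) / SPEED_OF_SOUND * sample_rate)
    gain = d_mic / max(d_listener, 1e-3)  # clamp to avoid division by zero
    n = len(signal)
    out = [0.0] * n
    for i in range(n):
        j = i - shift
        if 0 <= j < n:
            out[i] = gain * signal[j]
    return out


def blend(a, b, weight_a):
    """Linear mix of two equally long signals."""
    return [weight_a * x + (1.0 - weight_a) * y for x, y in zip(a, b)]


# Toy geometry: source at the origin, microphones at 1 m and 3 m, virtual
# listener at 2 m. The sample rate is chosen so that 1 m equals one sample.
rate = 343
source_pos, mic1_pos, mic2_pos = (0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (3.0, 0.0, 0.0)
listener_pos = (2.0, 0.0, 0.0)

# A click emitted at sample 9 arrives at sample 10 (mic 1) and 12 (mic 2).
rec1 = [0.0] * 32
rec2 = [0.0] * 32
rec1[10] = 1.0
rec2[12] = 1.0 / 3.0

direct = blend(rec1, rec2, 0.5)  # two separate clicks remain
warped1 = warp_to_listener(rec1, mic1_pos, source_pos, listener_pos, rate)
warped2 = warp_to_listener(rec2, mic2_pos, source_pos, listener_pos, rate)
blended = blend(warped1, warped2, 0.5)  # a single click at sample 11
```

In this toy case, direct blending leaves two copies of the click at different times, which is the kind of doubling and comb filtering a plain blend can introduce. Warping first aligns both recordings to the arrival time and level expected at the virtual listener, so the blend contains a single click.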

More recent and improved results in an indoor environment
(using a software renderer and HRTFs from the LISTEN HRTF database)

Binaural reference
(recorded with a Fostex FR-2 using a pair of Sennheiser MKE2 gold omni mikes taped inside the ears of a subject)

Original mono recording (recorded with AudioTechnica AT3031 omni mike + Presonus Firepod)

Same mono recording processed with our 3D mapping (using only 8 subbands + software HRTF rendering)

Two re-renderings with a moving virtual listening point: Example 1, Example 2

Compositing of two auditory scenes (car + moving speakers).
This demo includes a moving listening point and a virtual occluder (magenta wall).

Click here for the Divx movie file (12 subbands + hardware HRTF rendering using DirectSound and SoundBlaster Audigy)

Related publications

This work has been submitted for publication.

Acknowledgments

This research was made possible by a grant from the région PACA and the RNTL Project OPERA.
We acknowledge the generous donation of Maya as part of the Alias research donation program,
Alexander Olivier-Mangon for the initial model of the car, and Frank Firsching for the animation.