REVES - INRIA Sophia Antipolis
      2004 route des lucioles - BP 93
      FR-06902 Sophia Antipolis
      FRANCE
      Phone: (+33) (0)4 97 15 53 27
      Fax:     (+33) (0)4 92 38 50 30
      Email: emmanuel.gallo AT sophia.inria.fr

(NEWS) My PhD thesis, entitled "Perceptual Sound Rendering for Multi-Modal Virtual Environments", is now online.
Abstract:

This thesis concentrates on real-time acoustic simulation for virtual reality applications and video games. Such applications require large amounts of computation, growing with the complexity of the scene, which makes interactive rendering difficult. In particular, the real-time simulation of a complex sound scene remains challenging because each sound source is processed independently. Moreover, describing the auditory scene requires specifying the nature and position of every sound source, a long and tedious process. To address these problems, we first studied performing the acoustic simulation on latest-generation graphics processors. The results show that their massively parallel architecture is well suited to this kind of processing, significantly improving performance over current general-purpose processors. We then developed an algorithm that exploits human perception to render an auditory scene within a target budget of operations while minimizing audible artifacts. The proposed algorithm evaluates an importance metric for each signal over very fine time intervals, then performs the required signal processing operations in descending order of priority until the target budget is reached. A subjective evaluation was conducted to assess different importance metrics.
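For concreteness, here is a minimal sketch of the budget-driven idea described above, assuming a simple frame-energy importance metric (the thesis evaluates several perceptual metrics; all names below are illustrative and are not taken from the thesis code):

import numpy as np

def render_frame(sources, budget):
    """Mix one short frame, processing at most `budget` sources.

    sources: list of equal-length 1-D numpy arrays (one per sound source).
    budget:  maximum number of sources to process this frame.
    """
    # Importance metric: frame energy, a stand-in for a perceptual loudness model.
    importance = np.array([np.sum(s ** 2) for s in sources])
    # Process sources in descending order of importance until the budget is spent.
    order = np.argsort(importance)[::-1]
    out = np.zeros_like(sources[0])
    for idx in order[:budget]:
        out += sources[idx]  # stand-in for per-source filtering and spatialization
    return out

# Example: mix only the 3 most important of 8 sources per 256-sample frame.
rng = np.random.default_rng(0)
frame = render_frame([rng.standard_normal(256) for _ in range(8)], budget=3)

Ranking on short frames rather than once per scene lets the budget adapt as sources become momentarily quiet or dominant.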
Finally, we developed an alternative sound acquisition method that avoids modeling each source individually. From simultaneous monophonic recordings of a real scene, this method extracts the scene's components. We analyze the time delay of arrival in the recorded signals in several frequency bands, and from this information extract a position for the most significant sound source in each band. The components of each source can then be re-rendered at the corresponding locations. Using this method, we can also edit the acquired scene: for instance, we can move or delete a sound source, or change the position of the listener in real time. We can also composite elements from different recordings while ensuring overall spatial coherence.
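As an illustration of the per-band analysis, the sketch below estimates the dominant inter-channel time delay in each frequency band from two of the monophonic recordings via cross-correlation. This is an assumed, simplified stand-in for the thesis pipeline, not its implementation; turning delays into positions additionally requires the microphone geometry.

import numpy as np
from scipy.signal import butter, sosfiltfilt

def band_delays(x, y, fs, bands):
    """Estimate the dominant delay of x relative to y (in samples) per band."""
    delays = []
    for lo, hi in bands:
        # 4th-order Butterworth band-pass, applied forward-backward (zero phase).
        sos = butter(4, [lo, hi], btype="bandpass", fs=fs, output="sos")
        xb, yb = sosfiltfilt(sos, x), sosfiltfilt(sos, y)
        # The peak of the full cross-correlation marks the most significant delay.
        corr = np.correlate(xb, yb, mode="full")
        delays.append(int(np.argmax(corr)) - (len(y) - 1))
    return delays

# Example: x lags y by 5 samples; both bands should report a delay near 5.
fs = 16000
rng = np.random.default_rng(1)
y = rng.standard_normal(fs)
x = np.roll(y, 5)
print(band_delays(x, y, fs, bands=[(300, 1000), (1000, 3000)]))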
You can download it in PDF format (130 MB) (in English, with the first chapter in French).
