Laboratoire d'Informatique pour la Mécanique et les Sciences de l'Ingénieur

Augmented and Virtual Reality & Audio Interfaces

Global presentation

Over the last decade, virtual and augmented reality have taken an increasingly large place in daily life. Through media, simulation tools, and other applications, audio-visual rendering devices have evolved significantly to meet different needs.

However, the spatial dimension of the audio modality remains relatively underused, although it can greatly benefit numerous audio-visual applications. First, it has been shown that the perceived quality of an audio-visual rendering device depends on the rendering quality of both modalities. As such, adding 3D sound to existing applications noticeably increases the quality of the rendering these systems offer. In addition, 3D sound increases the intelligibility of complex audio-visual scenes.

The sub-theme Augmented and Virtual Reality & Audio Interfaces studies the impact of 3D sound on these different types of applications.

This sub-theme is structured around 3 projects:

  • The NAVIG project: This project aims to provide augmented audio rendering to blind people to help them in their daily lives without preventing them from perceiving their normal audio environment.
  • The SMART-I²: The SMART-I² is a high-quality spatialized audio-visual rendering system. This project aims to increase intelligibility and the sensation of immersion in audio-visual interfaces for virtual reality.
  • Audio-visual Renderings for Multimedia Navigation: This axis looks for ways to improve exploration interfaces for large databases by adding 3D sound to them and by transposing metaphors from the visualization field.

NAVIG Project

Project presentation

The ANR-NAVIG project aims to increase the autonomy of visually impaired people in a basic and particularly problematic activity: navigation.

The consortium is composed of:

Through a participatory design method, we expect to enable visually impaired people to move to a desired destination reliably and safely, without interfering with their normal travel behaviour. Furthermore, the device will make it possible to locate and grasp objects without having to pre-equip them with electronic components. The objective of the AA team within the NAVIG project is to develop a binaural synthesis engine that augments reality with auditory information to locate visual targets and to reach a destination while avoiding obstacles.

Prototype of the NAVIG system.

3D auditory perception

3D auditory information is rendered using a binaural synthesis engine developed at LIMSI, based on real-time convolution with head-related impulse responses (HRIRs). To obtain better localization performance with non-individualized head-related transfer functions (HRTFs) and to reduce the up/down and front/back confusions specific to binaural technology, we developed a game that, relying on the plasticity of the auditory system, allows users to adapt to HRTFs that are not their own.
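The core of binaural synthesis by HRIR convolution can be illustrated with a minimal offline sketch. This is not the LIMSI engine: the function name `binaural_render` and the toy impulse responses are assumptions made for illustration, and a real-time engine would process audio block by block (for example with overlap-add) using measured HRIRs.

```python
import numpy as np

def binaural_render(mono, hrir_left, hrir_right):
    """Convolve a mono signal with a left/right HRIR pair.

    Returns an array of shape (len(mono) + len(hrir) - 1, 2),
    with the left ear in column 0 and the right ear in column 1.
    Both HRIRs are assumed to have the same length.
    """
    left = np.convolve(mono, hrir_left)
    right = np.convolve(mono, hrir_right)
    return np.stack([left, right], axis=1)

# Toy HRIRs (hypothetical, not measured data): a source on the listener's
# right reaches the right ear first and the left ear later and quieter.
itd_samples = 20                 # assumed interaural delay, in samples
hrir_right = np.zeros(64)
hrir_right[0] = 1.0
hrir_left = np.zeros(64)
hrir_left[itd_samples] = 0.6

mono = np.random.default_rng(0).standard_normal(1024)
stereo = binaural_render(mono, hrir_left, hrir_right)
```

With these toy responses, the right channel is a copy of the input and the left channel is the same signal delayed by 20 samples and attenuated, which is the simplest interaural cue a binaural renderer produces.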

Auditory guidance

The objective of auditory guidance is to convey visual information in the form of audio information. We envisage various types of information depending on the type of guidance.

  • Near field guidance

    Near field guidance is intended to convey the position, size, and shape of an object. To better guide the grasping task, the guidance sound must also be able to give information on obstructions along the path between the hand and the object. We have previously examined the precision of hand-reaching movements towards nearby real acoustic sources through a localization accuracy task. Results showed that localization accuracy varies with the source stimulus, azimuth, and distance. Taking these results into account, we are now preparing a grasping task experiment with virtual sounds and different stimuli to optimize localization accuracy.

  • Far field guidance

    Guidance during navigation should inform the user about the upcoming trajectory and the reference points near the route, and provide all the information needed to build a good mental representation of the space. Tests are underway on spatialised text-to-speech and sonification metaphors.

  • Environment description

    Environment description consists of giving the user an understanding of the situation. For use before and during navigation, this description must offer different levels of detail, so as not to overload the user during navigation and to take into account the capacity constraints of working memory for the description given beforehand. The objective is to enable the user to build an integrated cognitive representation of the environment in his or her own frame of reference.

Ergonomics and Sound Design

  • Choice of rendered sounds

    So that the sounds are not perceived as unpleasant, the sound design will be based on style sheets that let the user select among different types of guidance sounds. We seek to avoid the approaches of existing systems based on text-to-speech and sound tags, which too often cause cognitive overload. First meetings with users show that, while some prefer to be guided with electronic sounds (though distinguishable from environmental sounds), others prefer natural sounds (considered less unpleasant). This suggests that style sheets are useful.

  • Headphone choice

    To avoid real environmental sounds being masked by the system's sounds, we are studying the quality of binaural spatialisation with bone-conduction headphones and air tubes.

Publications

  • Florian Dramas, Brian F.G. Katz, Christophe Jouffrais. ''Auditory-guided reaching movements in the peripersonal frontal space''. Acoustics, Paris, Vol. 123, Acoustical Society of America, p. 3723, 2008.
  • Florian Dramas, Bernard Oriola, Brian F.G. Katz, Simon Thorpe, Christophe Jouffrais. ''Designing an assistive device for the blind based on object localization and augmented auditory reality''. ACM Conference on Computers and Accessibility (ASSETS 2008), Halifax, Canada, 13/10/08-15/10/08.
  • Florian Dramas, Simon Thorpe, Brian F.G. Katz, Christophe Jouffrais. ''Object recognition and localization for the blinds. From the assistive device towards the neuroprosthesis''. From Neural Code to Brain/Machine Interface, Paris, 27/09/07-29/09/07.
  • Gaétan Parseihian, Brian F.G. Katz. ''Conception d'un moteur de rendu audio binaural pour l'aide à la navigation des non-voyants''. Journées des Jeunes Chercheurs en Audition, Acoustique musicale et Signal audio, Marseille, 25/11/09-27/11/09.
  • Brian FG Katz, Philippe Truillet, Simon Thorpe, Christophe Jouffrais. ''NAVIG: Navigation Assisted by Artificial Vision and GNSS''. Workshop Pervasive 2010: Multimodal Location Based Techniques for Extreme Navigation, Helsinki, 17/05/2010.
  • Gaétan Parseihian, Adrien Brilhaut, Florian Dramas. ''NAVIG: An Object Localization System for the Blind''. Workshop Pervasive 2010: Multimodal Location Based Techniques for Extreme Navigation, Helsinki, 17/05/2010.

SMART-I² Project

People working on this project: Marc Rébillat, Xavier Boutillon, Brian F.G. Katz and Etienne Corteel (collaborator)

Project presentation

The SMART-I² project involves three partners: LIMSI-CNRS, the LMS (Solid Mechanics Laboratory), and sonic emotion (a Swiss company specializing in 3D sound). This project aims to design a device able to offer several users a physically coherent audio-visual rendering with interaction possibilities.

SMART-I² is a high quality 3D audio-visual interactive rendering system. The 3D visual rendering is made with Tracked Passive Stereoscopy. In SMART-I², the screen is also used as a multichannel loudspeaker. The spatial audio rendering is based on Wave Field Synthesis. Contrary to conventional systems, SMART-I² is able to realize a high degree of 3D audio-visual integration with almost no compromise on either the audio or the graphics rendering quality.

Spatialized sound rendering with Wave Field Synthesis

Wave Field Synthesis (WFS) is a spatialized sound rendering technology that was first developed at Delft University. It is an audio implementation of Huygens' principle, which states that every sound field emerging from one primary sound source can be reproduced by summing the contributions of an infinite and continuous distribution of secondary sound sources. At the theoretical level, WFS allows one to synthesize a sound source at any given position. Implementations of WFS are simplified versions of this principle, typically using a linear array of equally spaced loudspeakers.

Illustration of sound rendering using WFS.

This figure illustrates the principle of WFS. The violin on the left is the primary source producing the target natural sound field. The linear array of secondary sound sources on the right produces, through summation of the contributions of each appropriately driven loudspeaker, a synthesized sound field equivalent to the original target field. The sound field of the virtual violin is thus synthesized and perceived by users in the reproduction area as emanating from the precise spatial location of the violin. Additional sound sources may be synthesized simultaneously through simple linear superposition.
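The idea of driving each loudspeaker "appropriately" can be sketched for a virtual point source placed behind a linear array: each loudspeaker replays the source signal with a delay proportional to its distance from the virtual source and a gain that decreases with that distance. The sketch below is a deliberately simplified illustration, not the SMART-I² driving functions: it omits the spectral pre-filter and geometric weighting of full WFS operators, and the speed of sound, the 0.2 m spacing, the single straight array, and the name `wfs_delays_gains` are all assumptions.

```python
import numpy as np

SPEED_OF_SOUND = 343.0  # m/s, assumed value for air at room temperature

def wfs_delays_gains(source_xy, speaker_xy):
    """Per-loudspeaker delay (s) and relative gain for a virtual point source.

    Simplified model: delay = distance / c, gain proportional to
    1 / sqrt(distance), normalized so the loudest speaker has gain 1.
    """
    source = np.asarray(source_xy, dtype=float)
    speakers = np.asarray(speaker_xy, dtype=float)
    dist = np.linalg.norm(speakers - source, axis=1)
    delays = dist / SPEED_OF_SOUND
    gains = 1.0 / np.sqrt(dist)
    gains /= gains.max()
    return delays, gains

# Hypothetical layout: 24 equally spaced loudspeakers along the x axis,
# centered on the origin, listeners on the y > 0 side.
n_speakers = 24
spacing = 0.2  # metres, assumed
xs = (np.arange(n_speakers) - (n_speakers - 1) / 2) * spacing
speakers = np.column_stack([xs, np.zeros(n_speakers)])

# Virtual source 1 m behind the middle of the array.
delays, gains = wfs_delays_gains((0.0, -1.0), speakers)
```

Because the virtual source sits behind the center of the array, the central loudspeakers fire first and loudest and the delays grow symmetrically towards both ends, which reproduces the curved wavefront of a point source. Several virtual sources can be handled by computing one set of delays and gains per source and summing the resulting loudspeaker signals.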

Tracked Passive Stereoscopy

To produce a 3D visual rendering, each eye of the user has to see the same scene from a slightly different point of view. One way of achieving this is to use light polarization to address each eye independently. The user wears polarized glasses that cancel visual cross-talk. The graphic rendering should also be adapted to the position and orientation of the user's head so that the correct point of view is always rendered. Using this approach, the 3D visual rendering is coherent regardless of the user's position in the immersion area.
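Adapting the rendering to the tracked head amounts, at its simplest, to computing one viewpoint per eye from the head pose. The following sketch shows that step only, under stated assumptions: a z-up coordinate system, a head facing +y at zero yaw, rotation about the vertical axis only, and an interpupillary distance of 65 mm. The function name `eye_positions` is hypothetical and does not come from the SMART-I² software.

```python
import numpy as np

def eye_positions(head_pos, yaw_rad, ipd=0.065):
    """Return (left_eye, right_eye) positions for a tracked head.

    head_pos : (x, y, z) of the point midway between the eyes, in metres.
    yaw_rad  : head rotation about the vertical z axis, counterclockwise
               seen from above; yaw 0 means the user looks along +y.
    ipd      : interpupillary distance in metres (assumed default).
    """
    head = np.asarray(head_pos, dtype=float)
    # Unit vector pointing towards the user's right-hand side.
    right_dir = np.array([np.cos(yaw_rad), np.sin(yaw_rad), 0.0])
    half = 0.5 * ipd * right_dir
    return head - half, head + half

# User standing at the origin, eyes 1.7 m high, looking straight ahead.
left_eye, right_eye = eye_positions((0.0, 0.0, 1.7), 0.0)

# Same user after turning a quarter turn to the left.
left_turned, right_turned = eye_positions((0.0, 0.0, 1.7), np.pi / 2)
```

Each eye position would then serve as the camera position for one of the two polarized images. A full implementation would also account for head pitch and roll and for the off-axis projection onto the fixed screen.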

Illustration of stereoscopy.

Audio-visual integration with Multi-actuator Panels

The SMART-I² integrates two different technologies, tracked passive stereoscopy and WFS, through an innovative use of multi-actuator panels (MAPs). MAPs are stiff, lightweight panels with multiple electro-mechanical exciters attached to the back. Typical MAP multichannel loudspeakers are no larger than 1 m². For this project, a novel large MAP (about 5 m², with a 4:3 aspect ratio) has been designed in order to provide sufficient surface area to be used as a projection screen. To accommodate polarized light projection, the front face of the panel has been covered with a metallic paint designed to preserve light polarization. Due to the nature of the MAP design, screen displacements caused by acoustic vibrations are very small and do not disturb 3D video projection on the surface of the panel. Such a structure thus makes it possible to efficiently integrate a 3D visual rendering technology and a spatialized sound rendering technology.

Front face and back face of the large MAP.

Architecture

The hardware architecture of the SMART-I² is presented schematically in the following figure. Two large MAPs of 2.6 m × 2 m form a corner of stereoscopic screens and an array of 24 loudspeakers. With this configuration, users can move within an immersion area of approximately 2.5 m × 2.5 m. An example of a simple AV scene rendered by the SMART-I² is given below.

Global organization of the SMART-I².

Perspectives and applications

There are numerous perspectives for this project. As this is the first time that multi-actuator panels of such a size have been used in this context, it is of great importance to understand their physical behavior in order to improve the audio rendering quality.

Applications are numerous too. The most obvious one is commercial: as this device is relatively low-cost compared to similar systems, the SMART-I² could be used for teleconferencing or video games. Moreover, this device is well suited to audio-visual rendering for virtual reality, where it can be used for therapies in virtual environments, psychophysics experiments, and so on.

Related publications

  • M. Rébillat, E. Corteel, B.F.G. Katz, "The SMART-I²: A new approach for the design of immersive audio-visual environments.", Euro-VR Eve 2010, Orsay, France, May 2010. [Preprint-pdf]
  • M. Rébillat, E. Corteel, B.F.G. Katz, X. Boutillon, "Identification, modélisation et contrôle de Large Multi-Actuator Panels pour la création d'un rendu audio-visuel spatialisé.", Journées des Jeunes Chercheurs en Audition, Acoustique musicale et Signal audio, Marseille, November 2009. [Poster-pdf]
  • M. Rébillat, E. Corteel, B.F.G. Katz, "SMART-I²: Spatial Multi-users Audio-visual Real Time Interactive Interface, a broadcast application context", 3DTV Conference, Potsdam, Germany, May 2009. [Preprint-pdf]
  • M. Rébillat, E. Corteel, B.F.G. Katz, "SMART-I²: A Spatial Multi-users Audio-visual Real Time Interactive Interface", 125th Convention of the Audio Engineering Society, San Francisco, October 2008. [Preprint-pdf]

Audio-visual Renderings for Multimedia Navigation

People working on this project: Tifanie Bouchara, Christian Jacquemin, Brian F.G. Katz, and Catherine Guastavino (collaborator).

Project presentation

This study investigates various combinations of auditory and visual modalities in order to enhance multimedia exploration interfaces. We propose extending the graphical techniques used in the visualization domain to the auditory domain in order to design zoomable auditory interfaces. We have developed a magnifying lens that modifies the auditory and graphical renderings simultaneously and coherently. The graphical rendering is similar to fisheye lens distortion, while the auditory rendering is processed in two steps: first, the sounds are spatialized in 2D or 3D; then, the spatial positions of the sound sources (azimuth, elevation, distance) are modified according to their visual positions. Further studies will focus on content-based zoomable interfaces with semantic distortion rather than spatial distortion.

The research topics of this auditory and multimodal interfaces study (see Tifanie Bouchara's web site) are: human-computer interaction, zoomable interfaces, sound spatialization, auditory perception and cognition, and audiovisual crossmodality.
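The coupling between the visual lens and the sound positions can be sketched in 2D as follows. The text only says the graphical rendering is "similar to" fisheye distortion, so the specific magnification function used here (the classic graphical fisheye form g(r) = (d + 1) r / (d r + 1)) is an assumption, as are the function names, the distortion factor `d`, and the azimuth convention. It is an illustration of the principle rather than the implementation used in the study.

```python
import numpy as np

def fisheye(r, d=3.0):
    """Magnify a normalized radius r in [0, 1]; g(0) = 0 and g(1) = 1."""
    return (d + 1.0) * r / (d * r + 1.0)

def distort_sources(positions, focus, lens_radius, d=3.0):
    """Push sources inside the lens away from its focus; leave others alone."""
    pos = np.asarray(positions, dtype=float)
    focus = np.asarray(focus, dtype=float)
    offset = pos - focus
    r = np.linalg.norm(offset, axis=1)
    inside = (r > 0) & (r < lens_radius)
    scale = np.ones_like(r)
    r_norm = r[inside] / lens_radius
    scale[inside] = fisheye(r_norm, d) / r_norm
    return focus + offset * scale[:, None]

def azimuth_deg(positions, listener=(0.0, 0.0)):
    """Azimuth seen from the listener: 0 deg ahead (+y), positive to the right."""
    rel = np.asarray(positions, dtype=float) - np.asarray(listener, dtype=float)
    return np.degrees(np.arctan2(rel[:, 0], rel[:, 1]))

# Three sources on a line 1 m in front of the listener; the lens is
# focused straight ahead and has a radius of 1 m (hypothetical values).
sources = np.array([[0.1, 1.0], [0.5, 1.0], [2.0, 1.0]])
distorted = distort_sources(sources, focus=(0.0, 1.0), lens_radius=1.0)
az_before = azimuth_deg(sources)
az_after = azimuth_deg(distorted)
```

Sources inside the lens are spread apart in azimuth, which mirrors the visual magnification, while the source outside the lens keeps its original position. The distorted positions would then be passed to whichever spatialization engine performs the first step.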

Graphical rendering of the exploration interfaces developed to browse video collections: top) Pan & Zoom techniques, bottom) Fisheye lens.

Related publications

  • T. Bouchara, C. Guastavino, B.F.G. Katz, C. Jacquemin, "Audiovisual Rendering for Multimedia Navigation.", submitted, 2010.
  • T. Bouchara, C. Guastavino, B.F.G. Katz, C. Jacquemin, "Conception d'une Lentille Grossissante Audiovisuelle pour l'Exploration de Base de Données Multimédias", Journées des Jeunes Chercheurs en Audition, Acoustique musicale et Signal audio, Marseille, November 2009. [Poster-pdf]
