The Spoken Language Processing group carries out research aimed at understanding human speech communication processes and at developing models for the automatic processing of speech. With the aim of extracting and structuring information in audio documents, the group develops models and algorithms that draw on diverse sources of information to carry out a global decoding of the signal. These can be applied to identify the speaker, the language being spoken, and the speaker's affective state; to transcribe or translate the speech; or to identify specific entities. This research is by nature interdisciplinary, drawing upon expertise in signal processing, acoustic-phonetics, phonology, semantics, statistics, and computer science.
The group's activities cover the following application areas: speech recognition, language identification, multimodal characterization of speakers and their affective state, named-entity extraction and question answering, spoken dialog, multimodal indexing of audio and video documents, and machine translation of both spoken and written language.
The research activities are structured in seven interdependent topics: 1. Speaker characterization in a multimodal context; 2. Affective and social dimensions of spoken interactions; 3. Perception and automatic processing of variation in speech; 4. Robust analysis of spoken language and dialog systems; 5. Translation and machine learning; 6. Speech recognition; and 7. Language resources.