TLP - Research topics

Speaker characterization in a multimedia context

Speaker recognition consists of determining who spoke when, where the identity can be that of the true speaker or an identity specific to one document or a set of documents. Different sources of information can be used to identify the speaker in multimedia documents (the speaker's voice, what is said, or what is written). The group is leading the QCOMPERE consortium for the REPERE challenge.

Affective and social dimensions of spoken interactions

The detection of affective and social dimensions is applied both to human-machine interaction with robots and to the analysis of audiovisual and audio documents such as call center data. The main research subjects in this area are the identification of emotion and social cues in human-robot interaction, emotion detection based on verbal and nonverbal cues (acoustic, visual and multimodal), dynamic user profiles (emotional and interactional dimensions) in dialog for assistive robotics, and multimodal detection of anxiety applied to therapeutic serious games.

Perception and automatic processing of variation in speech

The very large corpora used for training statistical models are exploited for linguistic studies of spoken language, such as acoustic-phonetics, pronunciation variation and diachronic evolution. Automatic alignment enables studies on hundreds to thousands of hours of data, permitting the validation of hypotheses and models. This topic also studies human and machine transcription errors via perception experiments.

Robust analysis of spoken language and dialog systems

Robust analysis methods for spoken language are developed in the framework of open-domain information retrieval, with applications to language understanding for dialog systems, to named-entity recognition, and to interactive question answering systems supporting both spoken and written language.

Machine learning and machine translation

Our research in machine translation focuses on the design and development of statistical methods for modeling the associations between utterances and their translations. These machine learning methods find broader application in all areas of natural language processing that involve a multilingual dimension.

Speech recognition

Speech recognition is the process of transcribing the speech signal into text. Depending upon the targeted use, the transcription can be supplemented with punctuation or with paralinguistic information such as hesitations, laughter or breath noises. Research on speech recognition relies on supporting research in acoustic-phonetic modeling, lexical modeling and language modeling (a problem also addressed for machine translation), which are undertaken in a multilingual context (18 languages). This topic also includes research on language recognition, that is, determining the language and/or dialect of an audio document for both wideband and telephone-band speech.

From language resources to 3M data

In addition to the collection, annotation and sharing of varied corpora, this research topic addresses more general investigations on language resources, covering data, tools, evaluation and meta-resources (guidelines, methodologies, metadata, best practices), for spoken and written language, but also for multilingual, multimodal, and multimedia data. These activities are mostly conducted in collaboration with national and international organizations and networks.

Campus universitaire bât 507
Rue du Belvedère
F - 91405 Orsay cedex
Tel. +33 (0) 1 69 15 80 15




LIMSI in figures

8 research teams
100 researchers and faculty members
40 engineers and technicians
60 PhD students
70 interns

Université Paris-Sud

Paris-Saclay