TLP - Affective and social dimensions of spoken interactions

Members : Laurence Devillers, Gilles Adda, Claude Barras, Eric Bilinski (P2I), Joseph Mariani, Sophie Rosset (ILES), Ioana Vasilescu and Caroline Etienne.

We aim to train intelligent machines that can deal with the affective and social dimensions of interaction with humans living in a physical and social space.

Research challenges

• Fusion between linguistic and paralinguistic channels, multimodal fusion

• Cross-corpora experiments, adaptation learning techniques

• Affective dialog systems, long-term relationships

• Social interaction with robots, ethics and affective robotics

Affective and social dimension detection is being applied both to human-machine interaction with robots and to the analysis of audiovisual and audio documents such as call center data. The main research subjects in this area are the identification of emotion and social cues in human-robot interaction, emotion detection based on verbal and non-verbal cues (acoustic, visual and multimodal), dynamic user profiles (emotional and interactional dimensions) in dialog for assistive robotics, and multimodal detection of anxiety applied to therapeutic serious games.

In order to design affective interactive systems, experimental grounding is required to study expressions of emotion and social cues during interaction. Unlike emotions, socio-cultural cues are voluntarily controlled. In human interaction, nonverbal elements such as gestures, facial expressions and paralinguistic cues are valuable for a more precise understanding of the communicated message. Voice and speech play a fundamental role in social interactions, but in recent years they have been relatively neglected compared to other aspects of social exchanges such as facial expressions or gestures.

There is a tendency within the area of emotion-oriented computing to use very exaggerated and unnatural emotional data portrayed by actors. It seems increasingly clear that this strategy is not effective, because the forms of expression that occur in natural interactions are fundamentally different from those that actors generate on command. Since 2001, the work on speech presented in this theme has been based on genuinely naturalistic material. The team was one of the first to grasp the issue, and is one of a very small number of research groups that have consistently taken on the challenge of finding, annotating and analysing databases of real-life emotional data.

The team has collected and analysed emotional speech databases in financial consultations, calls for medical help and human-robot interactions. Studies were conducted on various levels of fear (stress, anxiety, fear, panic), of anger (annoyance, anger), of sadness (disappointment, sadness, depression) and of positive feelings (relief, satisfaction, enjoyment, pride). To make sense of this extensive data, we have developed analysis techniques that extract spectral, prosodic and affect-burst markers, together with automatic emotion detection systems based on machine learning techniques such as Support Vector Machines (SVM). Recent comparisons show that these systems are on a par with those developed by other members of the international community.

A social robot sensitive to emotions should not take only momentary emotions into account, but should also maintain a representation of the user's emotional and interactional profile across interactions, so that its behavioural responses have a better chance of being relevant. As a first step, we have studied how paralinguistic cues affect human-robot interaction, by linking the low-level cues computed from speech to an emotional and interactional profile of the user. Being able to predict which specific behaviour is likely to trigger pleasure in the user is a plus. For example, someone dominant and highly self-confident will not need to be encouraged to interact, and such encouragement could even be perceived as irrelevant or boring. The system would provide a closed interaction loop, in which the robot reacts to the emotional message of the human and triggers an emotional response in the human through relevant, carefully chosen behaviours. There are many cases where voice is not the only cue available to identify emotions and social stances. As next steps in our research, we propose to extract multimodal dimensions using gaze tracking (with a webcam), posture detection (with a Kinect 3D sensor) and a few physiological cues such as EEG measured with non-invasive sensors, in order to improve the performance of our systems.
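The idea of a profile maintained across interactions can be illustrated with a minimal sketch. It is not the team's implementation: the dimension names (`dominance`, `self_confidence`), the exponential smoothing, the factor `alpha` and the threshold are all assumptions chosen for illustration, and the per-turn cue scores are assumed to be already computed from speech and normalised to [0, 1].

```python
from dataclasses import dataclass, field


@dataclass
class UserProfile:
    """Running emotional and interactional profile (hypothetical dimensions)."""

    alpha: float = 0.3  # assumed weight given to the newest turn
    scores: dict = field(default_factory=dict)

    def update(self, turn_cues: dict) -> None:
        """Blend per-turn cue estimates (each in [0, 1]) into the profile."""
        for name, value in turn_cues.items():
            # The first observation of a dimension initialises it directly.
            previous = self.scores.get(name, value)
            self.scores[name] = (1 - self.alpha) * previous + self.alpha * value


def should_encourage(profile: UserProfile, threshold: float = 0.6) -> bool:
    """Skip encouragement for users who appear dominant and self-confident."""
    dominance = profile.scores.get("dominance", 0.0)
    confidence = profile.scores.get("self_confidence", 0.0)
    return not (dominance >= threshold and confidence >= threshold)
```

Under these assumptions, the robot would call `update` after each user turn and consult `should_encourage` before selecting a behaviour, so that a single atypical turn shifts the profile only gradually.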

Research subjects developed in this area are: speaker and emotion identification in human-robot interaction; emotion detection for analysing the quality of client/agent interaction in call center data; engagement in human-robot interaction; emotion detection based on acoustic, visual and physiological cues for assistive robotics; and multimodal detection of anxiety for the design of a serious game with a therapeutic purpose.

Applications and projects

The detection of affective and social dimensions can be used for human-machine communication with robots, but also for the analysis of audiovisual documents, with applications in health, security, education, entertainment and serious games.

Robotics is a relevant framework for assistive applications because of the learning abilities and skills of robots. Human-robot interaction is a hot topic in robotics. This broad research area, encompassing social interaction between robots and humans, poses many challenges to the community. In the near future, socially assistive robotics aims to address critical areas and gaps in care by automating the supervision, coaching, motivation and companionship aspects of one-to-one interactions with individuals from various large and growing populations, including the elderly, children, disabled people and individuals with social phobias, among many others.

The ethical issues, including the safety, privacy and dependability of robot behaviour, are also more and more widely discussed. Broader ethical reflection must therefore accompany the scientific and technological development of robots, to ensure that their relationship with human beings is harmonious and acceptable. We are also involved in the working group on ethics in robotics research of CERNA (the Allistene committee on the ethics of research in digital sciences and technologies).

Ethics, Goals and Societal impact in Affective Computing is also a central subject of the AAAC SIG Ethics (L. Devillers, B. Schuller (Imperial College London, UK)). The AAAC is a professional, world-wide association for researchers in Affective Computing, Emotions and Human-Machine Interaction. The AAAC, formerly the HUMAINE Association, was founded in June 2007. The ambition of the SIG Ethics is to collect the main questions on ethics, goals and societal impact raised within the AAAC community.

The human-machine co-evolution pole of the Digital Society Institute at Paris-Saclay (L. Devillers, Ch. Licoppe (I3 Telecom ParisTech)) builds on the idea that intelligence is distributed between users and machines (robotics, connected objects (quantified self), smart homes, etc.). Seen from this perspective, the user learns to use the machine at the same time as the machine adapts itself to the user, which raises questions of acceptability, interaction design and ethics. The joint evolution of technologies and users requires collaboration between researchers in ICT (STIC) and in the humanities and social sciences (SHS) throughout the projects, from their design to their evaluation.

ISN symposium of the human-machine co-evolution pole, 9-10 April 2015

Social robotics projects

• ISN TE2R (2015-16): Tracks, explanations and responsibility of the robot - The big challenges of robotics in society: understanding and building the digital society. Led by the CERDI laboratory (Université Paris-Sud) and by LIMSI-CNRS.

• ISN Engagement in a social interaction with Robots (2015-16): robot, human, quantified self - Experimenting with innovative ways to strengthen the stimulating power of human-robot interaction and the attachment it creates. Led by LIMSI-CNRS and I3-Telecom-Paris.

  • ROMEO2 (Humanoid robot assistant): We participated in the Cap Digital FUI ROMEO project (2009-2012) and now participate in BPI PSPC ROMEO2 (2013-17), whose main goal is to build a social humanoid robot, a big brother of the NAO robot developed by Aldebaran Robotics, that can act as a comprehensive assistant to help people suffering from a loss of autonomy. We are a member of the ROMEO Social Committee, which aims to provide a societal vision on the design of the robot.

Collaboration with Aldebaran-Robotics (R. Gelin), Spirops (A. Buendia) 

  • EU CHIST-ERA JOKER (JOKe and Empathy of a Robot/ECA) (2013-17) (Leader: L. Devillers): Towards social and affective relations with a robot. JOKER will emphasize the fusion of verbal and non-verbal channels for emotional and social behavior perception, interaction and generation capabilities. Our paradigm invokes two types of decision: intuitive (mainly based upon non-verbal multimodal cues) and cognitive (based upon fusion of semantic and contextual information with non-verbal multimodal cues). The intuitive type will be used dynamically in the interaction at the non-verbal level (empathic behavior: synchrony of mimics such as smiles and nods) but also at the verbal level for reflex small talk (politeness behavior: verbal synchrony with hello, how are you, thanks, etc.). Cognitive decisions will be used for reasoning on the strategy of the dialog and deciding on more complex social behaviors (humor, compassion, white lies, etc.), taking into account the user profile and contextual information.

Collaboration with the LIUM (Y. Esteve), Koç University (M. Sezgin), Dublin University (N. Campbell), UMONS (S. Dupont)

  • ANR Tecsan ARMEN project (2010-13): we were involved in designing and building an assistive robot to help elderly people remain in their usual living environment.

Collaboration with the CEA (Ch. Leroux)

Projects with a therapeutic purpose

  • FEDER E-THERAPY (2012-2015): design of immersive serious games with a therapeutic purpose, based on verbal and non-verbal interaction and on the technique of role playing. We focus on the automatic recognition of human stress during stress-inducing interactions (public speaking, job interviews and serious games), using audio and visual cues. Stress expression and coping are influenced both by interpersonal differences (personality traits, past experiences, cultural background) and contextual differences (type of stressor, stakes of the situation). We evaluated stress in various populations in data corpora collected during this project: social phobics in anxiety-inducing situations in interaction with a machine and with humans; non-pathological subjects in a mock job interview; and non-pathological subjects interacting with a computer and with the humanoid robot Nao. Inter-individual and inter-corpora comparisons highlight the variability of stress expression. A possible application of this work could be the development of therapeutic software for learning stress coping strategies, particularly for social phobics.

Cooperation with A. Pelissolo at Pitié Salpêtrière hospital

Affective computing projects

  • ANR TECSAN COMPARSE (2011-15): Emotion(s), cognition, comportement (EMCO). Study of the relationships between cognition, motivation and personality, for emotional adaptation and regulation, using empathic virtual simulation - The main outcome will be a model of the relationship between emotion and cognition, namely which component parts mediate the relationship (e.g. emotional regulation, self-efficacy precepts) and which parts (personality traits) moderate the overall effects on performance. The applications of the project will be directed towards e-learning as well as the cognitive remediation of altered functions when working with patients presenting a dysfunction of the emotion-cognition links.

Collaboration with CPU team at LIMSI-CNRS and STAPS (Paris-Saclay).

  • HUMAINE (Human-Machine Interaction Network on Emotion) (2004-08) : a Network of Excellence in the EU's Sixth Framework Programme. 33 partners from 14 countries participated in the network. HUMAINE aims to lay the foundations for European development of systems that can register, model and/or influence human emotional and emotion-related states and processes - 'emotion-oriented systems'. It identifies six thematic areas that cut across traditional groupings and offer a framework for an appropriate division of labour - theory of emotion; signal/sign interfaces; the structure of emotionally coloured interactions; emotion in cognition and action; emotion in communication and persuasion; and usability of emotion-oriented systems.

Collaboration with UNIGE (K. Scherer), with Erlangen University (A. Batliner), with Paris VIII University (C. Pelachaud), with Belfast University (R. Cowie)

  • ANR Affective Avatars project (2004-07) (Leader: L. Devillers): The goal of the Affective Avatars project is to create affective avatars in real time; the avatar's expressiveness is controlled by the user's voice. The expressive and emotional parameters extracted from the real-time processing of the vocal signal are used to drive the expressiveness of the lips, faces and bodies of the avatars. The user's vocal pitch is also transformed in real time, giving the avatar a voice that is in keeping with its image. Four scientific/technological issues will be addressed in this project: detection of emotions in the human voice, modeling physical expressiveness (bodytalk), transformation of vocal timbre in real time, and modeling a catalogue of vocal profiles with expressiveness and multimodal cohesion.

Collaboration with SME Voxler (N. Delorme), with IRCAM (X. Rodet)

Projects on emotions with call centers

  • Cap Digital FUI VoxFactory project (2009-11): speech analytics. We aim to analyse the quality of client/agent interactions in call center data via automatic emotion and sentiment detection and analysis.
  • CHIL (Computers in the Human Interaction Loop) (2004-07): multimodal perceptual user interfaces capable of tracking, identifying, recognizing and understanding emotions.
  • Amities (Automated Multilingual Interaction with Information and Services) (2001-04) : emotion detection in call centers





