Information Extraction, Focused Information Retrieval, and Question-Answering
Permanent staff: S. Ghannay, C. Grouin, T. Hamon, G. Illouz, T. Lavergne, AL. Ligozat, A. Névéol, S. Rosset, P. Zweigenbaum, with contributions by P. Paroubek and A. Vilnat.
PhD students: H Boulanger, O Cattan, JM Coria, H El Boukkouri, L Galmant (ILES/TLP), C Masson, N Paris, TF Randriatsitohaina, LP Schaub, M Véron
The methods developed in this theme address two goals.
The first goal is focused on the recognition of precise information in texts, with two main fields of study:
- Information extraction: recognition and typing of targeted information in texts (such as entity and relation extraction) in order to build knowledge bases or analyze texts.
- Focused information retrieval: locating target information in documents or knowledge bases in order to answer a query or natural language question.
The second goal is to model the processes by which natural language is used to query machines, in the context of personal assistants or information retrieval, either in specialized domains (e.g. on a commercial site or in scientific texts) or in the open domain (e.g. search in a knowledge base or encyclopedia).
Topics of interest include:
- Named-entity recognition in general and specialized domains (mainly biomedical): recognition of complex named-entity types while dealing with lexical sparsity
- Relation extraction in general and specialized domains (mainly biomedical), with supervised and unsupervised approaches, based on surface information and structured representations
- Recognition of events and temporal information, automatic creation of timelines
- Opinion mining
- Semantic inference and reasoning for answering questions by searching a text or knowledge base
- Modeling of human-computer interaction in natural language, dialogue systems