Tim Miller Seminars

In June and July, LIMSI is hosting Timothy Miller as a Digicosme invited professor in Natural Language Processing. Please find below the program of the seminar series that Tim will give during his stay. You are all welcome to attend and meet him.

[EDIT 09/08/2017: link to the presentations: https://www.dropbox.com/sh/54qeosj43kkypwm/AABX8gdtYBFgcnSDSg0q_iN5a?dl=0]

Mini-bio: Timothy Miller, PhD, is a scientist at the Computational Health Informatics Program (CHIP) at Boston Children's Hospital and an Instructor at Harvard Medical School. His research background is in computer science, with his thesis (2010) describing linear-time syntactic models for speech repair. In his current position, he works on a variety of clinical natural language processing problems. He has made core contributions in temporal information extraction (Lin et al., 2014; Miller et al., 2013; Miller et al., 2015), UMLS relation extraction (Dligach et al., 2013), coreference resolution (Miller et al., 2012; Zheng et al., 2012; Miller et al., 2017a), and negation detection (Wu et al., 2014; Miller et al., 2017b). He is also a primary contributor to open source projects, including Apache cTAKES (clinical Text Analysis and Knowledge Extraction System) and ClearTK. He is currently interested in Bayesian grammar induction, temporal information extraction in the clinical domain, and domain adaptation for clinical NLP.

Bayesian Methods for Unsupervised Multilingual Grammar Induction

GT TSDT seminar, 6 June, 14:00, LIMSI, building 508: Introduction to sequence models for Natural Language Processing

- Sequence models for NLP – Focus on hidden Markov models (HMMs), including the inference techniques that form the core of the method, and on their more complex siblings, hierarchical HMMs (HHMMs).
- Optimizing HMM inference with modern GPU hardware
- Alternative sequence models, including CRFs and RNNs

TLP seminar, 13 June, 11:30, LIMSI, building 508: Linear-time parsing with HHMMs

- Parsing strategies – An introduction to bottom-up, top-down, and right-corner parsing, from a psycholinguistic perspective.
- Linear-time parsing with HHMMs in a supervised machine learning framework

CEA-LIST seminar, 22 June, 10:00, CEA, amphitheater 34, building 862: Bayesian inference for unsupervised POS tagging and parsing

- Bayesian inference for unsupervised POS tagging with HMMs, and unsupervised parsing with HHMMs

Topics in Clinical NLP

ILES seminar, 4 July, 14:00, LIMSI, building 508: Generalizability and domain adaptation in clinical NLP

- Overview of pipeline approaches to clinical NLP
- Evidence from multiple tasks that performance degrades across domains
- Introduction to unsupervised domain adaptation algorithms
- Preliminary work on domain adaptation for negation extraction

GT D2K seminar, 12 July, 9:00, LRI (PCRI building), room 455: Coreference resolution: state of the art and application to biomedical text

- Problem description, early systems, and applications – What is coreference, why is it important, what are some of the early methods, and what are some important use cases that rely on solving the coreference resolution problem?
- Machine learning approaches – An overview of common machine learning approaches for the task, including pairwise, mention-synchronous, agglomerative clustering, and easy-first methods, as well as some unsupervised approaches.
- Biomedical coreference resolution – Domain-specific issues with solving coreference, as well as an introduction to domain-specific resources that are available for the task.
- Future directions for coreference research – An introduction to hot topics in coreference resolution, including search-based learning, neural-network-based representation learning, and cross-document coreference, with suggestions for how these methods can be applied to biomedical texts.