Natalia TOMASHENKO (Laboratoire d’Informatique de l’Université du Maine, Le Mans Université)

14 November 2017, 11:30 a.m.

Speaker adaptation of DNN acoustic models using Gaussian mixture model framework in ASR systems

Adaptation is an efficient way to reduce the mismatch between models and the data of a particular speaker or channel in automatic speech recognition (ASR) systems. In this work, we present a novel speaker adaptation method for deep neural network (DNN) acoustic models. The proposed approach uses so-called GMM-derived features as input to a DNN. Processing features in this way makes it possible to apply GMM-HMM adaptation algorithms within the neural network framework. Adaptation to a new speaker is performed by adapting the auxiliary GMM-HMM model used to compute the GMM-derived features, and can be regarded as feature-space adaptation for the DNN system. The proposed approach is explored within various state-of-the-art ASR systems and is shown to be effective in comparison with other speaker adaptation techniques, and complementary to them.
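To make the idea concrete, the following is a minimal sketch of one way GMM-derived features and their adaptation could look. It is not the speaker's implementation. It assumes a single diagonal-covariance GMM in place of the auxiliary GMM-HMM, uses per-component log-likelihoods as the features, and uses MAP re-estimation of the means as the adaptation algorithm. The function names and the `tau` prior weight are hypothetical.

```python
import numpy as np


def log_gauss_diag(x, means, variances):
    """Log-density of every frame under every diagonal Gaussian.

    x: (T, D) frames; means, variances: (K, D). Returns a (T, K) array.
    """
    diff = x[:, None, :] - means[None, :, :]
    log_norm = np.sum(np.log(2.0 * np.pi * variances), axis=1)[None, :]
    mahalanobis = np.sum(diff ** 2 / variances[None, :, :], axis=2)
    return -0.5 * (log_norm + mahalanobis)


def gmm_derived_features(x, means, variances):
    """Assumed feature definition: one log-likelihood per auxiliary component.

    The resulting (T, K) matrix would be the input to the DNN.
    """
    return log_gauss_diag(x, means, variances)


def map_adapt_means(x, means, variances, weights, tau=10.0):
    """MAP re-estimation of the component means from adaptation frames.

    tau (hypothetical default) controls how strongly the original,
    speaker-independent means are trusted relative to the new data.
    """
    log_post = np.log(weights)[None, :] + log_gauss_diag(x, means, variances)
    log_post -= np.logaddexp.reduce(log_post, axis=1, keepdims=True)
    gamma = np.exp(log_post)          # (T, K) component posteriors
    occupancy = gamma.sum(axis=0)     # (K,) soft frame counts
    first_order = gamma.T @ x         # (K, D) posterior-weighted sums
    return (tau * means + first_order) / (tau + occupancy)[:, None]
```

In this sketch the DNN itself is never retrained for a new speaker: only the auxiliary model's means change, and the features are recomputed with `gmm_derived_features(x, adapted_means, variances)`. That is why the adaptation can be viewed as acting in the feature space of the DNN.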

Campus universitaire bât 507
Rue du Belvedère
F - 91405 Orsay cedex
Tel. +33 (0) 1 69 15 80 15

