Laboratoire d'Informatique pour la Mécanique et les Sciences de l'Ingénieur
 

Spoken Language Processing Group (TLP)

Machine Translation @ LIMSI



SMT@LIMSI > Research themes

Research on machine translation is primarily oriented towards improving existing statistical machine translation (SMT) systems, or more generally data-driven machine translation engines. In a nutshell, SMT systems rely on the statistical analysis of large bilingual corpora to train stochastic models of the mapping between a source and a target language. In their simplest form, these models correspond to probabilistic rational relations between source and target strings of words, as initially formulated in the famous IBM models in the early nineties. More recently, these models have been extended to capture more complex representations (e.g., chunks, trees, or dependency structures) and the possible probabilistic relationships between these representations. Such models are typically trained from parallel corpora, i.e., from examples of source texts aligned with their translation(s), where the alignment is typically defined at the subsentential level.
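
As an illustration of how such word-level models can be trained from sentence-aligned data, here is a minimal sketch of expectation-maximization for IBM Model 1, the simplest of the IBM models. It is not LIMSI's code: the function name `train_ibm1`, the toy corpus `TOY_BITEXT`, and the omission of the NULL source word are all simplifying assumptions made for this example.

```python
from collections import defaultdict

# Hypothetical two-sentence parallel corpus: (source tokens, target tokens).
TOY_BITEXT = [
    (["la", "maison"], ["the", "house"]),
    (["la", "fleur"], ["the", "flower"]),
]


def train_ibm1(bitext, iterations=10):
    """Estimate t(target_word | source_word) with EM (IBM Model 1, no NULL word)."""
    target_vocab = {w for _, tgt in bitext for w in tgt}
    # Uniform initialisation of the lexical translation table.
    t = defaultdict(lambda: 1.0 / len(target_vocab))
    for _ in range(iterations):
        count = defaultdict(float)  # expected count of (target, source) links
        total = defaultdict(float)  # expected count of each source word
        for src, tgt in bitext:
            for f in tgt:
                # E-step: share one unit of mass for f among the source words.
                norm = sum(t[(f, e)] for e in src)
                for e in src:
                    frac = t[(f, e)] / norm
                    count[(f, e)] += frac
                    total[e] += frac
        # M-step: renormalise expected counts into probabilities.
        for (f, e), c in count.items():
            t[(f, e)] = c / total[e]
    return t


table = train_ibm1(TOY_BITEXT)
print(table[("the", "la")], table[("house", "la")])
```

On this toy corpus, "la" co-occurs with "the" in both sentence pairs, so the probability of translating "la" as "the" grows with each iteration at the expense of "house" and "flower".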

In this context, LIMSI is developing its research activities in several directions, from the design of word and phrase alignment models to the design of novel translation or language models, and from the exploration of new training or tuning methodologies to the development of new decoding strategies. All these innovations need to be evaluated and diagnosed, and we also devote a significant fraction of our efforts to the vexing issue of measuring the quality of MT outputs. These activities have been published in a number of international conferences and journals (see the Publications section). Finally, we are involved in a number of national and international projects (see the Projects section below).

Regarding alignment models, most of our recent work deals with the design and training of discriminative alignment techniques (Tomeh et al, 2011a, 2011b, 2010b; Allauzen & Wisniewski, 2009) to be used either to actually compute word alignments, to symmetrize existing word alignments, or to refine the extraction process. Recent work (Lardilleux et al, 2011) explores alternative alignment techniques, based on a phrase association measure.

Our main decoder, N-code, belongs to the class of n-gram-based systems. In a nutshell, these systems define translation as a two-step process: an input source sentence is first reordered non-deterministically, yielding an input word lattice containing several possible reorderings. This lattice is then translated monotonically using a bilingual n-gram model; as in the more standard approach, hypotheses are scored using a battery of probabilistic models, whose weights are tuned with minimum error rate training. Recent evolutions of this approach are described in (Crego & Yvon, 2009, 2010a, 2010b). This system is now released as open source software (see the N-code web pages); an online demo is also available. As an alternative training strategy, we have recently proposed a CRF-based translation model (Lavergne et al, 2011).
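
The two-step process described above can be sketched as follows. This is a toy illustration and not the N-code implementation: the tuple inventory `TUPLES`, the bigram scores in `BIGRAM_LOGPROB`, the `max_jump` reordering limit, and all function names are hypothetical. The sketch also enumerates reorderings explicitly instead of packing them into a lattice, and scores with a single bilingual bigram model rather than a tuned combination of models.

```python
import itertools
import math

# Hypothetical tuple inventory: each source word pairs with one target word.
TUPLES = {"la": "the", "maison": "house", "bleue": "blue"}

# Hypothetical bilingual bigram log-probabilities over (source, target) tuples.
BIGRAM_LOGPROB = {
    ("<s>", ("la", "the")): math.log(0.9),
    (("la", "the"), ("bleue", "blue")): math.log(0.6),
    (("la", "the"), ("maison", "house")): math.log(0.3),
    (("bleue", "blue"), ("maison", "house")): math.log(0.8),
    (("maison", "house"), ("bleue", "blue")): math.log(0.1),
}
UNSEEN = math.log(1e-4)  # crude floor for tuple bigrams missing from the table


def reorderings(words, max_jump=1):
    """Step 1: enumerate source orders in which no word moves more than max_jump positions."""
    for perm in itertools.permutations(range(len(words))):
        if all(abs(new - old) <= max_jump for new, old in enumerate(perm)):
            yield [words[i] for i in perm]


def score(source_order):
    """Step 2: read the reordered source left to right and sum tuple bigram log-probs."""
    history, total = "<s>", 0.0
    for word in source_order:
        unit = (word, TUPLES[word])
        total += BIGRAM_LOGPROB.get((history, unit), UNSEEN)
        history = unit
    return total


def translate(sentence):
    """Return the monotone translation of the best-scoring reordering."""
    best = max(reorderings(sentence.split()), key=score)
    return " ".join(TUPLES[w] for w in best)


print(translate("la maison bleue"))
```

Because the bilingual model is estimated over tuples in target-like order, the reordering "la bleue maison" scores higher than the original source order, and translating it monotonically yields the adjective-noun order expected in English.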

Our activities are not restricted to these core modules of SMT systems: we are also investigating many other aspects, such as tuning (Sokolov & Yvon, 2011), multi-source machine translation (Crego & al, 2010a, 2010b), evaluation of MT (Max & al, 2010; Wisniewski & al, 2010), and extraction of parallel sentences from comparable corpora (Gahbiche-Braham & al, 2011).

Finally, our activities in SMT are closely related to the work carried out on language modeling, a theme to which LIMSI has been contributing for many years. A major recent contribution is the work on neural network language models, initiated in (Gauvain & Schwenk, 2002) and recently revisited in (Le & al, 2010, 2011, 2012).

Our research activities are conducted in close relationship with several academic and industrial partners in the context of several national and international projects. A partial list of these projects is given below.

LIMSI's systems have taken part in several international MT evaluation campaigns. These include yearly participation in the WMT evaluation series (2006-2012), where LIMSI has consistently been among the top-ranking systems, especially for translation into French. We have also taken part in the 2009 NIST MT evaluation for the Arabic-English task, as well as the IWSLT evaluations in 2010 and 2011.

LIMSI has recently been actively involved in the organization of various scientific events: EAMT 2010 in St Raphaël and IWSLT 2010 in Paris, as well as the Tralogy series.

SMT@LIMSI > Publications

2013

Journal Papers

  • Marianna Apidianaki, Nikola Ljubesic and Darja Fiser. Vector disambiguation for translation extraction from comparable corpora. Informatica, 37(2):193-201, 2013.
  • Adrien Lardilleux, François Yvon, Yves Lepage. Generalizing sampling-based multilingual alignment. Machine Translation, 27(1):1-23, 2013. download
  • Hai-Son Le, Ilya Oparin, Alexandre Allauzen, Jean-Luc Gauvain, François Yvon. Structured Output Layer Neural Network Language Models for Speech Recognition. Audio, Speech, and Language Processing, IEEE Transactions on, 21(1):197-206, 2013. download
  • Nadi Tomeh, Alexandre Allauzen, François Yvon. Maximum-entropy word alignment and posterior-based phrase extraction for machine translation. Machine Translation, pages 1-38, 2013. download
  • Guillaume Wisniewski, François Yvon. Oracle decoding as a new way to analyze phrase-based machine translation. Machine Translation, 28(2):1-24, 2013. download
  • Guillaume Wisniewski, AnilKumar Singh, François Yvon. Quality estimation for machine translation: some lessons learned. Machine Translation, pages 1-26, 2013. download

International Conferences

  • Marianna Apidianaki, Nikola Ljubesic and Darja Fiser. Cross-lingual WSD for Translation Extraction from Comparable Corpora. In 6th Workshop on Building and Using Comparable Corpora (BUCC), Sofia, Bulgaria, 2013.
  • Marianna Apidianaki. Cross-lingual Word Sense Disambiguation using Translation Sense Clustering. In 7th International Workshop on Semantic Evaluation (SemEval 2013), Atlanta, Georgia, US, 2013.
  • Alexandre Allauzen, Nicolas Pécheux, Quoc Khanh Do, Marco Dinarelli, Thomas Lavergne, Aurélien Max, Hai-Son Le, François Yvon. LIMSI @ WMT13. In Proceedings of the Eighth Workshop on Statistical Machine Translation, Pages 62-69, Sofia, Bulgaria, August 2013. download
  • Stephan Peitz, Saab Mansour, Matthias Huck, Markus Freitag, Hermann Ney, Eunah Cho, Teresa Herrmann, Mohammed Mediani, Jan Niehues, Alex Waibel, Alexandre Allauzen, Quoc Khanh Do, Bianka Buschbeck, Tonio Wandmacher. Joint WMT 2013 Submission of the QUAERO Project. In Proceedings of the Eighth Workshop on Statistical Machine Translation, Pages 185-192, Sofia, Bulgaria, August 2013. download
  • Anil Kumar Singh, Guillaume Wisniewski, François Yvon. LIMSI Submission for the WMT'13 Quality Estimation Task: an Experiment with N-Gram Posteriors. In Proceedings of the Eighth Workshop on Statistical Machine Translation, Pages 398-404, Sofia, Bulgaria, August 2013. download
  • Guillaume Wisniewski, François Yvon. Fast large-margin learning for statistical machine translation. In International Conference on Intelligent Text Processing and Computational Linguistics (CICLing 2013), Samos, Greece, 2013.
  • Guillaume Wisniewski, Anil Kumar Singh, Natalia Segal, François Yvon. Design and Analysis of a Large Corpus of Post-Edited Translations: Quality Estimation, Failure Analysis and the Variability of Post-Edition. In Machine Translation Summit (MT Summit 2013), Pages 117-124, Nice, France, 2013.

National Conferences

  • Guillaume Wisniewski, François Yvon. La tâche de prédiction de qualité. In Proceedings of Tralogy II, Pages 1-16, Paris, France, 2013.
  • Thomas Lavergne, Alexandre Allauzen, François Yvon. Un cadre d'apprentissage intégralement discriminant pour la traduction statistique. In Conférence sur le Traitement Automatique des Langues Naturelles (TALN 2013), Les Sables d'Olonne, France, 2013.
  • Guillaume Wisniewski, Anil Kumar Singh, Natalia Segal, François Yvon. Un corpus d'erreurs de traduction. In Conférence sur le Traitement Automatique des Langues Naturelles (TALN 2013), Les Sables d'Olonne, France, 2013.

2012

Book Chapters

  • Alexandre Allauzen, François Yvon. Statistical Methods for Machine Translation. In Textual Information Access, Eric Gaussier, François Yvon (eds.), Chap. 7, pp. 223-304, ISTE/Wiley, Paris, 2012.

International Conferences

  • Wang Ling, Nadi Tomeh, Guang Xiang, Alan Black, Isabel Trancoso. Improving Relative-Entropy Pruning using Statistical Significance. In Proceedings of the 24th International Conference on Computational Linguistics (COLING-2012), Mumbai, India, 8-15 December 2012.
  • Marianna Apidianaki. Measuring the adequacy of cross-lingual paraphrases in a Machine Translation setting. In Proceedings of the 24th International Conference on Computational Linguistics (COLING-2012), Pages 63-72, Mumbai, India, 8-15 December 2012.
  • Artem Sokolov, Guillaume Wisniewski, François Yvon. Non-linear n-best List Reranking with Few Features. In Proceedings of the Conference of the Association for Machine Translation in the Americas (AMTA), San Diego (CA), 2012.
  • Marianna Apidianaki, Guillaume Wisniewski, Artem Sokolov, Aurélien Max, François Yvon. WSD for n-best reranking and local language modeling in SMT. In Proceedings of the Sixth Workshop on Syntax, Semantics and Structure in Statistical Translation, Pages 1-9, Jeju, Republic of Korea, July 2012.
  • Artem Sokolov. LIMSI: Learning Semantic Similarity by Selecting Random Word Subsets. In *SEM 2012: The First Joint Conference on Lexical and Computational Semantics - Volume 1: Proceedings of the main conference and the shared task, and Volume 2: Proceedings of the Sixth International Workshop on Semantic Evaluation (SemEval 2012), Pages 543-546, Montréal, Canada, 2012. download
  • Markus Freitag, Stephan Peitz, Matthias Huck, Hermann Ney, Jan Niehues, Teresa Herrmann, Alex Waibel, Le Hai-Son, Thomas Lavergne, Alexandre Allauzen, Bianka Buschbeck, Josep Maria Crego, Jean Senellart. Joint WMT 2012 Submission of the QUAERO Project. In Proceedings of the Seventh Workshop on Statistical Machine Translation, Pages 322-329, Montréal, Canada, June 2012. download
  • Hai-Son Le, Alexandre Allauzen, François Yvon. Measuring the Influence of Long Range Dependencies with Neural Network Language Models. In Proceedings of the NAACL-HLT 2012 Workshop: Will We Ever Really Replace the N-gram Model? On the Future of Language Modeling for HLT, Pages 1-10, Montréal, Canada, 2012. download
  • Hai-Son Le, Thomas Lavergne, Alexandre Allauzen, Marianna Apidianaki, Li Gong, Aurélien Max, Artem Sokolov, Guillaume Wisniewski, François Yvon. LIMSI @ WMT12. In Proceedings of the Seventh Workshop on Statistical Machine Translation, Pages 330-337, Montréal, Canada, 2012. download
  • Qian Yu, Aurélien Max, François Yvon. Aligning Bilingual Literary Works: a Pilot Study. In Proceedings of the NAACL-HLT 2012 Workshop on Computational Linguistics for Literature, Pages 36-44, Montréal, Canada, 2012. download
  • Yong Zhuang, Guillaume Wisniewski, François Yvon. Non-Linear Models for Confidence Estimation. In Proceedings of the Seventh Workshop on Statistical Machine Translation, Pages 157-162, Montréal, Canada, 2012. download
  • Hai-Son Le, Alexandre Allauzen, François Yvon. Continuous Space Translation Models with Neural Networks. In Proceedings of the 2012 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Pages 39-48, Montréal, Canada, June 2012. download
  • Adrien Lardilleux, François Yvon, Yves Lepage. Hierarchical Sub-sentential Alignment with Anymalign. In Proceedings of the annual meeting of the European Association for Machine Translation, 2012.
  • Qian Yu, Aurélien Max, François Yvon. Revisiting sentence alignment algorithms for alignment visualization and evaluation. In Proceedings of the 5th Workshop on Building and Using Comparable Corpora, Istanbul, Turkey, 2012.
  • Artem Sokolov, Guillaume Wisniewski, François Yvon. Computing Lattice BLEU Oracle Scores for Machine Translation. In Proceedings of the 13th Conference of the European Chapter of the Association for Computational Linguistics, Pages 120-129, Avignon, France, April 2012. download

National Conferences

  • Souhir Gahbiche-Braham, Hélène Bonneau-Maynard, Thomas Lavergne, François Yvon. Repérage des entités nommées pour l'arabe : adaptation non-supervisée et combinaison de systèmes. In Actes de la conférence conjointe JEP-TALN-RECITAL 2012, volume 2: TALN, Pages 487-494, Grenoble, France, June 2012. download
  • Adrien Lardilleux, François Yvon, Yves Lepage. Alignement sous-phrastique hiérarchique avec Anymalign. In Actes de la conférence conjointe JEP-TALN-RECITAL 2012, volume 2: TALN, Pages 113-126, Grenoble, France, June 2012. download

2011

Book Chapters

  • Alexandre Allauzen, François Yvon. Méthodes statistiques pour la traduction automatique. In Modèles statistiques pour l'accès à l'information textuelle, Eric Gaussier, François Yvon (eds.), Chap. 7, pp. 271-356, Hermès, Paris, 2011.

Journals

  • Josep Maria Crego, José M. Mariño, François Yvon. N-code: an open-source Bilingual N-gram SMT Toolkit. Prague Bulletin of Mathematical Linguistics, 96: pages 49-58, 2011. link
  • Adrien Lardilleux, Yves Lepage, François Yvon. The Contribution of Low Frequencies to Multilingual Sub-sentential Alignment: a Differential Associative Approach. International Journal of Advanced Intelligence, 3(2):189-217, 2011.

International Conferences

  • Thomas Lavergne, Hai-Son Le, Alexandre Allauzen, François Yvon. LIMSI's experiments in domain adaptation for IWSLT11. In Proceedings of the eighth International Workshop on Spoken Language Translation (IWSLT), Mei-Yuh Hwang, Sebastian Stüker (eds.), San Francisco, CA, 2011.
  • Nadi Tomeh, Marco Turchi, Guillaume Wisniewski, Alexandre Allauzen, François Yvon. How Good Are Your Phrases? Assessing Phrase Quality with Single Class Classification. In Proceedings of the eighth International Workshop on Spoken Language Translation (IWSLT), Mei-Yuh Hwang, Sebastian Stüker (eds.), San Francisco, CA, 2011.
  • Alexandre Allauzen, Hélène Bonneau-Maynard, Hai-Son Le, Aurélien Max, Guillaume Wisniewski, François Yvon, Gilles Adda, Josep Maria Crego, Adrien Lardilleux, Thomas Lavergne, Artem Sokolov. LIMSI @ WMT11. In Proceedings of the Sixth Workshop on Statistical Machine Translation, Pages 309-315, Edinburgh, Scotland, 2011. download
  • Markus Freitag, Gregor Leusch, Joern Wuebker, Stephan Peitz, Hermann Ney, Teresa Herrmann, Jan Niehues, Alex Waibel, Alexandre Allauzen, Gilles Adda, Josep Maria Crego, Bianka Buschbeck, Tonio Wandmacher, Jean Senellart. Joint WMT Submission of the QUAERO Project. In Proceedings of the Sixth Workshop on Statistical Machine Translation, Pages 358-364, Edinburgh, Scotland, 2011. download
  • Souhir Gahbiche-Braham, Hélène Bonneau-Maynard, François Yvon. Two Ways to Use a Noisy Parallel News Corpus for Improving Statistical Machine Translation. In Proceedings of the 4th Workshop on Building and Using Comparable Corpora: Comparable Corpora and the Web, Pages 44-51, Portland, Oregon, 2011. download
  • Thomas Lavergne, Alexandre Allauzen, Josep Maria Crego, François Yvon. From n-gram-based to CRF-based Translation Models. In Proceedings of the Sixth Workshop on Statistical Machine Translation, Pages 542-553, Edinburgh, Scotland, 2011. download
  • Hai-Son Le, Ilya Oparin, Abdel Messaoudi, Alexandre Allauzen, Jean-Luc Gauvain, François Yvon. Large Vocabulary SOUL Neural Network Language Models. In Proceedings of InterSpeech 2011, 2011.
  • Artem Sokolov, François Yvon. Minimum Error Rate Semi-Ring. In Proceedings of the European Conference on Machine Translation, Mikel Forcada, Heidi Depraetere (eds.), Pages 241-248, Leuven, Belgium, 2011.
  • Nadi Tomeh, Alexandre Allauzen, François Yvon. Discriminative Weighted Alignment Matrices for Statistical Machine Translation. In Proceedings of the European Conference on Machine Translation, Mikel Forcada, Heidi Depraetere (eds.), Pages 305-312, Leuven, Belgium, 2011.
  • Nadi Tomeh, Alexandre Allauzen, Thomas Lavergne, François Yvon. Designing an Improved Discriminative Word Aligner. In Proceedings of the 12th International Conference on Intelligent Text Processing and Computational Linguistics, Alexander Gelbukh (ed.), CICLING, Waseda, Japan, 2011.

National Conferences

  • Adrien Lardilleux, François Yvon, Yves Lepage. Généralisation de l'alignement sous-phrastique par échantillonnage. In Actes de la Conférence sur le Traitement Automatique des Langues Naturelles (TALN), Montpellier, 2011. (more)
  • Nadi Tomeh, Alexandre Allauzen, François Yvon. Estimation d'un modèle de traduction à partir d'alignements mot-à-mot non-déterministes. In Actes de la Conférence sur le Traitement Automatique des Langues Naturelles (TALN), Montpellier, 2011. (more)

2010

Books

  • Proceedings of the 14th Annual Conference of the European Association for Machine Translation. François Yvon, Viggo Hansen (eds.), Saint-Raphaël, France, 2010. link

Journals

  • Josep Maria Crego, François Yvon. Factored bilingual n-gram language models for statistical machine translation. Machine Translation, pages 1-17, 2010. doi
  • Josep Maria Crego, Gregor Leusch, Aurélien Max, Hermann Ney, François Yvon. Micro-adaptation lexicale en traduction automatique statistique. Traitement Automatique des Langues, 51(2):65-93, 2010. download

International Conferences

  • Alexandre Allauzen, Josep Maria Crego, Ilknur Durgar El-Kahlout, Hai-Son Le, Guillaume Wisniewski, François Yvon. LIMSI @ IWSLT 2010. In Proceedings of the seventh International Workshop on Spoken Language Translation (IWSLT), Marcello Federico, Ian Lane, Michael Paul, François Yvon (eds.), Pages 105-112, 2010.
  • Alexandre Allauzen, Josep Maria Crego, Ilknur Durgar El-Kahlout, François Yvon. LIMSI's Statistical Translation Systems for WMT'10. In Proceedings of the Joint Fifth Workshop on Statistical Machine Translation and Metrics MATR, Pages 54-59, Uppsala, Sweden, 2010. download
  • Josep Maria Crego, Aurélien Max, François Yvon. Local lexical adaptation in Machine Translation through triangulation: SMT helping SMT. In Proceedings of the 23rd International Conference on Computational Linguistics (Coling 2010), Pages 232-240, Beijing, China, 2010. download
  • Josep Maria Crego, François Yvon. Improving Reordering with Linguistically Informed Bilingual n-grams. In Proceedings of the 23rd International Conference on Computational Linguistics (Coling 2010: Posters), Pages 197-205, Beijing, China, 2010. download
  • Hai Son Le, Alexandre Allauzen, Guillaume Wisniewski, François Yvon. Training Continuous Space Language Models: Some Practical Issues. In Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing, Pages 778-788, Cambridge, MA, 2010. download
  • Aurélien Max, Josep Maria Crego, François Yvon. Contrastive Lexical Evaluation of Machine Translation. In Proceedings of the Language Resources and Evaluation Conference (LREC'10), Valletta, Malta, 2010.
  • Nadi Tomeh, Alexandre Allauzen, Guillaume Wisniewski, François Yvon. Refining Word Alignment with Discriminative Training. In Proceedings of the ninth Conference of the Association for Machine Translation in the Americas (AMTA), Denver, CO, 2010.
  • Guillaume Wisniewski, Alexandre Allauzen, François Yvon. Assessing Phrase-Based Translation Models with Oracle Decoding. In Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing, Pages 933-943, Cambridge, MA, 2010. download
  • The pay-offs of preprocessing for German-English Statistical Machine Translation. In Proceedings of the seventh International Workshop on Spoken Language Translation (IWSLT), Pages 251-258, 2010.
  • Aurélien Max. Example-Based Paraphrasing for Improved Phrase-Based Statistical Machine Translation. In Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing, Pages 656-666, Cambridge, MA, October 2010. download

2009

Journals

  • Alexandre Allauzen, Guillaume Wisniewski. Modèles discriminants pour l'alignement mot à mot. Traitement Automatique des Langues, 50(3):173-203, 2009.

International Conferences

  • Philippe Langlais, François Yvon, Pierre Zweigenbaum. Improvements in Analogical Learning: Application to Translating multi-Terms of the Medical Domain. In Proceedings of the European Conference on Computational Linguistics (EACL'09), Pages 487-495, Athens, Greece, 2009. download
  • Josep Maria Crego, François Yvon. Gappy translation units under left-to-right SMT decoding. In Proceedings of the meeting of the European Association for Machine Translation (EAMT), Pages 66-73, Barcelona, Spain, 2009. pdf
  • Aurélien Max, Rafik Makhloufi, Philippe Langlais. Prise en compte de dépendances syntaxiques pour la traduction contextuelle de segments. In Proceedings of TALN, Senlis, France, 2009. pdf
  • Alexandre Allauzen, Josep Crego, Aurélien Max, François Yvon. LIMSI's Statistical Translation Systems for WMT'09. In Proceedings of the Fourth Workshop on Statistical Machine Translation, Pages 100-104, Athens, Greece, March 2009. download

National Conferences

  • Josep Maria Crego, Aurélien Max, François Yvon. Plusieurs langues (bien choisies) valent mieux qu'une: traduction statistique multi-source par renforcement lexical. In Actes de la Conférence sur le Traitement Automatique des Langues Naturelles, session poster (TALN'09), Senlis, France, 2009.

2008

International Conferences

  • Philippe Langlais, François Yvon, Pierre Zweigenbaum. Translating Medical Words by Analogy. In Proceedings of the workshop on Intelligent Data Analysis in bioMedicine and Pharmacology (IDAMAP) 2008, Washington, DC, 2008. pdf
  • Philippe Langlais, François Yvon. Scaling up analogical learning. In Proceedings of the 22nd International Conference on Computational Linguistics (COLING 2008), Pages 49-52, Manchester, UK, 2008. download
  • Philippe Langlais, François Yvon, Pierre Zweigenbaum. Analogical translation of medical words in different languages. In Proceedings of the 6th International Conference on Natural Language Processing, GoTAL 2008 - Advances in Natural Language Processing, Lecture Notes in Computer Science, Pages 284-295, 2008. doi
  • Aurélien Max, Rafik Makhloufi, Philippe Langlais. Explorations in Using Grammatical Dependencies for Contextual Phrase Translation Disambiguation. In Proceedings of EAMT08, poster session, Pages 114-119, Hamburg, Germany, 2008. pdf
  • Daniel Déchelotte, Gilles Adda, Alexandre Allauzen, Hélène Bonneau-Maynard, Olivier Galibert, Jean-Luc Gauvain, Philippe Langlais, François Yvon. Limsi's Statistical Translation Systems for WMT'08. In Proceedings of the Third Workshop on Statistical Machine Translation, Pages 107-110, Columbus, Ohio, June 2008. download

2007

Journals

  • Evgeny Matusov, Gregor Leusch, Rafael E. Banchs, Nicola Bertoldi, Daniel Déchelotte, Marcello Federico, Muntsin Kolss, Young-Suk Lee, José B. Mariño, Matthias Paulik, Salim Roukos, Holger Schwenk, Hermann Ney. System combination for machine translation of spoken and written language. IEEE Transactions on Audio, Speech, and Language Processing, 16(7):1222-1237, 2007.

International Conferences

  • Hélène Bonneau-Maynard, Alexandre Allauzen, Daniel Déchelotte, Holger Schwenk. Combining Morphosyntactic Enriched Representation with $n$-best Reranking in Statistical Translation. In HLT/NAACL workshop on Syntax and Structure in Statistical Translation, Rochester, 2007. download
  • Daniel Déchelotte, Holger Schwenk, Hélène Bonneau-Maynard, Alexandre Allauzen, Gilles Adda. A state-of-the-art Statistical Machine Translation System based on Moses. In MT Summit, Pages 127-133, Copenhagen, 2007. pdf
  • Holger Schwenk, Daniel Déchelotte, Hélène Bonneau-Maynard, Alexandre Allauzen. Modèles statistiques enrichis par la syntaxe pour la traduction automatique. In TALN, Pages 253-262, Toulouse, France, 2007.
  • Patrik Lambert, Marta R. Costa-jussá, Josep M. Crego, Maxim Khalilov, José B. Mariño, Rafael E. Banchs, José A.R. Fonollosa, Holger Schwenk. The TALP Ngram-based SMT System for IWSLT 2007. In International Workshop on Spoken Language Translation (IWSLT), Trento, 2007.

2006

International Conferences

  • Daniel Déchelotte, Holger Schwenk, Jean-Luc Gauvain. The 2006 LIMSI Statistical Machine Translation System for TC-STAR. In TC-STAR Workshop on Speech-to-Speech Translation, Pages 25-30, Barcelona, Spain, 2006. pdf
  • Holger Schwenk, Daniel Déchelotte, Jean-Luc Gauvain. Continuous Space Language Models for Statistical Machine Translation. In Proceedings of the COLING/ACL 2006 Main Conference Poster Sessions, Pages 723-730, Sydney, Australia, 2006. pdf
  • Daniel Déchelotte, Holger Schwenk, Jean-Luc Gauvain. Transcription et traduction de débats parlementaires. In Reconnaissance des Formes et Intelligence Artificielle, Tours, January 2006.

2005

International Conferences

  • Daniel Déchelotte, Holger Schwenk, Jean-Luc Gauvain, Olivier Galibert, Lori Lamel. Investigating Translation of Parliament Speeches. In Proceedings of the IEEE Workshop on Automatic Speech Recognition, San Juan, Puerto Rico, November 2005. pdf

SMT@LIMSI > People

If you would like to join us, do not hesitate to send us your CV: we are always looking for good Ph.D. students and postdoctoral research associates.

Permanent staff

Temporary staff

Past Members

  • Marco Dinarelli was a post-doctoral research associate (2011-2013), now CNRS researcher at LATTICE/Paris
  • Anil Kumar Singh was a post-doctoral research associate (2012-2013), working on confidence estimation
  • Le Hai Son did his Ph.D. at LIMSI (2009-2012), now researcher at the Vietnamese Academy of Science
  • Thomas Lavergne was a post-doctoral research associate (2009-2012), now assistant professor at Univ. Paris Sud
  • Artem Sokolov was a post-doctoral research associate (2010-2012), now research associate at Univ. of Heidelberg with S. Riezler
  • Nadi Tomeh did his Ph.D. at LIMSI (2008-2012), now assistant professor at Univ. Paris Nord
  • Qian Yu was a research associate, working on sentence alignment
  • Adrien Lardilleux did a post-doc with us, now working at Affinity-Engine
  • Josep Maria Crego did a post-doc at LIMSI, now with Systran, in downtown Paris
  • Ilknur Durgar did a post-doc at LIMSI in 2010, now with TÜBİTAK in Turkey
  • Alassane Seck was a research associate, working on spell checking and normalization
  • Daniel Déchelotte did his Ph.D. at LIMSI (2005-2008), now with Bing, near Paris
  • Holger Schwenk is now Full Professor at Univ. du Maine, Le Mans

Interns

Visitors and collaborators

SMT@LIMSI > Recent Seminars

They have visited LIMSI in the past, so why don't you? If you are interested and happen to be visiting Paris, just drop us an email!

  • March 12, 2014: Jan Niehues (KIT) Adaptation in Machine Translation
  • January 29, 2014: Stefan Riezler (Heidelberg)
  • February 26, 2013: Sylvain Raybaud (LORIA) Confidence measures for machine translation: evaluation, post edition and application to speech translation
  • February 1, 2013: Pascale Fung (HKUST) Rare Word Translation Extraction from Aligned Comparable Documents
  • November 12, 2012: Anil Kumar Singh (LIMSI) Machine Translation as a Problem of Estimating Linguistic Similarity and the Specific Problem of Translating TAM Markers
  • July 4, 2012: Simon Lacoste-Julien (Inria, Winnow) Structured alignment methods in machine learning
  • June 19, 2012: Kashif Shah (LIUM, Le Mans) Domain adaptation in SMT
  • May 30, 2012: Hermann Ney (IMMI) Bayes Decision Rule and the Classification Error in Systems for HLTPR (Human Language Technology and Pattern Recognition): Results and Open Problems
  • March 3, 2012: Adrien Lardilleux (LIMSI) Amélioration de l'alignement sous-phrastique par échantillonnage
  • February 28, 2012: Charlotte Lecluze (GREYC) Alignement de documents multilingues sans présupposé de parallélisme
  • January 24, 2012: Marianna Apidianaki (LIMSI) Clustering : Sémantique pour la désambiguïsation lexicale interlingue et l'évaluation de la traduction automatique
  • December 11, 2011: Hugo Larochelle (University of Sherbrooke) Training Restricted Boltzmann Machines on Word Observations
  • July 7, 2011: Marco Turchi (JRC) Multi-linguality via Statistical Machine Translation: SMT activities carried out at the EC's Joint Research Centre
  • June 9, 2011: Dekai Wu (HKUST) Inversion Transduction Grammars, Linear Transduction Grammars, and Linear Inversion Transduction Grammars for SMT
  • May 2, 2011: Nicola Cancedda (XRCE) Confidence-Weighted Learning of Factored Discriminative Language Models
  • December 14, 2010: Hermann Ney (RWTH) Revisiting the principles of the KN method for language modelling
  • November 23, 2010: Hai Son Le (LIMSI) Continuous space neural network language models
  • October 26, 2010: Nadi Tomeh (LIMSI) Word Alignment for Statistical Machine Translation
  • June 29, 2010: Adrien Lardilleux (GREYC) Contribution des basses fréquences à l'alignement sous-phrastique multilingue
  • March 2, 2010: Dimitra Vergyri (SRI) SRI's 2-way S2S Translation system: summary of the TRANSTAC project
  • December 18, 2009: Marine Carpuat (Columbia) Désambiguïsation lexicale pour une approche sémantique de la traduction automatique statistique
  • December 15, 2009: Jia Xu (RWTH) Sequence segmentation and alignment for statistical machine translation
  • November 3, 2009: Ilknur Durgar (LIMSI) A prototype English-Turkish statistical machine translation system
  • October 27, 2009: Vassilina Nikoulina (XRCE) Syntax-Augmented Phrase-Based Translation
  • April 29, 2009: Loïc Barrault (LIUM) Combinaison de systèmes (application à la reconnaissance automatique de la parole et à la traduction statistique)

SMT@LIMSI > Projects
Some current and past projects:


Last modified: Monday,24-March-14 14:03:37 CET