no code implementations • IWSLT (EMNLP) 2018 • Surafel M. Lakew, Marcello Federico
In the experimental setting, an extremely low-resource Basque-English language pair (i.e., ≈5.6K in-domain training data) is our target translation task, where we consider closely related French-English and Spanish-English parallel data to build the multilingual model.
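As an aside, a common way to train such a multilingual model on mixed language pairs is to prepend a target-language flag to every source sentence. The sketch below is only illustrative; the token format ("<2en>") and the toy corpora are assumptions, not the paper's exact setup.

```python
# Minimal sketch: tag each source sentence with its target language so one
# multilingual NMT model can be trained on concatenated Basque/French/Spanish-English data.

def tag_corpus(src_lines, tgt_lang):
    """Prepend a target-language flag to every source sentence."""
    return [f"<2{tgt_lang}> {line.strip()}" for line in src_lines]

# Toy stand-ins for the real corpora (the Basque-English set is ~5.6K sentences).
corpora = {
    "eu-en": ["kaixo mundua"],
    "fr-en": ["bonjour le monde"],
    "es-en": ["hola mundo"],
}

training_source = []
for pair, lines in corpora.items():
    tgt_lang = pair.split("-")[1]
    training_source.extend(tag_corpus(lines, tgt_lang))

print(training_source)
# ['<2en> kaixo mundua', '<2en> bonjour le monde', '<2en> hola mundo']
```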
no code implementations • MTSummit 2021 • Surafel M. Lakew, Matteo Negri, Marco Turchi
Neural Machine Translation (NMT) approaches employing monolingual data are showing steady improvements in resource-rich conditions.
no code implementations • IWSLT 2017 • Surafel M. Lakew, Quintino F. Lotito, Marco Turchi, Matteo Negri, Marcello Federico
In particular, we focus on the four zero-shot directions and show how a multilingual model trained on small data can provide reasonable results.
1 code implementation • 25 Feb 2023 • Alexandra Chronopoulou, Brian Thompson, Prashant Mathur, Yogesh Virkar, Surafel M. Lakew, Marcello Federico
Automatic dubbing (AD) is the task of translating the original speech in a video into target language speech.
no code implementations • 16 Dec 2021 • Derek Tam, Surafel M. Lakew, Yogesh Virkar, Prashant Mathur, Marcello Federico
We introduce the task of isochrony-aware machine translation which aims at generating translations suitable for dubbing.
no code implementations • 16 Dec 2021 • Surafel M. Lakew, Yogesh Virkar, Prashant Mathur, Marcello Federico
Automatic dubbing (AD) is among the machine translation (MT) use cases where translations should match a given length to allow for synchronicity between source and target speech.
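One simple way to picture length-matched translation is to re-rank n-best MT hypotheses by how well their length tracks the source. This is only a hedged sketch under that assumption; the character-count proxy and the weighting scheme are illustrative, not the method proposed in the paper.

```python
# Illustrative re-ranking: combine the MT model score with a penalty for
# deviating from the source length (a crude proxy for speech duration).

def rerank_for_dubbing(source, hypotheses, alpha=1.0):
    """hypotheses: list of (translation, model_score) tuples."""
    src_len = len(source)

    def combined_score(hyp):
        text, score = hyp
        length_penalty = abs(len(text) - src_len) / max(src_len, 1)
        return score - alpha * length_penalty

    return sorted(hypotheses, key=combined_score, reverse=True)

nbest = [
    ("a rather long candidate translation of the sentence", -1.2),
    ("a short candidate", -1.5),
]
best, _ = rerank_for_dubbing("une phrase source de longueur moyenne", nbest)[0]
print(best)
```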
no code implementations • 8 Oct 2021 • Surafel M. Lakew, Marcello Federico, Yue Wang, Cuong Hoang, Yogesh Virkar, Roberto Barra-Chicote, Robert Enyedi
Automatic dubbing aims at seamlessly replacing the speech in a video document with synthetic speech in a different language.
no code implementations • 10 Mar 2021 • Surafel M. Lakew, Matteo Negri, Marco Turchi
Neural Machine Translation (NMT) approaches employing monolingual data are showing steady improvements in resource-rich conditions.
1 code implementation • 31 Mar 2020 • Surafel M. Lakew, Matteo Negri, Marco Turchi
Recent advents in Neural Machine Translation (NMT) have shown improvements in low-resource language (LRL) translation tasks.
1 code implementation • EMNLP (IWSLT) 2019 • Surafel M. Lakew, Alina Karakanta, Marcello Federico, Matteo Negri, Marco Turchi
In order to improve NMT for LRL, we employ perplexity to select HRL data that are most similar to the LRL on the basis of language distance.
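The data-selection idea can be sketched as follows: score each high-resource-language (HRL) sentence with a language model estimated on the low-resource language (LRL) and keep the lowest-perplexity sentences. The add-one-smoothed unigram LM below is a self-contained stand-in, not the language model actually used in the paper.

```python
import math
from collections import Counter

def train_unigram(lrl_sentences):
    """Estimate a word-level unigram LM with add-one smoothing from LRL data."""
    counts = Counter(tok for s in lrl_sentences for tok in s.split())
    total = sum(counts.values())
    vocab_size = len(counts) + 1  # +1 reserves mass for unseen tokens
    return counts, total, vocab_size

def perplexity(sentence, model):
    counts, total, vocab_size = model
    tokens = sentence.split()
    if not tokens:
        return float("inf")
    log_prob = sum(math.log((counts[t] + 1) / (total + vocab_size)) for t in tokens)
    return math.exp(-log_prob / len(tokens))

def select_hrl(hrl_sentences, lrl_sentences, top_k):
    """Keep the HRL sentences that look most like the LRL (lowest perplexity)."""
    model = train_unigram(lrl_sentences)
    return sorted(hrl_sentences, key=lambda s: perplexity(s, model))[:top_k]
```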
1 code implementation • 16 Sep 2019 • Surafel M. Lakew, Marcello Federico, Matteo Negri, Marco Turchi
In recent years, Neural Machine Translation (NMT) has been shown to be more effective than phrase-based statistical methods, thus quickly becoming the state of the art in machine translation (MT).
1 code implementation • IWSLT 2017 • Surafel M. Lakew, Quintino F. Lotito, Matteo Negri, Marco Turchi, Marcello Federico
Recent work on multilingual neural machine translation reported competitive performance with respect to bilingual models and surprisingly good performance even on (zero-shot) translation directions not observed at training time.
2 code implementations • IWSLT (EMNLP) 2018 • Surafel M. Lakew, Aliia Erofeeva, Matteo Negri, Marcello Federico, Marco Turchi
Our approach allows extending an initial model for a given language pair to cover new languages by adapting its vocabulary as new data become available (i.e., introducing new vocabulary items when they are not included in the initial model).
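A minimal sketch of this kind of vocabulary adaptation is shown below: the original token-to-index mapping is kept intact (so pretrained embeddings stay aligned) and only genuinely new tokens are appended. The paper operates on subword units; plain word tokens and the helper name are assumptions used purely for illustration.

```python
def adapt_vocabulary(old_vocab, new_corpus_tokens):
    """old_vocab: dict token -> index. Returns the extended vocabulary and the
    list of freshly added tokens (whose embeddings still need initialization)."""
    vocab = dict(old_vocab)
    added = []
    for tok in new_corpus_tokens:
        if tok not in vocab:
            vocab[tok] = len(vocab)  # append after the existing indices
            added.append(tok)
    return vocab, added

old = {"<unk>": 0, "hello": 1, "world": 2}
new_vocab, new_items = adapt_vocabulary(old, ["hello", "mundo", "bonjour"])
print(new_items)  # ['mundo', 'bonjour'] -> initialize new embedding rows for these
```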
no code implementations • WS 2018 • Surafel M. Lakew, Aliia Erofeeva, Marcello Federico
Both research and commercial machine translation have so far neglected the importance of properly handling the spelling, lexical, and grammatical divergences occurring among language varieties.
no code implementations • COLING 2018 • Surafel M. Lakew, Mauro Cettolo, Marcello Federico
Motivated by this, our work (i) provides a quantitative and comparative analysis of the translations produced by bilingual, multilingual and zero-shot systems; (ii) investigates the translation quality of two of the currently dominant neural architectures in MT, namely the recurrent and the Transformer architectures; and (iii) quantitatively explores how the closeness between languages influences zero-shot translation.