1 code implementation • IWSLT 2017 • Mattia Antonino Di Gangi, Marcello Federico
When only a small amount of data exists for a language pair, the model cannot produce good representations for words, particularly for rare words.
no code implementations • AMTA 2020 • Mattia Antonino Di Gangi, Marco Gaido, Matteo Negri, Marco Turchi
Subword-level segmentation then became the state of the art in neural machine translation: it produces shorter sequences that reduce training time, while outperforming word-level models.
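As an illustration of the idea, here is a minimal sketch of greedy longest-match subword segmentation (WordPiece-style); the toy vocabulary and function name are illustrative, not taken from the paper:

```python
# Minimal greedy subword segmentation (longest-match, WordPiece-style),
# illustrating how words are split into known subword units.
def segment(word, vocab):
    """Greedily split `word` into the longest subwords found in `vocab`."""
    pieces, start = [], 0
    while start < len(word):
        end = len(word)
        # shrink the candidate span until it matches a vocabulary entry
        while end > start and word[start:end] not in vocab:
            end -= 1
        if end == start:  # no piece matches: fall back to an unknown token
            return ["<unk>"]
        pieces.append(word[start:end])
        start = end
    return pieces

vocab = {"trans", "lation", "s", "low", "er"}
print(segment("translations", vocab))  # ['trans', 'lation', 's']
print(segment("lower", vocab))         # ['low', 'er']
```

A rare word like "translations" is thus covered by a few frequent pieces instead of being mapped to a single out-of-vocabulary token.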
1 code implementation • 5 Aug 2020 • Marco Gaido, Mattia Antonino Di Gangi, Matteo Negri, Mauro Cettolo, Marco Turchi
We show that our context-aware solution is more robust to VAD-segmented input, outperforming a strong base model and the fine-tuning on different VAD segmentations of an English-German test set by up to 4.25 BLEU points.
no code implementations • ACL 2020 • Luisa Bentivogli, Beatrice Savoldi, Matteo Negri, Mattia Antonino Di Gangi, Roldano Cattoni, Marco Turchi
Translating from languages without productive grammatical gender like English into gender-marked languages is a well-known difficulty for machines.
no code implementations • WS 2020 • Marco Gaido, Mattia Antonino Di Gangi, Matteo Negri, Marco Turchi
The test talks are provided in two versions: one contains the data already segmented with automatic tools and the other is the raw data without any segmentation.
no code implementations • 23 Oct 2019 • Mattia Antonino Di Gangi, Viet-Nhat Nguyen, Matteo Negri, Marco Turchi
Despite recent technology advancements, the effectiveness of neural approaches to end-to-end speech-to-text translation is still limited by the paucity of publicly available training corpora.
no code implementations • EMNLP (IWSLT) 2019 • Mattia Antonino Di Gangi, Robert Enyedi, Alessandra Brusadin, Marcello Federico
Our experimental results on a public speech translation data set show that adapting a model on a significant amount of parallel data including ASR transcripts is beneficial with test data of the same type, but produces a small degradation when translating clean text.
Automatic Speech Recognition (ASR) +4
no code implementations • 8 Oct 2019 • Mattia Antonino Di Gangi, Matteo Negri, Marco Turchi
Multilingual solutions are widely studied in MT and usually rely on "target forcing", in which multilingual parallel data are combined to train a single model by prepending to the input sequences a language token that specifies the target language.
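The target-forcing mechanism described above can be sketched in a few lines; the tag format `<2xx>` is a common convention and only an assumption here, not necessarily the one used in the paper:

```python
# Sketch of "target forcing": prepend a target-language token to each source
# sequence so a single multilingual model knows which language to produce.
# The "<2de>"-style tag names are illustrative assumptions.
def target_force(source_tokens, target_lang):
    """Prepend a target-language tag to the source token sequence."""
    return [f"<2{target_lang}>"] + source_tokens

src = ["hello", "world"]
print(target_force(src, "de"))  # ['<2de>', 'hello', 'world']
print(target_force(src, "it"))  # ['<2it>', 'hello', 'world']
```

At training time the tag is drawn from the reference translation's language; at inference it simply selects the desired output language.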
Automatic Speech Recognition (ASR) +4
no code implementations • 24 Apr 2019 • Nicholas Ruiz, Mattia Antonino Di Gangi, Nicola Bertoldi, Marcello Federico
Machine translation systems are conventionally trained on textual resources that do not model phenomena that occur in spoken language.
Automatic Speech Recognition (ASR) +6
1 code implementation • 2 Apr 2019 • Mattia Antonino Di Gangi, Giosué Lo Bosco, Giovanni Pilato
Irony and sarcasm are two complex linguistic phenomena that are widely used in everyday language, especially on social media, but they pose serious challenges for automated text understanding.
no code implementations • IWSLT (EMNLP) 2018 • Mattia Antonino Di Gangi, Roberto Dessì, Roldano Cattoni, Matteo Negri, Marco Turchi
This paper describes FBK's submission to the end-to-end English-German speech translation task at IWSLT 2018.
1 code implementation • 10 May 2018 • Mattia Antonino Di Gangi, Marcello Federico
Recurrent neural networks (RNNs) have represented for years the state of the art in neural machine translation.