1 code implementation • IWSLT (ACL) 2022 • Ioannis Tsiamas, Gerard I. Gállego, Carlos Escolano, José Fonollosa, Marta R. Costa-jussà
We further investigate the suitability of different speech encoders (wav2vec 2.0, HuBERT) for our models and the impact of knowledge distillation from the Machine Translation model that we use for the decoder (mBART).
no code implementations • WMT (EMNLP) 2021 • Carlos Escolano, Ioannis Tsiamas, Christine Basta, Javier Ferrando, Marta R. Costa-jussà, José A. R. Fonollosa
We fine-tune mBART50 using the filtered data, and additionally, we train a Transformer model on the same data from scratch.
1 code implementation • 16 Feb 2024 • Ioannis Tsiamas, Gerard I. Gállego, José A. R. Fonollosa, Marta R. Costa-jussà
The speech encoder seamlessly integrates with the MT model at inference, enabling direct translation from speech to text, across all languages supported by the MT model.
no code implementations • 2 Jun 2023 • Ioannis Tsiamas, Gerard I. Gállego, José A. R. Fonollosa, Marta R. Costa-jussà
Our Speech Translation systems utilize foundation models for speech (wav2vec 2.0) and text (mBART50).
1 code implementation • 21 May 2023 • Javier Ferrando, Gerard I. Gállego, Ioannis Tsiamas, Marta R. Costa-jussà
Language Generation Models produce words based on the previous context.
1 code implementation • 19 Dec 2022 • Ioannis Tsiamas, José A. R. Fonollosa, Marta R. Costa-jussà
End-to-end Speech Translation is hindered by a lack of available data resources.
1 code implementation • 28 Oct 2022 • Ioannis Tsiamas, Gerard I. Gállego, José A. R. Fonollosa, Marta R. Costa-jussà
Transformers have been the dominant architecture for Speech Translation in recent years, achieving significant improvements in translation quality.
2 code implementations • 9 Feb 2022 • Ioannis Tsiamas, Gerard I. Gállego, José A. R. Fonollosa, Marta R. Costa-jussà
Speech translation datasets provide manual segmentations of the audio, which are not available in real-world scenarios, and existing segmentation methods usually reduce translation quality significantly at inference time.
1 code implementation • ACL (IWSLT) 2021 • Gerard I. Gállego, Ioannis Tsiamas, Carlos Escolano, José A. R. Fonollosa, Marta R. Costa-jussà
Our submission also uses a custom segmentation algorithm that employs pre-trained Wav2Vec 2.0 for identifying periods of untranscribable text and can bring improvements of 2.5 to 3 BLEU score on the IWSLT 2019 test set, as compared to the result with the given segmentation.
Ranked #2 on Speech-to-Text Translation on MuST-C EN->DE (using extra training data)