no code implementations • EMNLP (insights) 2021 • Jan Rosendahl, Christian Herold, Frithjof Petrick, Hermann Ney
In this work, we conduct a comprehensive investigation on one of the centerpieces of modern machine translation systems: the encoder-decoder attention mechanism.
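For context, the mechanism under investigation connects each decoder position to all encoder positions via scaled dot-product attention. Below is a minimal NumPy sketch; all projections, shapes, and inputs are illustrative and not the paper's setup:

```python
# Minimal sketch of encoder-decoder (cross-) attention in plain NumPy.
# Projection matrices here are random stand-ins; a real model learns them.
import numpy as np

def cross_attention(decoder_states, encoder_states, d_k=64):
    """Scaled dot-product attention from decoder queries to encoder keys/values.

    decoder_states: (tgt_len, d_model) -- one query per target position
    encoder_states: (src_len, d_model) -- keys and values from the encoder
    """
    rng = np.random.default_rng(0)
    d_model = decoder_states.shape[1]
    W_q = rng.standard_normal((d_model, d_k)) / np.sqrt(d_model)
    W_k = rng.standard_normal((d_model, d_k)) / np.sqrt(d_model)
    W_v = rng.standard_normal((d_model, d_k)) / np.sqrt(d_model)

    Q = decoder_states @ W_q           # (tgt_len, d_k)
    K = encoder_states @ W_k           # (src_len, d_k)
    V = encoder_states @ W_v           # (src_len, d_k)

    scores = Q @ K.T / np.sqrt(d_k)    # (tgt_len, src_len)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax over source positions
    return weights @ V                 # (tgt_len, d_k) context vectors

rng_demo = np.random.default_rng(1)
context = cross_attention(rng_demo.standard_normal((5, 512)),
                          rng_demo.standard_normal((7, 512)))
```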
no code implementations • IWSLT (ACL) 2022 • Frithjof Petrick, Jan Rosendahl, Christian Herold, Hermann Ney
After its introduction, the Transformer architecture quickly became the gold standard for the task of neural machine translation.
no code implementations • IWSLT 2017 • Parnia Bahar, Jan Rosendahl, Nick Rossenbach, Hermann Ney
This work describes the Neural Machine Translation (NMT) system of the RWTH Aachen University developed for the English↔German tracks of the evaluation campaign of the International Workshop on Spoken Language Translation (IWSLT) 2017.
no code implementations • Findings (ACL) 2022 • Christian Herold, Jan Rosendahl, Joris Vanvinckenroye, Hermann Ney
The filtering and/or selection of training data is one of the core aspects to be considered when building a strong machine translation system. In their influential work, Khayrallah and Koehn (2018) investigated the impact of different types of noise on the performance of machine translation systems. In the same year, the WMT introduced a shared task on parallel corpus filtering, which went on to be repeated in the following years and resulted in many different filtering approaches being proposed. In this work we aim to combine the recent achievements in data filtering with the original analysis of Khayrallah and Koehn (2018) and investigate whether state-of-the-art filtering systems are capable of removing all the suggested noise types. We observe that most of these types of noise can be detected with an accuracy of over 90% by modern filtering systems when operating in a well-studied, high-resource setting. However, we also find that when confronted with more refined noise categories or when working with a less common language pair, the performance of the filtering systems is far from optimal, showing that there is still room for improvement in this area of research.
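As a toy illustration of the kind of heuristics such filtering systems build on, the sketch below flags two noise types from the Khayrallah and Koehn (2018) taxonomy: untranslated (copied) targets and misaligned pairs. Function names and thresholds are illustrative, not taken from any of the evaluated systems:

```python
# Toy detectors for two noise types, assuming whitespace-tokenized pairs.

def is_untranslated(src: str, tgt: str) -> bool:
    """Flag 'untranslated' noise: the target is a (near-)copy of the source."""
    src_tokens, tgt_tokens = set(src.split()), set(tgt.split())
    if not src_tokens or not tgt_tokens:
        return True
    overlap = len(src_tokens & tgt_tokens) / len(src_tokens | tgt_tokens)
    return overlap > 0.8  # illustrative threshold

def is_misaligned(src: str, tgt: str, max_ratio: float = 3.0) -> bool:
    """Flag likely misaligned pairs via an extreme sentence-length ratio."""
    n_src, n_tgt = len(src.split()), len(tgt.split())
    if n_src == 0 or n_tgt == 0:
        return True
    return max(n_src, n_tgt) / min(n_src, n_tgt) > max_ratio

pairs = [("ein Haus am See", "a house by the lake"),
         ("ein Haus am See", "ein Haus am See")]  # second pair: copied source
clean = [p for p in pairs if not (is_untranslated(*p) or is_misaligned(*p))]
```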
no code implementations • EMNLP (IWSLT) 2019 • Jan Rosendahl, Viet Anh Khoa Tran, Weiyue Wang, Hermann Ney
In this work we analyze and compare the behavior of the Transformer architecture when using different positional encoding methods.
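One of the standard candidates in such comparisons is the absolute sinusoidal encoding of Vaswani et al. (2017). A minimal sketch for reference (the exact set of methods compared in the paper is not reproduced here):

```python
# Absolute sinusoidal positional encoding (Vaswani et al., 2017).
import numpy as np

def sinusoidal_positions(max_len: int, d_model: int) -> np.ndarray:
    """Return a (max_len, d_model) matrix of fixed positional encodings."""
    pos = np.arange(max_len)[:, None]         # (max_len, 1)
    dim = np.arange(0, d_model, 2)[None, :]   # (1, d_model / 2)
    angles = pos / np.power(10000.0, dim / d_model)
    enc = np.zeros((max_len, d_model))
    enc[:, 0::2] = np.sin(angles)             # even dimensions: sine
    enc[:, 1::2] = np.cos(angles)             # odd dimensions: cosine
    return enc

# A learned alternative is simply a trainable (max_len, d_model) matrix.
pe = sinusoidal_positions(max_len=128, d_model=512)
```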
no code implementations • 18 Oct 2021 • Nils-Philipp Wynands, Wilfried Michel, Jan Rosendahl, Ralf Schlüter, Hermann Ney
Lastly, it is shown that this technique can be used to effectively perform sequence discriminative training for attention-based encoder-decoder acoustic models on the LibriSpeech task.
no code implementations • 27 Sep 2021 • Evgeniia Tokarchuk, Jan Rosendahl, Weiyue Wang, Pavel Petrushkov, Tomer Lancewicki, Shahram Khadivi, Hermann Ney
Pivot-based neural machine translation (NMT) is commonly used in low-resource setups, especially for translation between non-English language pairs.
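A minimal sketch of the pivot-based setup, where the source is cascaded through a pivot language (typically English); the translation callables here are hypothetical stand-ins for trained NMT models:

```python
# Pivot-based translation: source -> pivot -> target, as two cascaded steps.
from typing import Callable

def pivot_translate(src_sentence: str,
                    src_to_pivot: Callable[[str], str],
                    pivot_to_tgt: Callable[[str], str]) -> str:
    """Translate via a pivot language using two independent models."""
    pivot_sentence = src_to_pivot(src_sentence)  # e.g. French -> English
    return pivot_to_tgt(pivot_sentence)          # e.g. English -> German

# Usage with dummy placeholders standing in for real NMT systems:
fr_en = lambda s: "the house"   # placeholder French->English model
en_de = lambda s: "das Haus"    # placeholder English->German model
print(pivot_translate("la maison", fr_en, en_de))
```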
no code implementations • ACL (IWSLT) 2021 • Evgeniia Tokarchuk, Jan Rosendahl, Weiyue Wang, Pavel Petrushkov, Tomer Lancewicki, Shahram Khadivi, Hermann Ney
Complex natural language applications such as speech translation or pivot translation traditionally rely on cascaded models.
no code implementations • NAACL 2021 • Christian Herold, Jan Rosendahl, Joris Vanvinckenroye, Hermann Ney
While we find that our approaches come out on top on all three tasks, different variants perform best on different tasks.
no code implementations • WS 2019 • Jan Rosendahl, Christian Herold, Yunsu Kim, Miguel Graça, Weiyue Wang, Parnia Bahar, Yingbo Gao, Hermann Ney
For the De-En task, none of the tested methods gave a significant improvement over last year's winning system and we end up with the same performance, resulting in 39.6% BLEU on newstest2019.
no code implementations • WS 2019 • Yunsu Kim, Hendrik Rosendahl, Nick Rossenbach, Jan Rosendahl, Shahram Khadivi, Hermann Ney
We propose a novel model architecture and training algorithm to learn bilingual sentence embeddings from a combination of parallel and monolingual data.
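Once trained, such embeddings are typically used to score or mine sentence pairs by cosine similarity. A hedged sketch, assuming some external embedding model has already produced the vectors (threshold and function names are illustrative):

```python
# Mining parallel pairs from bilingual sentence embeddings via cosine similarity.
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def mine_pairs(src_vecs, tgt_vecs, threshold=0.9):
    """Greedily match each source sentence to its nearest target sentence."""
    pairs = []
    for i, sv in enumerate(src_vecs):
        sims = [cosine_similarity(sv, tv) for tv in tgt_vecs]
        j = int(np.argmax(sims))
        if sims[j] >= threshold:  # illustrative threshold
            pairs.append((i, j, sims[j]))
    return pairs
```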
no code implementations • WS 2018 • Nick Rossenbach, Jan Rosendahl, Yunsu Kim, Miguel Graça, Aman Gokrani, Hermann Ney
We use several rule-based, heuristic methods to preselect sentence pairs.
1 code implementation • WS 2018 • Julian Schamper, Jan Rosendahl, Parnia Bahar, Yunsu Kim, Arne Nix, Hermann Ney
In total we improve by 6.8% BLEU over our last year's submission and by 4.8% BLEU over the winning system of the 2017 German→English task.