no code implementations • 27 May 2023 • Eliya Nachmani, Alon Levkovitch, Yifan Ding, Chulayuth Asawaroengchai, Heiga Zen, Michelle Tadmor Ramanovich
This paper presents Translatotron 3, a novel approach to unsupervised direct speech-to-speech translation from monolingual speech-text datasets by combining masked autoencoder, unsupervised embedding mapping, and back-translation.
no code implementations • 24 May 2023 • Eliya Nachmani, Alon Levkovitch, Roy Hirsch, Julian Salazar, Chulayuth Asawaroengchai, Soroosh Mariooryad, Ehud Rivlin, RJ Skerry-Ryan, Michelle Tadmor Ramanovich
Key to our approach is a training objective that jointly supervises speech recognition, text continuation, and speech synthesis using only paired speech-text pairs, enabling a `cross-modal' chain-of-thought within a single decoding pass.
no code implementations • 5 Jun 2022 • Alon Levkovitch, Eliya Nachmani, Lior Wolf
At the heart of the method lies a sampling process that combines the estimation of the denoising model with a low-pass version of the new speaker's sample.