1 code implementation • 31 Jan 2024 • Jaeyeon Kim, JaeYoon Jung, Jinjoo Lee, Sang Hoon Woo
We also introduce a new training objective called masked codec modeling, which improves the acoustic awareness of the pretrained language model.
Ranked #1 on Audio captioning on AudioCaps
no code implementations • 24 Jun 2022 • Hyunjae Cho, Wonbin Jung, Junhyeok Lee, Sang Hoon Woo
Because it is difficult to obtain a multilingual corpus for a given speaker, training a multilingual TTS model with monolingual corpora is unavoidable.
no code implementations • CVPR 2022 • Hyoung-Kyu Song, Sang Hoon Woo, Junhyeok Lee, Seungmin Yang, Hyunjae Cho, Youseong Lee, Dongho Choi, Kang-wook Kim
In this work, we propose a joint system that combines a talking face generation system with a text-to-speech system and can generate multilingual talking face videos from text input alone.