no code implementations • CLIB 2022 • Iglika Nikolova-Stoupak, Shuichiro Shimizu, Chenhui Chu, Sadao Kurohashi
The corpus utilised to train the machine translation models in this study is CCMatrix, provided by OPUS.
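As a rough sketch of how such a bitext can be obtained, the snippet below pulls a CCMatrix language pair from OPUS with the opustools package (`pip install opustools`). The language pair and file name are illustrative, not taken from the paper; which pairs are actually released for CCMatrix should be checked in the OPUS catalogue.

```python
# Minimal sketch: fetching a CCMatrix bitext from OPUS via opustools.
from opustools import OpusRead

reader = OpusRead(
    directory="CCMatrix",            # OPUS corpus name
    source="en",                     # source language code (illustrative)
    target="bg",                     # target language code (illustrative)
    write=["ccmatrix.en-bg.txt"],    # write aligned sentence pairs here
    download_dir="data",             # where downloaded archives are cached
)
reader.printPairs()  # downloads the corpus if needed and writes the pairs
```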
no code implementations • 21 May 2024 • Sirou Chen, Sakiko Yahata, Shuichiro Shimizu, Zhengdong Yang, Yihang Li, Chenhui Chu, Sadao Kurohashi
Emotion plays a crucial role in human conversation.
no code implementations • 18 Jan 2024 • Hao Wang, Shuhei Kurita, Shuichiro Shimizu, Daisuke Kawahara
Audio-visual speech recognition (AVSR) is a multimodal extension of automatic speech recognition (ASR), using video as a complement to audio.
Tasks: Audio-Visual Speech Recognition, Automatic Speech Recognition, +4
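To illustrate the general idea of using video as a complement to audio (this is not the model from the paper), the PyTorch sketch below encodes the two streams separately and fuses them by concatenation; feature dimensions are invented, and the streams are assumed pre-aligned to a common frame rate.

```python
import torch
import torch.nn as nn

class NaiveAVFusion(nn.Module):
    """Toy AVSR backbone: encode audio and video separately, fuse by concat."""
    def __init__(self, audio_dim=80, video_dim=512, hidden=256, vocab=5000):
        super().__init__()
        self.audio_enc = nn.GRU(audio_dim, hidden, batch_first=True)
        self.video_enc = nn.GRU(video_dim, hidden, batch_first=True)
        self.proj = nn.Linear(2 * hidden, vocab)  # frame-level token logits

    def forward(self, audio, video):
        # audio: (B, T, audio_dim), e.g. log-mel frames
        # video: (B, T, video_dim), e.g. lip-region visual features,
        # assumed aligned to the audio frame rate for this sketch
        a, _ = self.audio_enc(audio)
        v, _ = self.video_enc(video)
        return self.proj(torch.cat([a, v], dim=-1))

logits = NaiveAVFusion()(torch.randn(2, 100, 80), torch.randn(2, 100, 512))
print(logits.shape)  # torch.Size([2, 100, 5000])
```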
1 code implementation • 31 Oct 2023 • Yihang Li, Shuichiro Shimizu, Chenhui Chu, Sadao Kurohashi, Wei Li
In addition to the extensive training set, EVA contains a video-helpful evaluation set in which subtitles are ambiguous and videos are guaranteed to be helpful for disambiguation.
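A minimal sketch of how such a corpus might be represented, with invented field names (the actual EVA schema may differ): each record pairs a subtitle with a clip, and the evaluation split keeps only items that are ambiguous in text and disambiguated by video.

```python
from dataclasses import dataclass

@dataclass
class SubtitleExample:
    """Hypothetical record layout for a video-subtitle translation corpus."""
    src_subtitle: str      # source-language subtitle line
    tgt_subtitle: str      # reference translation
    video_clip: str        # path to the accompanying clip
    ambiguous: bool        # the text alone underdetermines the translation
    video_helpful: bool    # annotators judged the clip to disambiguate

def evaluation_split(examples):
    # Keep only items where the subtitle is ambiguous AND the video is
    # judged helpful, mirroring the idea of a video-helpful eval set.
    return [ex for ex in examples if ex.ambiguous and ex.video_helpful]
```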
1 code implementation • 16 May 2023 • Shuichiro Shimizu, Chenhui Chu, Sheng Li, Sadao Kurohashi
We present a new task, speech dialogue translation, which mediates between speakers of different languages.
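One way to realize such a task is a cascade of ASR and MT. The sketch below is not the system from the paper: the `asr` and `mt` callables and the English-Japanese pair are assumed purely for illustration.

```python
def translate_dialogue(turns, asr, mt):
    """Cascade sketch for speech dialogue translation: each speaker talks in
    their own language, and every utterance is rendered in the other's.

    turns: list of (speaker_lang, audio) pairs, e.g. [("en", wav), ("ja", wav)]
    asr:   callable (lang, audio) -> transcript        (assumed interface)
    mt:    callable (src, tgt, text, context) -> text  (assumed interface)
    """
    context, outputs = [], []
    for lang, audio in turns:
        text = asr(lang, audio)
        tgt = "ja" if lang == "en" else "en"  # illustrative language pair
        outputs.append(mt(lang, tgt, text, context))
        context.append(text)  # earlier turns can serve as dialogue context
    return outputs
```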
1 code implementation • LREC 2022 • Yihang Li, Shuichiro Shimizu, Weiqi Gu, Chenhui Chu, Sadao Kurohashi
Existing multimodal machine translation (MMT) datasets consist of images and video captions or general subtitles, which rarely contain linguistic ambiguity, so the visual information is of little help in generating appropriate translations.
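A toy example of the kind of ambiguity meant here, with an invented sense inventory and visual labels: the subtitle alone admits two translations, and a visual cue selects between them.

```python
# Toy illustration of why visual context matters in MMT: the subtitle is
# ambiguous in text, and a visual label picks the right translation.
AMBIGUOUS = {
    "He picked up the bat.": {
        "baseball_bat": "Il a ramassé la batte.",    # sports scene
        "animal": "Il a ramassé la chauve-souris.",  # animal scene
    }
}

def translate(subtitle, visual_label):
    senses = AMBIGUOUS.get(subtitle)
    return senses[visual_label] if senses else None

print(translate("He picked up the bat.", "baseball_bat"))
```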