no code implementations • 22 May 2023 • Eungbeom Kim, Yunkee Chae, Jaeheon Sim, Kyogu Lee
Because empirical risk minimization (ERM) optimizes performance averaged over all data samples, regardless of group (such as healthy or dysarthric speakers), ASR systems are unaware of the performance disparities across groups.
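The point about averaging can be illustrated with a minimal sketch. The loss values, group labels, and function names below are hypothetical, invented purely for illustration, and are not taken from the paper; the sketch only shows how a single sample-averaged objective can hide a large gap between groups.

```python
# Hypothetical per-utterance losses and group labels (illustrative only,
# not data from the paper).
losses = [0.2, 0.3, 0.25, 0.25, 1.0]
groups = ["healthy", "healthy", "healthy", "healthy", "dysarthric"]


def erm_objective(losses):
    """Plain average over all samples, ignoring group membership."""
    return sum(losses) / len(losses)


def per_group_mean(losses, groups):
    """Average loss computed separately for each group."""
    totals, counts = {}, {}
    for loss, group in zip(losses, groups):
        totals[group] = totals.get(group, 0.0) + loss
        counts[group] = counts.get(group, 0) + 1
    return {g: totals[g] / counts[g] for g in totals}


average = erm_objective(losses)
by_group = per_group_mean(losses, groups)

print(f"sample-averaged loss: {average:.2f}")
for group, value in by_group.items():
    print(f"{group} mean loss: {value:.2f}")
```

In this toy setting the averaged objective is dominated by the majority group, so the much higher loss of the minority group is visible only once losses are broken down per group.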
no code implementations • 31 Oct 2022 • Eungbeom Kim, Jinhee Kim, Yoori Oh, KyungSu Kim, Minju Park, Jaeheon Sim, Jinwoo Lee, Kyogu Lee
In this paper, we aim to unveil the impact of data augmentation in audio-language multi-modal learning, which has not been explored despite its importance.
Ranked #2 on Audio to Text Retrieval on AudioCaps