no code implementations • 16 Jan 2024 • Jiyang Tang, Kwangyoun Kim, Suwon Shon, Felix Wu, Prashant Sridhar, Shinji Watanabe
Compared to studies with similar motivations, the proposed loss operates directly on the cross attention weights and is easier to implement.
Automatic Speech Recognition (ASR) +1
2 code implementations • 19 May 2023 • Jiyang Tang, William Chen, Xuankai Chang, Shinji Watanabe, Brian MacWhinney
Our system achieves state-of-the-art speaker-level detection accuracy (97.3%), and a relative WER reduction of 11% for moderate Aphasia patients.
2 code implementations • 18 May 2023 • Yifan Peng, Kwangyoun Kim, Felix Wu, Brian Yan, Siddhant Arora, William Chen, Jiyang Tang, Suwon Shon, Prashant Sridhar, Shinji Watanabe
Conformer, a convolution-augmented Transformer variant, has become the de facto encoder architecture for speech processing due to its superior performance in various tasks, including automatic speech recognition (ASR), speech translation (ST) and spoken language understanding (SLU).
Automatic Speech Recognition (ASR) +2
no code implementations • 12 Apr 2021 • Jiyang Tang, Ming Li
In this paper, we propose an end-to-end Mandarin tone classification method from continuous speech utterances utilizing both the spectrogram and the short-term context information as the input.