no code implementations • 19 Feb 2024 • Jingpu Yang, Zehua Han, Mengyu Xiang, Helin Wang, Yuxiao Huang, Miao Fang
With the rapid advancement of Neural Machine Translation (NMT), enhancing translation efficiency and quality has become a focal point of research.
1 code implementation • 26 Dec 2023 • Jingpu Yang, Qirui Zhao, Helin Wang, Yuxiao Huang, Zirui Song, Miao Fang
OPARL employs a unique dual-actor approach: an optimistic actor dedicated to exploration and a pessimistic actor focused on utilization, thereby effectively differentiating between exploration and utilization strategies.
1 code implementation • 16 Nov 2023 • Helin Wang, Venkatesh Ravichandran, Milind Rao, Becky Lammers, Myra Sydnor, Nicholas Maragakis, Ankur A. Butala, Jayne Zhang, Lora Clawson, Victoria Chovaz, Laureano Moro-Velazquez
Spoken language understanding (SLU) systems often exhibit suboptimal performance in processing atypical speech, typically caused by neurological conditions and motor impairments.
1 code implementation • 6 Oct 2023 • Jiarui Hai, Helin Wang, Dongchao Yang, Karan Thakkar, Najim Dehak, Mounya Elhilali
Common target sound extraction (TSE) approaches primarily relied on discriminative approaches in order to separate the target sound while minimizing interference from the unwanted sources, with varying success in separating the target from the background.
1 code implementation • 18 Jun 2023 • Helin Wang, Thomas Thebaud, Jesus Villalba, Myra Sydnor, Becky Lammers, Najim Dehak, Laureano Moro-Velazquez
We present a novel typical-to-atypical voice conversion approach (DuTa-VC), which (i) can be trained with nonparallel data (ii) first introduces diffusion probabilistic model (iii) preserves the target speaker identity (iv) is aware of the phoneme duration of the target speaker.
1 code implementation • 5 Jun 2023 • Junling Liu, Peilin Zhou, Yining Hua, Dading Chong, Zhongyu Tian, Andrew Liu, Helin Wang, Chenyu You, Zhenhua Guo, Lei Zhu, Michael Lingzhi Li
To the best of our knowledge, CMExam is the first Chinese medical exam dataset to provide comprehensive medical annotations.
1 code implementation • 20 Jul 2022 • Dongchao Yang, Jianwei Yu, Helin Wang, Wen Wang, Chao Weng, Yuexian Zou, Dong Yu
In this study, we investigate generating sound conditioned on a text prompt and propose a novel text-to-sound generation framework that consists of a text encoder, a Vector Quantized Variational Autoencoder (VQ-VAE), a decoder, and a vocoder.
Ranked #13 on Audio Generation on AudioCaps
no code implementations • 23 May 2022 • Peilin Zhou, Dading Chong, Helin Wang, Qingcheng Zeng
The past ten years have witnessed the rapid development of text-based intent detection, whose benchmark performances have already been taken to a remarkable level by deep learning techniques.
Automatic Speech Recognition Automatic Speech Recognition (ASR) +2
no code implementations • 4 Jul 2021 • Zhiqi Huang, Fenglin Liu, Xian Wu, Shen Ge, Helin Wang, Wei Fan, Yuexian Zou
As a result, the proposed approach can handle various tasks including: Audio-Oriented Multimodal Machine Comprehension, Machine Reading Comprehension and Machine Listening Comprehension, in a single model, making fair comparisons possible between our model and the existing unimodal MC models.
no code implementations • 8 Apr 2021 • Jinchuan Tian, Rongzhi Gu, Helin Wang, Yuexian Zou
Transformer-based self-supervised models are trained as feature extractors and have empowered many downstream speech tasks to achieve state-of-the-art performance.
no code implementations • 31 Mar 2021 • Helin Wang, Yuexian Zou, Wenwu Wang
In this paper, we present SpecAugment++, a novel data augmentation method for deep neural networks based acoustic scene classification (ASC).
no code implementations • 31 Mar 2021 • Helin Wang, Bo Wu, LianWu Chen, Meng Yu, Jianwei Yu, Yong Xu, Shi-Xiong Zhang, Chao Weng, Dan Su, Dong Yu
In this paper, we exploit the effective way to leverage contextual information to improve the speech dereverberation performance in real-world reverberant environments.
no code implementations • 14 Dec 2019 • Helin Wang, Yuexian Zou, Dading Chong, Wenwu Wang
Convolutional neural networks (CNN) are one of the best-performing neural network architectures for environmental sound classification (ESC).
Acoustic Scene Classification Environmental Sound Classification +3