1 code implementation • 15 Apr 2024 • Shu-wen Yang, Heng-Jui Chang, Zili Huang, Andy T. Liu, Cheng-I Lai, Haibin Wu, Jiatong Shi, Xuankai Chang, Hsiang-Sheng Tsai, Wen-Chin Huang, Tzu-hsun Feng, Po-Han Chi, Yist Y. Lin, Yung-Sung Chuang, Tzu-Hsien Huang, Wei-Cheng Tseng, Kushal Lakhotia, Shang-Wen Li, Abdelrahman Mohamed, Shinji Watanabe, Hung-Yi Lee
In this work, we establish the Speech processing Universal PERformance Benchmark (SUPERB) to study the effectiveness of the foundation model paradigm for speech.
no code implementations • 22 Mar 2023 • Yi-Shan Lee, Wei-Cheng Tseng, Fu-En Wang, Min Sun
We propose a content-based system for matching video and background music.
no code implementations • 1 Mar 2023 • Kai-Wei Chang, Yu-Kai Wang, Hua Shen, Iu-thing Kang, Wei-Cheng Tseng, Shang-Wen Li, Hung-Yi Lee
For speech processing, SpeechPrompt shows its high parameter efficiency and competitive performance on a few speech classification tasks.
Ranked #17 on Spoken Language Understanding on Fluent Speech Commands (using extra training data)
no code implementations • 24 Feb 2023 • Kuan-Po Huang, Tzu-hsun Feng, Yu-Kuan Fu, Tsu-Yuan Hsu, Po-Chieh Yen, Wei-Cheng Tseng, Kai-Wei Chang, Hung-Yi Lee
We applied two aggregation techniques, layerwise-average and layerwise-concatenation, to the representations of different teacher models and found that the former was more effective.
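The two aggregation techniques named in the entry above can be illustrated with a minimal sketch. It assumes each teacher model yields an array of shape (layers, frames, dim) and that all teachers share that shape. The function names, shapes, and random data are hypothetical, and the snippet is one plausible reading of the terms rather than the authors' implementation.

```python
import numpy as np


def layerwise_average(teacher_reps):
    """Average the teachers' representations layer by layer.

    Assumption: every array has shape (layers, frames, dim).
    """
    stacked = np.stack(teacher_reps, axis=0)  # (teachers, layers, frames, dim)
    return stacked.mean(axis=0)               # (layers, frames, dim)


def layerwise_concatenation(teacher_reps):
    """Concatenate the teachers' representations along the feature axis."""
    return np.concatenate(teacher_reps, axis=-1)  # (layers, frames, teachers * dim)


# Hypothetical data: two teachers, 3 layers, 5 frames, 4 features each.
rng = np.random.default_rng(0)
teacher_a = rng.standard_normal((3, 5, 4))
teacher_b = rng.standard_normal((3, 5, 4))

avg = layerwise_average([teacher_a, teacher_b])
cat = layerwise_concatenation([teacher_a, teacher_b])
print(avg.shape, cat.shape)
```

Under these assumptions, averaging keeps the feature dimension fixed as teachers are added, while concatenation grows it with the number of teachers.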
no code implementations • 7 Apr 2022 • Wei-Cheng Tseng, Wei-Tsung Kao, Hung-Yi Lee
Mean opinion score (MOS) is a typical subjective evaluation metric for speech synthesis systems.
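As a small illustration of the metric itself: MOS is the arithmetic mean of listener ratings, commonly collected on a 1 to 5 scale. The ratings and the function name below are hypothetical and are not taken from the paper.

```python
from statistics import mean


def mean_opinion_score(ratings):
    """Return the arithmetic mean of listener ratings (assumed 1-5 scale)."""
    if not ratings:
        raise ValueError("at least one rating is required")
    return mean(ratings)


# Hypothetical ratings from five listeners for one synthesized utterance.
ratings = [4, 5, 3, 4, 4]
mos = mean_opinion_score(ratings)
print(mos)
```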
1 code implementation • 31 Mar 2022 • Kai-Wei Chang, Wei-Cheng Tseng, Shang-Wen Li, Hung-Yi Lee
We report in this paper the first exploration of the prompt tuning paradigm for speech processing tasks based on Generative Spoken Language Model (GSLM).
no code implementations • 1 Feb 2022 • Wei-Cheng Tseng, Hung-Ju Liao, Lin Yen-Chen, Min Sun
We propose CLA-NeRF -- a Category-Level Articulated Neural Radiance Field that can perform view synthesis, part segmentation, and articulated pose estimation.
no code implementations • 14 Dec 2021 • Wei-Cheng Tseng, Wei Wei, Da-Cheng Juan, Min Sun
In real-world scenarios, the number of agents can grow, and an environment sometimes needs to interact with a changing number of agents.
no code implementations • 1 Dec 2021 • Wei-Cheng Tseng, Po-Han Chi, Jia-Hua Wu, Min Sun
In contrast, most of the existing methods delete the rare protein functions to reduce the label space.
1 code implementation • 9 Nov 2021 • Wei-Cheng Tseng, Wei-Tsung Kao, Hung-Yi Lee
Recently, adapting the idea of self-supervised learning (SSL) to continuous speech has started gaining attention.
5 code implementations • 3 May 2021 • Shu-wen Yang, Po-Han Chi, Yung-Sung Chuang, Cheng-I Jeff Lai, Kushal Lakhotia, Yist Y. Lin, Andy T. Liu, Jiatong Shi, Xuankai Chang, Guan-Ting Lin, Tzu-Hsien Huang, Wei-Cheng Tseng, Ko-tik Lee, Da-Rong Liu, Zili Huang, Shuyan Dong, Shang-Wen Li, Shinji Watanabe, Abdelrahman Mohamed, Hung-Yi Lee
SUPERB is a leaderboard to benchmark the performance of a shared model across a wide range of speech processing tasks with minimal architecture changes and labeled data.
6 code implementations • 7 Apr 2021 • Wei-Cheng Tseng, Chien-yu Huang, Wei-Tsung Kao, Yist Y. Lin, Hung-Yi Lee
In this paper, we use self-supervised pre-trained models for MOS prediction.
no code implementations • 4 Mar 2021 • Wei-Cheng Tseng, Jin-Siang Lin, Yao-Min Feng, Min Sun
We also design two regularization terms to improve the diversity and utilization rate of the primitives in the pre-training phase.