no code implementations • 5 Sep 2023 • Yichong Leng, Zhifang Guo, Kai Shen, Xu Tan, Zeqian Ju, Yanqing Liu, Yufei Liu, Dongchao Yang, Leying Zhang, Kaitao Song, Lei He, Xiang-Yang Li, Sheng Zhao, Tao Qin, Jiang Bian
TTS approaches based on text prompts face two main challenges: 1) the one-to-many problem, where not all details of voice variability can be described in a text prompt, and 2) the limited availability of text prompt datasets, since writing text prompts for speech requires hiring vendors and incurs a large data-labeling cost.
no code implementations • 23 Aug 2023 • Zhifang Guo, Jianguo Mao, Rui Tao, Long Yan, Kazushige Ouchi, Hong Liu, Xiangdong Wang
To address this issue, we propose a novel model that enhances the controllability of existing pre-trained text-to-audio models by incorporating additional conditions including content (timestamp) and style (pitch contour and energy contour) as supplements to the text.
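The conditioning idea above can be illustrated with a minimal sketch. This is not the paper's actual architecture; it only shows one plausible way frame-level contours (pitch, energy) could be projected and appended to a text embedding as extra conditioning tokens. The function `fuse_conditions`, the projection matrices, and all dimensions are hypothetical.

```python
import numpy as np

def fuse_conditions(text_emb, pitch_contour, energy_contour, dim=16):
    """Illustrative only: project frame-level pitch/energy contours to the
    text embedding dimension and concatenate them as extra condition tokens."""
    rng = np.random.default_rng(0)
    w_pitch = rng.standard_normal((1, dim))    # hypothetical pitch projection
    w_energy = rng.standard_normal((1, dim))   # hypothetical energy projection
    pitch_tokens = pitch_contour[:, None] @ w_pitch     # shape (T, dim)
    energy_tokens = energy_contour[:, None] @ w_energy  # shape (T, dim)
    # Append the condition tokens after the text tokens.
    return np.concatenate([text_emb, pitch_tokens, energy_tokens], axis=0)

text_emb = np.zeros((4, 16))           # 4 text tokens (placeholder embedding)
pitch = np.linspace(100.0, 200.0, 8)   # 8 frames of pitch (Hz)
energy = np.linspace(0.1, 0.9, 8)      # 8 frames of energy
cond = fuse_conditions(text_emb, pitch, energy)
print(cond.shape)  # (20, 16): 4 text tokens + 8 pitch + 8 energy tokens
```

In practice the combined sequence would feed a cross-attention or prefix-conditioning layer of the pre-trained text-to-audio model, but the concatenation step above is the gist of supplying supplementary conditions alongside the text.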
no code implementations • 22 Aug 2023 • Hualei Wang, Jianguo Mao, Zhifang Guo, Jiarui Wan, Hong Liu, Xiangdong Wang
Recently, the ability of language models (LMs) has attracted increasing attention in visual cross-modality.
no code implementations • 22 Nov 2022 • Zhifang Guo, Yichong Leng, Yihan Wu, Sheng Zhao, Xu Tan
Thus, we develop a text-to-speech (TTS) system (dubbed PromptTTS) that takes a prompt containing both style and content descriptions as input to synthesize the corresponding speech.
1 code implementation • 18 Oct 2022 • Yiming Li, Zhifang Guo, Zhirong Ye, Xiangdong Wang, Hong Liu, Yueliang Qian, Rui Tao, Long Yan, Kazushige Ouchi
For the frame-wise model, the ICT-TOSHIBA system of DCASE 2021 Task 4 is used.