no code implementations • 16 Jun 2021 • Zhichao Wang, Xinyong Zhou, Fengyu Yang, Tao Li, Hongqiang Du, Lei Xie, Wendong Gan, Haitao Chen, Hai Li
Specifically, prosodic features are used to model prosody explicitly, while a VAE and a reference encoder model prosody implicitly, taking the Mel spectrum and bottleneck features as input, respectively.
no code implementations • 29 Oct 2019 • Xinyong Zhou, Hao Che, Xiaorui Wang, Lei Xie
In this paper, we present a cross-lingual voice cloning approach.