Search Results for author: Wenhao Guan

Found 2 papers, 0 papers with code

FastOcc: Accelerating 3D Occupancy Prediction by Fusing the 2D Bird's-Eye View and Perspective View

no code implementations • 5 Mar 2024 • Jiawei Hou, Xiaoyan Li, Wenhao Guan, Gang Zhang, Di Feng, Yuheng Du, Xiangyang Xue, Jian Pu

In autonomous driving, 3D occupancy prediction outputs voxel-wise status and semantic labels, giving a more comprehensive understanding of 3D scenes than traditional perception tasks such as 3D object detection and bird's-eye view (BEV) semantic segmentation.

3D Object Detection • Autonomous Driving • +2


MM-TTS: Multi-modal Prompt based Style Transfer for Expressive Text-to-Speech Synthesis

no code implementations • 17 Dec 2023 • Wenhao Guan, Yishuang Li, Tao Li, Hukai Huang, Feng Wang, Jiayan Lin, Lingyan Huang, Lin Li, Qingyang Hong

The challenges of modeling such a multi-modal style controllable TTS mainly lie in two aspects: 1) aligning the multi-modal information into a unified style space to enable the input of arbitrary modality as the style prompt in a single system, and 2) efficiently transferring the unified style representation into the given text content, thereby empowering the ability to generate prompt style-related voice.

Speech Synthesis • Style Transfer • +1

