1 code implementation • 29 May 2024 • Yingqing He, Zhaoyang Liu, Jingye Chen, Zeyue Tian, Hongyu Liu, Xiaowei Chi, Runtao Liu, Ruibin Yuan, Yazhou Xing, Wenhai Wang, Jifeng Dai, Yong Zhang, Wei Xue, Qifeng Liu, Yike Guo, Qifeng Chen
With recent advances in large language models (LLMs), there is growing interest in combining LLMs with multimodal learning.
no code implementations • 27 Feb 2024 • Yazhou Xing, Yingqing He, Zeyue Tian, Xintao Wang, Qifeng Chen
Thus, instead of training such giant models from scratch, we propose to bridge existing strong models via a shared latent representation space.
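One way to read "bridging existing strong models via a shared latent space" is to train small adapters that project each frozen pretrained model's latent into a common space and align paired latents there. The sketch below is illustrative only, under that assumption; the names (`LatentBridge`, `alignment_loss`) and dimensions are hypothetical, not the paper's actual design.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class LatentBridge(nn.Module):
    """Hypothetical adapter: projects a frozen backbone's latent into a shared space."""

    def __init__(self, in_dim: int, shared_dim: int):
        super().__init__()
        self.proj = nn.Sequential(
            nn.Linear(in_dim, shared_dim),
            nn.GELU(),
            nn.Linear(shared_dim, shared_dim),
        )

    def forward(self, z: torch.Tensor) -> torch.Tensor:
        # L2-normalize so latents from different backbones live on the same sphere.
        return F.normalize(self.proj(z), dim=-1)

def alignment_loss(z_a: torch.Tensor, z_b: torch.Tensor) -> torch.Tensor:
    """Pull paired latents from two modalities together in the shared space."""
    return 1.0 - F.cosine_similarity(z_a, z_b, dim=-1).mean()
```

Only the lightweight bridges would be trained in such a setup; the large pretrained backbones stay frozen, which is what makes the approach cheaper than training a giant multimodal model from scratch.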
no code implementations • 2 Dec 2023 • Qiang Wen, Yazhou Xing, Zhefan Rao, Qifeng Chen
Specifically, to tailor the pre-trained latent diffusion model to operate on the RAW domain, we train a set of lightweight taming modules that inject RAW information into the diffusion denoising process by modulating the intermediate features of the UNet.
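A minimal sketch of what such feature modulation could look like, assuming a FiLM-style scale-and-shift applied to frozen UNet features; the module name (`RAWTamingModule`) and shapes are assumptions for illustration, not the authors' implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class RAWTamingModule(nn.Module):
    """Hypothetical lightweight taming module: predicts a per-channel scale and
    shift from RAW-derived features and applies them to a UNet feature map."""

    def __init__(self, raw_channels: int, feat_channels: int):
        super().__init__()
        # Small convolutional head mapping RAW features to scale/shift pairs.
        self.to_scale_shift = nn.Sequential(
            nn.Conv2d(raw_channels, feat_channels * 2, kernel_size=3, padding=1),
            nn.SiLU(),
            nn.Conv2d(feat_channels * 2, feat_channels * 2, kernel_size=1),
        )

    def forward(self, unet_feat: torch.Tensor, raw_feat: torch.Tensor) -> torch.Tensor:
        # Match the RAW features to the spatial size of the UNet feature map.
        raw_feat = F.interpolate(
            raw_feat, size=unet_feat.shape[-2:], mode="bilinear", align_corners=False
        )
        scale, shift = self.to_scale_shift(raw_feat).chunk(2, dim=1)
        # FiLM-style modulation: the pre-trained UNet stays frozen, only the
        # taming module learns how to steer its intermediate features.
        return unet_feat * (1 + scale) + shift
```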
no code implementations • 29 Aug 2023 • Yazhou Xing, Amrita Mazumdar, Anjul Patney, Chao Liu, Hongxu Yin, Qifeng Chen, Jan Kautz, Iuri Frosio
We present a learning-based system to reduce these artifacts without resorting to complex acquisition mechanisms like alternating exposures or costly processing that are typical of high dynamic range (HDR) imaging.
1 code implementation • 27 Jan 2022 • Chenyang Lei, Yazhou Xing, Hao Ouyang, Qifeng Chen
A progressive propagation strategy with pseudo labels is also proposed to enhance DVP's performance on video propagation.
1 code implementation • 7 Aug 2021 • Yingqing He, Yazhou Xing, Tianjia Zhang, Qifeng Chen
Qualitative and quantitative experiments on a real-world portrait shadow dataset demonstrate that our approach achieves performance comparable to supervised shadow removal methods.
1 code implementation • CVPR 2021 • Yazhou Xing, Zian Qian, Qifeng Chen
Unprocessed RAW data is a highly valuable image format for image editing and computer vision.
2 code implementations • NeurIPS 2020 • Chenyang Lei, Yazhou Xing, Qifeng Chen
Extensive quantitative and perceptual experiments show that our approach outperforms state-of-the-art methods on blind video temporal consistency.