1 code implementation • 13 Mar 2024 • Yatian Pang, Tanghui Jia, Yujun Shi, Zhenyu Tang, Junwu Zhang, Xinhua Cheng, Xing Zhou, Francis E. H. Tay, Li Yuan
To address this issue, we propose a novel cascade diffusion framework, which decomposes the challenging dense views generation task into two tractable stages, namely anchor views generation and anchor views interpolation.
no code implementations • 22 Feb 2024 • Bin Zhu, Munan Ning, Peng Jin, Bin Lin, Jinfa Huang, Qi Song, Junwu Zhang, Zhenyu Tang, Mingjun Pan, Xing Zhou, Li Yuan
In the multi-modal domain, the dependence of various models on specific input formats leads to user confusion and hinders progress.
2 code implementations • 29 Jan 2024 • Bin Lin, Zhenyu Tang, Yang Ye, Jiaxi Cui, Bin Zhu, Peng Jin, Jinfa Huang, Junwu Zhang, Munan Ning, Li Yuan
In this work, we propose MoE-Tuning, a simple yet effective training strategy for LVLMs.
Ranked #58 on Visual Question Answering on MM-Vet
1 code implementation • 20 Dec 2023 • Junwu Zhang, Zhenyu Tang, Yatian Pang, Xinhua Cheng, Peng Jin, Yida Wei, Munan Ning, Li Yuan
The core idea is to combine the powerful image generation capability of the 2D diffusion model with the texture alignment ability of the repainting strategy to generate high-quality, consistent multi-view images.
4 code implementations • 3 Oct 2023 • Bin Zhu, Bin Lin, Munan Ning, Yang Yan, Jiaxi Cui, Hongfa Wang, Yatian Pang, Wenhao Jiang, Junwu Zhang, Zongwei Li, Wancai Zhang, Zhifeng Li, Wei Liu, Li Yuan
We thus propose VIDAL-10M, which comprises Video, Infrared, Depth, Audio and their corresponding Language.
Ranked #1 on Zero-shot Audio Classification on VGG-Sound (using extra training data)
1 code implementation • 24 Jul 2022 • Junwu Zhang, Mang Ye, Yao Yang
We further propose a progressive training strategy to improve the performance, which iteratively upgrades the initial anonymization supervision.
1 code implementation • 27 Jan 2021 • Claire Chen, Krishnan Srinivasan, Jeffrey Zhang, Junwu Zhang
We use model-based trajectory optimization and control to plan and execute these primitives.
Robotics