Search Results for author: Jinliang Zheng

Found 3 papers, 2 papers with code

GLID: Pre-training a Generalist Encoder-Decoder Vision Model

no code implementations · 11 Apr 2024 · Jihao Liu, Jinliang Zheng, Yu Liu, Hongsheng Li

This paper proposes a GeneraLIst encoder-Decoder (GLID) pre-training method for better handling of various downstream computer vision tasks.

Tasks: Decoder · Depth Estimation · +6

DecisionNCE: Embodied Multimodal Representations via Implicit Preference Learning

1 code implementation · 28 Feb 2024 · Jianxiong Li, Jinliang Zheng, Yinan Zheng, Liyuan Mao, Xiao Hu, Sijie Cheng, Haoyi Niu, Jihao Liu, Yu Liu, Jingjing Liu, Ya-Qin Zhang, Xianyuan Zhan

Multimodal pretraining has emerged as an effective strategy for three core goals of representation learning in autonomous robots: 1) extracting both local and global task progression information; 2) enforcing temporal consistency of visual representations; 3) capturing trajectory-level language grounding.

Tasks: Contrastive Learning · Decision Making · +1

MixMAE: Mixed and Masked Autoencoder for Efficient Pretraining of Hierarchical Vision Transformers

1 code implementation · CVPR 2023 · Jihao Liu, Xin Huang, Jinliang Zheng, Yu Liu, Hongsheng Li

In this paper, we propose Mixed and Masked AutoEncoder (MixMAE), a simple but efficient pretraining method that is applicable to various hierarchical Vision Transformers.

Tasks: Image Classification · Object Detection · +2
