Search Results for author: Jinliang Zheng

Found 3 papers, 2 papers with code

GLID: Pre-training a Generalist Encoder-Decoder Vision Model

no code implementations · 11 Apr 2024 · Jihao Liu, Jinliang Zheng, Yu Liu, Hongsheng Li

This paper proposes a GeneraLIst encoder-Decoder (GLID) pre-training method for better handling of various downstream computer vision tasks.

Tasks: Decoder · Depth Estimation · +6

DecisionNCE: Embodied Multimodal Representations via Implicit Preference Learning

1 code implementation · 28 Feb 2024 · Jianxiong Li, Jinliang Zheng, Yinan Zheng, Liyuan Mao, Xiao Hu, Sijie Cheng, Haoyi Niu, Jihao Liu, Yu Liu, Jingjing Liu, Ya-Qin Zhang, Xianyuan Zhan

Multimodal pretraining has emerged as an effective strategy for three core goals of representation learning in autonomous robots: 1) extracting both local and global task progression information; 2) enforcing temporal consistency of visual representations; 3) capturing trajectory-level language grounding.

Tasks: Contrastive Learning · Decision Making · +1

MixMAE: Mixed and Masked Autoencoder for Efficient Pretraining of Hierarchical Vision Transformers

1 code implementation · CVPR 2023 · Jihao Liu, Xin Huang, Jinliang Zheng, Yu Liu, Hongsheng Li

In this paper, we propose Mixed and Masked AutoEncoder (MixMAE), a simple but efficient pretraining method that is applicable to various hierarchical Vision Transformers.

Tasks: Image Classification · Object Detection · +2
