Search Results for author: Junyan Li

Found 5 papers, 2 papers with code

MultiPLY: A Multisensory Object-Centric Embodied Large Language Model in 3D World

no code implementations • 16 Jan 2024 • Yining Hong, Zishuo Zheng, Peihao Chen, Yian Wang, Junyan Li, Chuang Gan

Human beings possess the capability to multiply a melange of multisensory cues while actively exploring and interacting with the 3D world.

Language Modelling, Large Language Model

CoVLM: Composing Visual Entities and Relationships in Large Language Models Via Communicative Decoding

no code implementations • 6 Nov 2023 • Junyan Li, Delin Chen, Yining Hong, Zhenfang Chen, Peihao Chen, Yikang Shen, Chuang Gan

A communication token is generated by the LLM following a visual entity or a relation, to inform the detection network to propose regions that are relevant to the sentence generated so far.

CoLA Question Answering +5

Constraint-aware and Ranking-distilled Token Pruning for Efficient Transformer Inference

1 code implementation • 26 Jun 2023 • Junyan Li, Li Lyna Zhang, Jiahang Xu, Yujing Wang, Shaoguang Yan, Yunqing Xia, Yuqing Yang, Ting Cao, Hao Sun, Weiwei Deng, Qi Zhang, Mao Yang

Deploying pre-trained transformer models like BERT on downstream tasks in resource-constrained scenarios is challenging due to their high inference cost, which grows rapidly with input sequence length.

Model Compression

EfficientViT: Lightweight Multi-Scale Attention for High-Resolution Dense Prediction

no code implementations • ICCV 2023 • Han Cai, Junyan Li, Muyan Hu, Chuang Gan, Song Han

Without performance loss on Cityscapes, our EfficientViT provides up to 8.8x and 3.8x GPU latency reduction over SegFormer and SegNeXt, respectively.

Autonomous Driving, Super-Resolution

EfficientViT: Multi-Scale Linear Attention for High-Resolution Dense Prediction

5 code implementations • 29 May 2022 • Han Cai, Junyan Li, Muyan Hu, Chuang Gan, Song Han

Without performance loss on Cityscapes, our EfficientViT provides up to 13.9$\times$ and 6.2$\times$ GPU latency reduction over SegFormer and SegNeXt, respectively.

Ranked #24 on Semantic Segmentation on Cityscapes val

Autonomous Driving Image Classification +7
