Search Results for author: Zhuofan Xia

Found 8 papers, 7 papers with code

GSVA: Generalized Segmentation via Multimodal Large Language Models

no code implementations15 Dec 2023 Zhuofan Xia, Dongchen Han, Yizeng Han, Xuran Pan, Shiji Song, Gao Huang

Generalized Referring Expression Segmentation (GRES) extends the scope of classic RES to refer to multiple objects in one expression or identify the empty targets absent in the image.

Decoder Generalized Referring Expression Segmentation +2

Agent Attention: On the Integration of Softmax and Linear Attention

2 code implementations14 Dec 2023 Dongchen Han, Tianzhu Ye, Yizeng Han, Zhuofan Xia, Shiji Song, Gao Huang

Specifically, the Agent Attention, denoted as a quadruple $(Q, A, K, V)$, introduces an additional set of agent tokens $A$ into the conventional attention module.

Computational Efficiency Image Classification +4

Generalized Activation via Multivariate Projection

1 code implementation29 Sep 2023 Jiayun Li, Yuxiao Cheng, Yiwen Lu, Zhuofan Xia, Yilin Mo, Gao Huang

Activation functions are essential to introduce nonlinearity into neural networks, with the Rectified Linear Unit (ReLU) often favored for its simplicity and effectiveness.

DAT++: Spatially Dynamic Vision Transformer with Deformable Attention

1 code implementation4 Sep 2023 Zhuofan Xia, Xuran Pan, Shiji Song, Li Erran Li, Gao Huang

On the one hand, using dense attention in ViT leads to excessive memory and computational cost, and features can be influenced by irrelevant parts that are beyond the region of interests.

Image Classification Instance Segmentation +2

Slide-Transformer: Hierarchical Vision Transformer with Local Self-Attention

1 code implementation CVPR 2023 Xuran Pan, Tianzhu Ye, Zhuofan Xia, Shiji Song, Gao Huang

Self-attention mechanism has been a key factor in the recent progress of Vision Transformer (ViT), which enables adaptive feature extraction from global contexts.

feature selection Inductive Bias

Adaptive Rotated Convolution for Rotated Object Detection

1 code implementation ICCV 2023 Yifan Pu, Yiru Wang, Zhuofan Xia, Yizeng Han, Yulin Wang, Weihao Gan, Zidong Wang, Shiji Song, Gao Huang

In our ARC module, the convolution kernels rotate adaptively to extract object features with varying orientations in different images, and an efficient conditional computation mechanism is introduced to accommodate the large orientation variations of objects within an image.

Ranked #3 on Object Detection In Aerial Images on DOTA (using extra training data)

Object object-detection +2

Vision Transformer with Deformable Attention

2 code implementations CVPR 2022 Zhuofan Xia, Xuran Pan, Shiji Song, Li Erran Li, Gao Huang

On the one hand, using dense attention e. g., in ViT, leads to excessive memory and computational cost, and features can be influenced by irrelevant parts which are beyond the region of interests.

Image Classification Object Detection +1

3D Object Detection with Pointformer

1 code implementation CVPR 2021 Xuran Pan, Zhuofan Xia, Shiji Song, Li Erran Li, Gao Huang

In this paper, we propose Pointformer, a Transformer backbone designed for 3D point clouds to learn features effectively.

3D Object Detection Object +2

Cannot find the paper you are looking for? You can Submit a new open access paper.