no code implementations • 6 May 2024 • Jinwei Han, Yingguo Gao, Zhiwen Lin, Ke Yan, Shouhong Ding, Yuan Gao, Gui-Song Xia
Specifically, we introduce a Dual Attention Block (DAB) for visual-semantic relationship mining, which enriches visual information by multi-level feature fusion and conducts spatial attention for visual to semantic embedding.
no code implementations • 9 Apr 2024 • Jinwei Han, Zhiwen Lin, Zhongyisun Sun, Yingguo Gao, Ke Yan, Shouhong Ding, Yuan Gao, Gui-Song Xia
Specifically, two types of anchors are elaborated in our method, including i) text-compensated anchor which uses the images from the finetune set but enriches the text supervision from a pretrained captioner, ii) image-text-pair anchor which is retrieved from the dataset similar to pretraining data of CLIP according to the downstream task, associating with the original CLIP text with rich semantics.
no code implementations • CVPR 2022 • Xixia Xu, Yingguo Gao, Ke Yan, Xue Lin, Qi Zou
We reformulate the regression-based HPE from the perspective of classification.
1 code implementation • CVPR 2021 • Xingjia Pan, Yingguo Gao, Zhiwen Lin, Fan Tang, WeiMing Dong, Haolei Yuan, Feiyue Huang, Changsheng Xu
Weakly supervised object localization(WSOL) remains an open problem given the deficiency of finding object extent information using a classification network.