no code implementations • 27 May 2024 • Xinyu Zhang, Mengxue Kang, Fei Wei, Shuang Xu, Yuhe Liu, Lin Ma
By providing the diffusion models with knowledge of the generated prompt and image mask, our models generate images with a superior understanding of instructions.
no code implementations • 18 Mar 2024 • Yuhe Liu, Mengxue Kang, Zengchang Qin, Xiangxiang Chu
Experiments show that our model achieves better logical performance and that the extracted logical knowledge can be effectively applied to other scenarios.
1 code implementation • 20 Jan 2024 • Yinchao Ma, Yuyang Tang, Wenfei Yang, Tianzhu Zhang, Jinpeng Zhang, Mengxue Kang
Single object tracking aims to locate the target object in a video sequence according to the state specified by different modal references, including the initial bounding box (BBOX), natural language (NL), or both (NL+BBOX).
no code implementations • ICCV 2023 • Mengxue Kang, Jinpeng Zhang, Jinming Zhang, Xiashuang Wang, Yang Chen, Zhe Ma, Xuhui Huang
However, previous work on feature distillation relies heavily on low-level feature information and under-explores the importance of high-level semantic information.