no code implementations • 15 May 2024 • Kejia Zhang, Lan Zhang, Haiwei Pan, Baolong Yu
Compared to diffusion models, consistency models can reduce the sampling times to once, not only achieving similar generative effects but also significantly speeding up training and prediction.
1 code implementation • 12 Apr 2024 • Zhe Li, Haiwei Pan, Kejia Zhang, Yuhua Wang, Fengming Yu
Multi-modality image fusion (MMIF) aims to integrate complementary information from different modalities into a single fused image to represent the imaging scene and facilitate downstream visual tasks comprehensively.
1 code implementation • 14 Aug 2023 • Kejia Zhang, Mingyu Yang, Stephen D. J. Lang, Alistair M. McInnes, Richard B. Sherley, Tilo Burghardt
In this paper, we publish an animal-borne underwater video dataset of penguins and introduce a ready-to-deploy deep learning system capable of robustly detecting penguins (mAP50@98. 0%) and also instances of fish (mAP50@73. 3%).
no code implementations • 3 Jan 2022 • Zhe Zhang, Shiyao Ma, Zhaohui Yang, Zehui Xiong, Jiawen Kang, Yi Wu, Kejia Zhang, Dusit Niyato
This emerging technology relies on sharing ground truth labeled data between Unmanned Aerial Vehicle (UAV) swarms to train a high-quality automatic image recognition model.
no code implementations • 7 Jul 2021 • Haiwei Pan, Shuning He, Kejia Zhang, Bo Qu, Chunling Chen, Kun Shi
Since most current medical VQA models focus on visual content, ignoring the importance of text, this paper proposes a multi-view attention-based model(MuVAM) for medical visual question answering which integrates the high-level semantics of medical images on the basis of text description.