Search Results for author: Binqiang Huang

A Progressive Framework of Vision-language Knowledge Distillation and Alignment for Multilingual Scene

Pre-trained vision-language (V-L) models such as CLIP have shown excellent performance in many downstream cross-modal tasks.

Object counting is a challenging task with broad application prospects in security surveillance, traffic management, and disease diagnosis.
