no code implementations • ECCV 2020 • Sibei Yang, Guanbin Li, Yizhou Yu
Phrase level visual grounding aims to locate in an image the corresponding visual regions referred to by multiple noun phrases in a given sentence.
no code implementations • 24 Apr 2024 • Yang Liu, Binglin Chen, Yongsen Zheng, Guanbin Li, Liang Lin
Specifically, our ODMixer has a double-branch structure and involves the Channel Mixer, the Multi-view Mixer, and the Bidirectional Trend Learner.
no code implementations • 8 Apr 2024 • Jiacheng Zhang, Jie Wu, Yuxi Ren, Xin Xia, Huafeng Kuang, Pan Xie, Jiashi Li, Xuefeng Xiao, Weilin Huang, Min Zheng, Lean Fu, Guanbin Li
Diffusion models have revolutionized the field of image generation, leading to the proliferation of high-quality models and diverse downstream applications.
no code implementations • 26 Mar 2024 • Ganlong Zhao, Guanbin Li, Weikai Chen, Yizhou Yu
Recent advances in Iterative Vision-and-Language Navigation (IVLN) introduce a more meaningful and practical paradigm of VLN by maintaining the agent's memory across tours of scenes.
no code implementations • 26 Mar 2024 • Jiacheng Zhang, Jiaming Li, Xiangru Lin, Wei zhang, Xiao Tan, Junyu Han, Errui Ding, Jingdong Wang, Guanbin Li
Additionally, we present a DepthGradient Projection (DGP) module to mitigate optimization conflicts caused by noisy depth supervision of pseudo-labels, effectively decoupling the depth gradient and removing conflicting gradients.
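As a rough illustration of what "removing conflicting gradients" can mean, the sketch below projects one gradient onto the plane orthogonal to another whenever their dot product is negative. This is the generic projection rule used in gradient-surgery style methods, not the DGP module itself; the function name `project_conflicting` and the toy vectors are assumptions made for the example.

```python
def dot(a, b):
    """Dot product of two equal-length vectors given as lists."""
    return sum(x * y for x, y in zip(a, b))


def project_conflicting(g_depth, g_ref):
    """Remove from g_depth the component that opposes g_ref.

    If the two gradients do not conflict (non-negative dot product),
    g_depth is returned unchanged. Hypothetical helper, not the paper's API.
    """
    inner = dot(g_depth, g_ref)
    norm_sq = dot(g_ref, g_ref)
    if inner >= 0 or norm_sq == 0:
        return list(g_depth)
    scale = inner / norm_sq
    # Subtract the projection of g_depth onto g_ref.
    return [gd - scale * gr for gd, gr in zip(g_depth, g_ref)]


# Toy gradients: the first component of g_depth points against g_ref.
g_ref = [1.0, 0.0]
g_depth = [-1.0, 2.0]
projected = project_conflicting(g_depth, g_ref)
```

After projection, the result is orthogonal to `g_ref`, so a step along it no longer works against the reference objective.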
no code implementations • 26 Mar 2024 • Jiahao Chen, Yipeng Qin, Lingjie Liu, Jiangbo Lu, Guanbin Li
Neural Radiance Field (NeRF) has been widely recognized for its excellence in novel view synthesis and 3D scene reconstruction.
1 code implementation • ICCV 2023 • Jiaming Li, Xiangru Lin, Wei zhang, Xiao Tan, YingYing Li, Junyu Han, Errui Ding, Jingdong Wang, Guanbin Li
To tackle the confirmation bias from incorrect pseudo labels of minority classes, the class-rebalancing sampling module resamples unlabeled data following the guidance of the gradient-based reweighting module.
no code implementations • 21 Mar 2024 • Duojun Huang, Xinyu Xiong, De-Jun Fan, Feng Gao, Xiao-Jian Wu, Guanbin Li
To minimize annotation costs, we propose a deep active learning framework for annotation-efficient polyp segmentation.
no code implementations • 14 Mar 2024 • Xinyu Xiong, Churan Wang, Wenxue Li, Guanbin Li
Accurate identification of breast masses is crucial in diagnosing breast cancer; however, it can be challenging due to their small size and being camouflaged in surrounding normal glands.
no code implementations • 9 Mar 2024 • Hairong Shi, Songhao Han, Shaofei Huang, Yue Liao, Guanbin Li, Xiangxing Kong, Hua Zhu, Xiaomu Wang, Si Liu
Tumor lesion segmentation on CT or MRI images plays a critical role in cancer diagnosis and treatment planning.
no code implementations • 23 Feb 2024 • Junlin Xie, Zhihong Chen, Ruifei Zhang, Xiang Wan, Guanbin Li
In this paper, we conduct a systematic review of LLM-driven multimodal agents, which we refer to as large multimodal agents (LMAs for short).
1 code implementation • 20 Feb 2024 • Junjia Huang, Haofeng Li, Xiang Wan, Guanbin Li
The recognition of multi-class cell nuclei can significantly facilitate the process of histopathological diagnosis.
1 code implementation • 20 Feb 2024 • Wei Lou, Guanbin Li, Xiang Wan, Haofeng Li
Nuclei classification is a critical step in computer-aided diagnosis with histopathology images.
1 code implementation • 1 Feb 2024 • Yang Liu, Xinshuai Song, Kaixuan Jiang, Weixing Chen, Jingzhou Luo, Guanbin Li, Liang Lin
To overcome this limitation, we introduce the Multimodal Embodied Interactive Agent (MEIA), capable of translating high-level tasks expressed in natural language into a sequence of executable actions.
no code implementations • 26 Jan 2024 • Jingyu Zhuang, Di Kang, Yan-Pei Cao, Guanbin Li, Liang Lin, Ying Shan
To this end, we propose a 3D scene editing framework, TIPEditor, that accepts both text and image prompts and a 3D bounding box to specify the editing region.
no code implementations • 21 Jan 2024 • Jichang Li, Guanbin Li, Yizhou Yu
However, existing SSDA work fails to make full use of label information from both source and target domains for feature alignment across domains, resulting in label mismatch in the label space during model testing.
Semi-supervised Domain Adaptation • Unsupervised Domain Adaptation
no code implementations • 21 Jan 2024 • Jichang Li, Guanbin Li, Yizhou Yu
Once the graph has been refined, Adaptive Betweenness Clustering is introduced to facilitate semantic transfer by using across-domain betweenness clustering and within-domain betweenness clustering, thereby propagating semantic label information from labeled samples across domains to unlabeled target data.
1 code implementation • 10 Jan 2024 • Yuncheng Jiang, Zixun Zhang, Yiwen Hu, Guanbin Li, Xiang Wan, Song Wu, Shuguang Cui, Silin Huang, Zhen Li
Accurate polyp detection is critical for early colorectal cancer diagnosis.
no code implementations • 1 Jan 2024 • Jingyu Zhuang, Kuo Wang, Liang Lin, Guanbin Li
Credible Teacher adopts an interactive teaching mechanism using flexible labels to prevent uncertain pseudo labels from misleading the model and gradually reduces its uncertainty through the guidance of other credible pseudo labels.
no code implementations • 22 Dec 2023 • Chaowei Fang, Ziyin Zhou, Junye Chen, Hanjing Su, Qingyao Wu, Guanbin Li
We introduce a novel method, Variance-Insensitive and Target-Preserving Mask Refinement, to enhance segmentation quality with fewer user inputs.
no code implementations • 22 Dec 2023 • Yicheng Leng, Chaowei Fang, Gen Li, Yixiang Fang, Guanbin Li
Visible watermarks, while instrumental in protecting image copyrights, frequently distort the underlying content, complicating tasks like scene interpretation and image editing.
1 code implementation • 19 Dec 2023 • Jichang Li, Guanbin Li, Hui Cheng, Zicheng Liao, Yizhou Yu
However, these prior methods do not learn noise filters by exploiting knowledge across all clients, leading to sub-optimal and inferior noise filtering performance and thus damaging training stability.
no code implementations • 12 Dec 2023 • Haiming Zhang, Zhihao Yuan, Chaoda Zheng, Xu Yan, Baoyuan Wang, Guanbin Li, Song Wu, Shuguang Cui, Zhen Li
Our proposed GSmoothFace model mainly consists of the Audio to Expression Prediction (A2EP) module and the Target Adaptive Face Translation (TAFT) module.
1 code implementation • 7 Dec 2023 • Yong liu, Sule Bai, Guanbin Li, Yitong Wang, Yansong Tang
We attribute this to the in-vocabulary embedding and domain-biased CLIP prediction.
no code implementations • 4 Dec 2023 • Runze He, Shaofei Huang, Xuecheng Nie, Tianrui Hui, Luoqi Liu, Jiao Dai, Jizhong Han, Guanbin Li, Si Liu
In this paper, we target the adaptive source driven 3D scene editing task by proposing a CustomNeRF model that unifies a text description or a reference image as the editing prompt.
1 code implementation • 22 Oct 2023 • Xinyi Yu, Guanbin Li, Wei Lou, SiQi Liu, Xiang Wan, Yan Chen, Haofeng Li
Therefore, augmenting a dataset with only a few labeled images to improve the segmentation performance is of significant research and application value.
1 code implementation • ICCV 2023 • Junjia Huang, Haofeng Li, Xiang Wan, Guanbin Li
Multi-class cell nuclei detection is a fundamental prerequisite in the diagnosis of histopathology.
1 code implementation • 22 Oct 2023 • Junjia Huang, Haofeng Li, Weijun Sun, Xiang Wan, Guanbin Li
Automatic nuclei detection and classification can produce effective information for disease diagnosis.
1 code implementation • 22 Oct 2023 • Wei Lou, Xinyi Yu, Chenyu Liu, Xiang Wan, Guanbin Li, SiQi Liu, Haofeng Li
Afterward, we train a separate segmentation model for each category using the images in the corresponding category.
no code implementations • 9 Oct 2023 • Guanqi Chen, Guanbin Li
Cardiac function assessment aims at predicting left ventricular ejection fraction (LVEF) given an echocardiogram video, which requires models to focus on the changes in the left ventricle during the cardiac cycle.
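For reference, the quantity being predicted has a simple standard definition: the fraction of the end-diastolic volume that is ejected by end-systole. The sketch below computes it from two volumes; the function name and the example volumes are assumptions for illustration, and the listed paper estimates LVEF from video rather than from measured volumes.

```python
def ejection_fraction(edv_ml, esv_ml):
    """Left ventricular ejection fraction in percent.

    edv_ml: end-diastolic volume (mL), esv_ml: end-systolic volume (mL).
    Standard definition: (EDV - ESV) / EDV * 100.
    """
    if edv_ml <= 0:
        raise ValueError("end-diastolic volume must be positive")
    if not 0 <= esv_ml <= edv_ml:
        raise ValueError("end-systolic volume must lie between 0 and EDV")
    return (edv_ml - esv_ml) / edv_ml * 100.0


# Illustrative volumes only, not data from any study.
lvef = ejection_fraction(edv_ml=120.0, esv_ml=50.0)
```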
1 code implementation • 3 Sep 2023 • Yuhao Du, Yuncheng Jiang, Shuangyi Tan, Xusheng Wu, Qi Dou, Zhen Li, Guanbin Li, Xiang Wan
Colonoscopy analysis, particularly automatic polyp segmentation and detection, is essential for assisting clinical diagnosis and treatment.
1 code implementation • 22 Aug 2023 • Tao Chen, Ze Lin, Hui Li, Jiayi Ji, Yiyi Zhou, Guanbin Li, Rongrong Ji
Furthermore, we model product attributes based on both text and image modalities so that multi-modal product characteristics can be manifested in the generated summaries.
no code implementations • 20 Aug 2023 • Dongjian Huo, Zehong Zhang, Hanjing Su, Guanbin Li, Chaowei Fang, Qingyao Wu
Existing watermark removal methods mainly rely on UNet with task-specific decoder branches: one for watermark localization and the other for background image restoration.
1 code implementation • CVPR 2023 • Zhihong Chen, Ruifei Zhang, Yibing Song, Xiang Wan, Guanbin Li
Therefore, in this paper, we propose a novel benchmark of Scene Knowledge-guided Visual Grounding (SK-VG), where the image content and referring expressions are not sufficient to ground the target objects, forcing the models to have a reasoning ability on the long-form scene knowledge.
1 code implementation • CVPR 2023 • Duojun Huang, Jichang Li, Weikai Chen, Junshi Huang, Zhenhua Chai, Guanbin Li
To accommodate active learning and domain adaptation, two naturally different tasks, in a collaborative framework, we advocate that a customized learning strategy for the target data is the key to the success of ADA solutions.
1 code implementation • ICCV 2023 • Zunnan Xu, Zhihong Chen, Yong Zhang, Yibing Song, Xiang Wan, Guanbin Li
Parameter Efficient Tuning (PET) has gained attention for reducing the number of parameters while maintaining performance and providing better hardware resource savings, but few studies investigate dense prediction tasks and interaction between modalities.
Ranked #2 on Referring Expression Segmentation on RefCOCO
2 code implementations • CVPR 2023 • Ganlong Zhao, Guanbin Li, Yipeng Qin, Yizhou Yu
In this paper, we propose a novel dataset condensation method based on distribution matching, which is more efficient and promising.
1 code implementation • ICCV 2023 • Hong Yan, Yang Liu, Yushen Wei, Zhen Li, Guanbin Li, Liang Lin
Moreover, these methods ignore how to utilize the fine-grained dependencies among different skeleton joints to pre-train an efficient skeleton sequence learning model that can generalize well across different datasets.
3 code implementations • CVPR 2023 • Jiacheng Zhang, Xiangru Lin, Wei zhang, Kuo Wang, Xiao Tan, Junyu Han, Errui Ding, Jingdong Wang, Guanbin Li
Specifically, we propose a Stage-wise Hybrid Matching strategy that combines the one-to-many assignment and one-to-one assignment strategies to improve the training efficiency of the first stage and thus provide high-quality pseudo labels for the training of the second stage.
no code implementations • 7 Jul 2023 • Zizheng Yan, Yushuang Wu, Yipeng Qin, Xiaoguang Han, Shuguang Cui, Guanbin Li
In this paper, we introduce a realistic and challenging domain adaptation problem called Universal Semi-supervised Model Adaptation (USMA), which i) requires only a pre-trained source model, ii) allows the source and target domain to have different label sets, i.e., they share a common label set and hold their own private label set, and iii) requires only a few labeled samples in each class of the target domain.
no code implementations • 30 Jun 2023 • Ganlong Zhao, Guanbin Li, Yipeng Qin, Jinjin Zhang, Zhenhua Chai, Xiaolin Wei, Liang Lin, Yizhou Yu
In this paper, we address a complex but practical scenario in semi-supervised learning (SSL) named open-set SSL, where unlabeled data contain both in-distribution (ID) and out-of-distribution (OOD) samples.
2 code implementations • 30 Jun 2023 • Yang Liu, Weixing Chen, Guanbin Li, Liang Lin
We present CausalVLR (Causal Visual-Linguistic Reasoning), an open-source toolbox containing a rich set of state-of-the-art causal relation discovery and causal inference methods for various visual-linguistic reasoning tasks, such as VQA, image/video captioning, medical report generation, model generalization and robustness, etc.
1 code implementation • 23 Jun 2023 • Jingyu Zhuang, Chen Wang, Lingjie Liu, Liang Lin, Guanbin Li
Neural fields have achieved impressive advancements in view synthesis and scene reconstruction.
1 code implementation • 13 Jun 2023 • Junfan Lin, Yuying Zhu, Lingbo Liu, Yang Liu, Guanbin Li, Liang Lin
1) The travel time of a vehicle is delayed feedback on the effectiveness of TSC policy at each traffic intersection since it is obtained after the vehicle has left the road network.
no code implementations • CVPR 2023 • Ricong Huang, Peiwen Lai, Yipeng Qin, Guanbin Li
In this work, we break these trade-offs with our novel parametric implicit face representation and propose a novel audio-driven facial reenactment framework that is controllable and can generate high-quality talking heads.
no code implementations • 6 Jun 2023 • Yuncheng Jiang, Zixun Zhang, Ruimao Zhang, Guanbin Li, Shuguang Cui, Zhen Li
YONA fully exploits the information of one previous adjacent frame and conducts polyp detection on the current frame without multi-frame collaborations.
no code implementations • 30 May 2023 • Yang Zhang, Lingbo Liu, Xinyu Xiong, Guanbin Li, Guoli Wang, Liang Lin
In this work, we propose a novel end-to-end wind power forecasting (WPF) model named Hierarchical Spatial-Temporal Transformer Network (HSTTN) to address long-term WPF problems.
1 code implementation • CVPR 2023 • Weizhi Zhong, Chaowei Fang, Yinqi Cai, Pengxu Wei, Gangming Zhao, Liang Lin, Guanbin Li
Prior landmark characteristics of the speaker's face are employed to make the generated landmarks coincide with the facial outline of the speaker.
2 code implementations • 7 May 2023 • Yushen Wei, Yang Liu, Hong Yan, Guanbin Li, Liang Lin
Our VCSR involves two essential modules: i) the Question-Guided Refiner (QGR) module, which refines consecutive video frames guided by the question semantics to obtain more representative segment features for causal front-door intervention; ii) the Causal Scene Separator (CSS) module, which discovers a collection of visual causal and non-causal scenes based on the visual-linguistic causal relevance and estimates the causal effect of the scene-separating intervention in a contrastive learning manner.
1 code implementation • CVPR 2023 • Yushuang Wu, Zizheng Yan, Ce Chen, Lai Wei, Xiao Li, Guanbin Li, Yihao Li, Shuguang Cui, Xiaoguang Han
Thus, we propose a new task, SCoDA, for the domain adaptation of real scan shape completion from synthetic data.
no code implementations • 17 Mar 2023 • Kuo Wang, Lingbo Liu, Yang Liu, Guanbin Li, Fan Zhou, Liang Lin
The prediction of traffic flow is a challenging yet crucial problem in spatial-temporal analysis, which has recently gained increasing interest.
2 code implementations • 16 Mar 2023 • Weixing Chen, Yang Liu, Ce Wang, Jiarui Zhu, Shen Zhao, Guanbin Li, Cheng-Lin Liu, Liang Lin
Medical report generation (MRG) is essential for computer-aided diagnosis and medication guidance, which can relieve the heavy burden of radiologists by automatically generating the corresponding medical reports according to the given radiology image.
no code implementations • 22 Feb 2023 • Wei Lou, Xiang Wan, Guanbin Li, Xiaoying Lou, Chenghang Li, Feng Gao, Haofeng Li
Next, we convert a histopathology image into a graph structure with nuclei as nodes, and build a graph neural network to embed the spatial distribution of nuclei into their representations.
1 code implementation • ICCV 2023 • Zhihong Chen, Shizhe Diao, Benyou Wang, Guanbin Li, Xiang Wan
Medical vision-and-language pre-training (Med-VLP) has shown promising improvements on many downstream medical tasks owing to its applicability to extracting generic representations from medical images and texts.
1 code implementation • 12 Jan 2023 • Ruifei Zhang, Guanbin Li, Zhen Li, Shuguang Cui, Dahong Qian, Yizhou Yu
To tackle these issues, we propose an adaptive context selection based encoder-decoder framework which is composed of a Local Context Attention (LCA) module, a Global Context Module (GCM), and an Adaptive Selection Module (ASM).
1 code implementation • 12 Jan 2023 • Ruifei Zhang, Sishuo Liu, Yizhou Yu, Guanbin Li
Since the two tasks rely on similar feature information, the unlabeled data effectively enhances the network's representation of the lesion regions and further improves the segmentation performance.
1 code implementation • 12 Jan 2023 • Ruifei Zhang, Peiwen Lai, Xiang Wan, De-Jun Fan, Feng Gao, Xiao-Jian Wu, Guanbin Li
Automatic and accurate polyp segmentation plays an essential role in early colorectal cancer diagnosis.
no code implementations • ICCV 2023 • Jie Ma, Chuan Wang, Yang Liu, Liang Lin, Guanbin Li
As a mainstream framework in the field of semi-supervised learning (SSL), self-training via pseudo labeling and its variants have witnessed impressive progress in semi-supervised semantic segmentation with the recent advance of deep neural networks.
no code implementations • ICCV 2023 • Ziyi Zhang, Weikai Chen, Chaowei Fang, Zhen Li, Lechao Chen, Liang Lin, Guanbin Li
Confidence-wise, we propose a novel sample selection strategy based on confidence representation voting instead of the widely-used small-loss criterion.
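For context, the small-loss criterion that this entry contrasts itself with can be sketched in a few lines: samples are ranked by their per-sample training loss and the lowest-loss fraction is treated as likely clean. This shows only the baseline criterion, not the confidence representation voting strategy proposed in the paper; `small_loss_selection`, `keep_ratio`, and the toy losses are assumptions for illustration.

```python
def small_loss_selection(losses, keep_ratio):
    """Return sorted indices of the keep_ratio fraction of samples
    with the smallest loss (treated as likely clean)."""
    if not 0.0 <= keep_ratio <= 1.0:
        raise ValueError("keep_ratio must be in [0, 1]")
    num_keep = int(len(losses) * keep_ratio)
    # Rank sample indices from smallest to largest loss.
    ranked = sorted(range(len(losses)), key=lambda i: losses[i])
    return sorted(ranked[:num_keep])


# Toy per-sample losses; samples 1 and 3 have large loss.
losses = [0.2, 2.5, 0.1, 1.7]
clean_idx = small_loss_selection(losses, keep_ratio=0.5)
```

The criterion rests on the observation that networks tend to fit clean samples before noisy ones, so it can misjudge hard but correctly labeled samples.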
1 code implementation • 20 Dec 2022 • Wei Lou, Haofeng Li, Guanbin Li, Xiaoguang Han, Xiang Wan
Recently, deep neural networks, which require a large number of annotated samples, have been widely applied in nuclei instance segmentation of H&E stained pathology images.
1 code implementation • 7 Dec 2022 • Jun Wei, Yiwen Hu, Guanbin Li, Shuguang Cui, S Kevin Zhou, Zhen Li
In practice, box annotations are applied to alleviate the over-fitting issue of previous polyp segmentation models, which generate fine-grained polyp area through the iterative boosted segmentation model.
no code implementations • 2 Dec 2022 • Lechao Cheng, Chaowei Fang, Dingwen Zhang, Guanbin Li, Gang Huang
It can model the feature space more comprehensively and reduce the dominance of head classes.
2 code implementations • 12 Nov 2022 • Ziyi Zhang, Weikai Chen, Hui Cheng, Zhen Li, Siyuan Li, Liang Lin, Guanbin Li
We investigate a practical domain adaptation task, called source-free unsupervised domain adaptation (SFUDA), where the source-pretrained model is adapted to the target domain without access to the source data.
Ranked #4 on Source-Free Domain Adaptation on VisDA-2017
1 code implementation • CVPR 2023 • Junfan Lin, Jianlong Chang, Lingbo Liu, Guanbin Li, Liang Lin, Qi Tian, Chang Wen Chen
During inference, instead of changing the motion generator, our method reformulates the input text into a masked motion as the prompt for the motion generator to "reconstruct" the motion.
1 code implementation • 20 Sep 2022 • Haofeng Li, Junjia Huang, Guanbin Li, Zhou Liu, Yihong Zhong, Yingying Chen, Yunfei Wang, Xiang Wan
Deep neural networks (DNNs) have been widely adopted in brain lesion detection and segmentation.
1 code implementation • 19 Sep 2022 • Junjia Huang, Haofeng Li, Guanbin Li, Xiang Wan
Self-supervised learning methods based on image patch reconstruction have witnessed great success in training auto-encoders, whose pre-trained weights can be transferred to fine-tune other downstream tasks of image understanding.
1 code implementation • 15 Sep 2022 • Zhihong Chen, Yuhao Du, Jinpeng Hu, Yang Liu, Guanbin Li, Xiang Wan, Tsung-Hui Chang
Besides, we conduct further analysis to better verify the effectiveness of different components of our approach and various settings of pre-training.
1 code implementation • 15 Sep 2022 • Zhihong Chen, Guanbin Li, Xiang Wan
Most existing methods mainly contain three elements: uni-modal encoders (i.e., a vision encoder and a language encoder), a multi-modal fusion module, and pretext tasks, with few studies considering the importance of medical domain expert knowledge and explicitly exploiting such knowledge to facilitate Med-VLP.
1 code implementation • 5 Aug 2022 • Jichang Li, Guanbin Li, Feng Liu, Yizhou Yu
Specifically, our method is divided into two steps: 1) Neighborhood Collective Noise Verification to separate all training samples into a clean or noisy subset, 2) Neighborhood Collective Label Correction to relabel noisy samples, and then auxiliary techniques are used to assist further model optimization.
1 code implementation • 31 Jul 2022 • Jiutao Yue, Haofeng Li, Pengxu Wei, Guanbin Li, Liang Lin
Since the frequency masking may not only destroy the adversarial perturbations but also affect the sharp details in a clean image, we further develop an adversarial sample classifier based on the frequency domain of images to determine whether to apply the proposed mask module.
1 code implementation • 29 Jul 2022 • Ganlong Zhao, Guanbin Li, Yipeng Qin, Feng Liu, Yizhou Yu
In this paper, we propose a two-stage clean samples identification method to address the aforementioned challenge.
Ranked #3 on Image Classification on Clothing1M (using extra training data)
2 code implementations • 26 Jul 2022 • Yang Liu, Guanbin Li, Liang Lin
Existing visual question answering methods often suffer from cross-modal spurious correlations and oversimplified event-level reasoning processes that fail to capture event temporality, causality, and dynamics spanning over the video.
1 code implementation • 2 Jul 2022 • Haifan Gong, Hui Cheng, Yifan Xie, Shuangyi Tan, Guanqi Chen, Fei Chen, Guanbin Li
Thyroid nodule classification aims at determining whether the nodule is benign or malignant based on a given ultrasound image.
no code implementations • 14 May 2022 • Wenhao Huang, Haifan Gong, huan zhang, Yu Wang, Haofeng Li, Guanbin Li, Hong Shen
CT-based bronchial tree analysis plays an important role in the computer-aided diagnosis for respiratory diseases, as it could provide structured information for clinicians.
1 code implementation • 9 May 2022 • Zizheng Yan, Yushuang Wu, Guanbin Li, Yipeng Qin, Xiaoguang Han, Shuguang Cui
Semi-supervised domain adaptation (SSDA) aims to apply knowledge learned from a fully labeled source domain to a scarcely labeled target domain.
Ranked #1 on Semi-supervised Domain Adaptation on VisDA2017
1 code implementation • CVPR 2022 • Xiaoqian Xu, Pengxu Wei, Weikai Chen, Mingzhi Mao, Liang Lin, Guanbin Li
To address this issue, we propose an unsupervised domain adaptation mechanism for real-world SR, named Dual ADversarial Adaptation (DADA), which only requires LR images in the target domain with available real paired data from a source camera.
no code implementations • 26 Apr 2022 • Yang Liu, Yushen Wei, Hong Yan, Guanbin Li, Liang Lin
Visual representation learning is ubiquitous in various real-world applications, including visual comprehension, video understanding, multi-modal analysis, human-computer interaction, and urban computing.
no code implementations • 7 Mar 2022 • Jingyu Zhuang, Ziliang Chen, Pengxu Wei, Guanbin Li, Liang Lin
In Open Set Domain Adaptation (OSDA), large amounts of target samples are drawn from the implicit categories that never appear in the source domain.
1 code implementation • CVPR 2022 • Zhihao Yuan, Xu Yan, Yinghong Liao, Yao Guo, Guanbin Li, Zhen Li, Shuguang Cui
Thus, a more faithful caption can be generated only using point clouds during the inference.
1 code implementation • 26 Feb 2022 • Pengxiang Yan, Ziyi Wu, Mengmeng Liu, Kun Zeng, Liang Lin, Guanbin Li
To relieve the burden of labor-intensive labeling, deep unsupervised SOD methods have been proposed to exploit noisy labels generated by handcrafted saliency methods.
no code implementations • 22 Feb 2022 • Yushuang Wu, Zizheng Yan, Shengcai Cai, Guanbin Li, Yizhou Yu, Xiaoguang Han, Shuguang Cui
Semantic segmentation of point clouds usually relies on dense annotation, which is exhausting and costly, so weakly supervised schemes with only sparse points annotated have attracted wide attention.
Representation Learning • Weakly supervised Semantic Segmentation
1 code implementation • 8 Feb 2022 • Xinkai Zhao, Chaowei Fang, De-Jun Fan, Xutao Lin, Feng Gao, Guanbin Li
Semi-supervised learning (SSL), which aims at leveraging a few labeled images and a large number of unlabeled images for network training, is beneficial for relieving the burden of data annotation in medical image segmentation.
no code implementations • 30 Nov 2021 • Lingbo Liu, Zewei Yang, Guanbin Li, Kuo Wang, Tianshui Chen, Liang Lin
Land remote sensing analysis is a crucial research topic in earth science.
no code implementations • 16 Oct 2021 • Yang Wu, Shirui Feng, Guanbin Li, Liang Lin
PEMR includes a "looking ahead" process, i.e., a visual feature extractor module that estimates feasible paths for gathering 3D navigational information, mimicking the human sense of direction.
1 code implementation • 29 Sep 2021 • Lingbo Liu, Mengmeng Liu, Guanbin Li, Ziyi Wu, Junfan Lin, Liang Lin
Furthermore, we take the road network feature as a query to capture the long-range spatial distribution of traffic flow with a transformer architecture.
no code implementations • ICCV 2021 • Junkai Huang, Chaowei Fang, Weikai Chen, Zhenhua Chai, Xiaolin Wei, Pengxu Wei, Liang Lin, Guanbin Li
Open-set semi-supervised learning (open-set SSL) investigates a challenging but practical scenario where out-of-distribution (OOD) samples are contained in the unlabeled data.
1 code implementation • ICCV 2021 • Zunzhi You, Yi-Hsuan Tsai, Wei-Chen Chiu, Guanbin Li
Based on our observations, we quantify the interpretability of a deep MDE network by the depth selectivity of its hidden units.
no code implementations • 9 Aug 2021 • Jie Wu, Wei zhang, Guanbin Li, Wenhao Wu, Xiao Tan, YingYing Li, Errui Ding, Liang Lin
In this paper, we introduce a novel task, referred to as Weakly-Supervised Spatio-Temporal Anomaly Detection (WSSTAD) in surveillance video.
no code implementations • 5 Aug 2021 • Qin Wang, Hui Che, Weizhen Ding, Li Xiang, Guanbin Li, Zhen Li, Shuguang Cui
Thus, we propose a novel framework based on a teacher-student architecture for accurate colorectal polyp classification (CPC) by directly using white-light (WL) colonoscopy images in the examination.
1 code implementation • 2 Jul 2021 • Lingbo Liu, Yuying Zhu, Guanbin Li, Ziyi Wu, Lei Bai, Liang Lin
In this work, we propose a novel neural network module termed Heterogeneous Information Aggregation Machine (HIAM), which fully exploits heterogeneous information of historical data (e.g., incomplete OD matrices, unfinished order vectors, and DO matrices) to jointly learn the evolutionary patterns of OD and DO ridership.
1 code implementation • CVPR 2021 • Sibei Yang, Meng Xia, Guanbin Li, Hong-Yu Zhou, Yizhou Yu
In this paper, we tackle the challenge by jointly performing compositional visual reasoning and accurate segmentation in a single stage via the proposed novel Bottom-Up Shift (BUS) and Bidirectional Attentive Refinement (BIAR) modules.
1 code implementation • 15 May 2021 • Si Liu, Tianrui Hui, Shaofei Huang, Yunchao Wei, Bo Li, Guanbin Li
In this paper, we propose a Cross-Modal Progressive Comprehension (CMPC) scheme to effectively mimic human behaviors and implement it as a CMPC-I (Image) module and a CMPC-V (Video) module to improve referring image and video segmentation models.
Ranked #7 on Referring Expression Segmentation on J-HMDB
no code implementations • CVPR 2021 • Tianrui Hui, Shaofei Huang, Si Liu, Zihan Ding, Guanbin Li, Wenguan Wang, Jizhong Han, Fei Wang
Though 3D convolutions are amenable to recognizing which actor is performing the queried actions, they also inevitably introduce misaligned spatial information from adjacent frames, which confuses features of the target frame and yields inaccurate segmentation.
Ranked #8 on Referring Expression Segmentation on J-HMDB
2 code implementations • CVPR 2021 • Jichang Li, Guanbin Li, Yemin Shi, Yizhou Yu
Pseudo labeling expands the number of "labeled" samples in each class in the target domain, and thus produces a more robust and powerful cluster core for each class to facilitate adversarial learning.
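The basic mechanism of pseudo labeling can be sketched as follows: unlabeled target samples whose most confident class probability clears a threshold are assigned that class as a label. This is a generic confidence-threshold variant under assumed names (`pseudo_label`, `threshold`) and toy probabilities; it does not reproduce the selection rule or the adversarial training used in the paper.

```python
def pseudo_label(probabilities, threshold=0.9):
    """Return (sample_index, predicted_class) pairs for every sample
    whose top class probability is at least `threshold`."""
    selected = []
    for idx, probs in enumerate(probabilities):
        # Index of the most probable class for this sample.
        best_class = max(range(len(probs)), key=lambda c: probs[c])
        if probs[best_class] >= threshold:
            selected.append((idx, best_class))
    return selected


# Toy class probabilities for three unlabeled target samples.
target_probs = [
    [0.95, 0.03, 0.02],  # confident: class 0
    [0.40, 0.35, 0.25],  # uncertain: skipped
    [0.05, 0.92, 0.03],  # confident: class 1
]
new_labels = pseudo_label(target_probs, threshold=0.9)
```

A higher threshold yields fewer but cleaner pseudo labels, while a lower one adds more samples at the risk of reinforcing wrong predictions.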
no code implementations • 7 Apr 2021 • Xinkai Zhao, Chaowei Fang, Feng Gao, De-Jun Fan, Xutao Lin, Guanbin Li
In this paper, we propose a deep model to ground the shooting range of the small intestine in a capsule endoscope video that lasts tens of hours.
no code implementations • CVPR 2021 • Xiangru Lin, Guanbin Li, Yizhou Yu
Intuitively, we comprehend the semantics of the instruction to form an overview of where a bathroom is and what a blue towel is in mind; then, we navigate to the target location by consistently matching the bathroom appearance in mind with the current scene.
no code implementations • ICCV 2021 • Yushuang Wu, Zizheng Yan, Xiaoguang Han, Guanbin Li, Changqing Zou, Shuguang Cui
The key point of language-guided person search is to construct the cross-modal association between visual and textual input.
no code implementations • 1 Jan 2021 • Hongjun Wang, Guanbin Li, Liang Lin
To protect the security of machine learning models against adversarial examples, adversarial training becomes the most popular and powerful strategy against various adversarial attacks by injecting adversarial examples into training data.
1 code implementation • CVPR 2021 • Lingbo Liu, Jiaqi Chen, Hefeng Wu, Guanbin Li, Chenglong Li, Liang Lin
Extensive experiments conducted on the RGBT-CC benchmark demonstrate the effectiveness of our framework for RGBT crowd counting.
1 code implementation • 10 Nov 2020 • Zongheng Tang, Yue Liao, Si Liu, Guanbin Li, Xiaojie Jin, Hongxu Jiang, Qian Yu, Dong Xu
HC-STVG is a video grounding task that requires both spatial (where) and temporal (when) localization.
no code implementations • 15 Oct 2020 • Hongjun Wang, Guanbin Li, Xiaobai Liu, Liang Lin
Although deep convolutional neural networks (CNNs) have demonstrated remarkable performance on multiple computer vision tasks, research on adversarial learning has shown that deep models are vulnerable to adversarial examples, which are crafted by adding visually imperceptible perturbations to the input images.
no code implementations • 9 Oct 2020 • Gangming Zhao, Chaowei Fang, Guanbin Li, Licheng Jiao, Yizhou Yu
Aimed at improving the performance of existing detection methods, we propose a deep end-to-end module to exploit the contralateral context information for enhancing feature representations of disease proposals.
1 code implementation • ECCV 2020 • Tianrui Hui, Si Liu, Shaofei Huang, Guanbin Li, Sansi Yu, Faxi Zhang, Jizhong Han
Referring image segmentation aims to predict the foreground mask of the object referred to by a natural language sentence.
1 code implementation • CVPR 2020 • Shaofei Huang, Tianrui Hui, Si Liu, Guanbin Li, Yunchao Wei, Jizhong Han, Luoqi Liu, Bo Li
In addition to the CMPC module, we further leverage a simple yet effective TGFE module to integrate the reasoned multimodal features from different levels with the guidance of textual information.
Ranked #14 on Referring Expression Segmentation on RefCOCO testB
no code implementations • 18 Sep 2020 • Jie Wu, Guanbin Li, Xiaoguang Han, Liang Lin
Temporal grounding of natural language in untrimmed videos is a fundamental yet challenging multimedia task facilitating cross-media visual content retrieval.
no code implementations • 17 Sep 2020 • Haofeng Li, Yirui Zeng, Guanbin Li, Liang Lin, Yizhou Yu
The field of computer vision has witnessed phenomenal progress in recent years partially due to the development of deep convolutional neural networks.
1 code implementation • ECCV 2020 • Ganlong Zhao, Guanbin Li, Ruijia Xu, Liang Lin
Domain adaptation for object detection tries to adapt the detector from labeled datasets to unlabeled ones for better performance.
1 code implementation • 1 Sep 2020 • Yang Liu, Keze Wang, Guanbin Li, Liang Lin
In this paper, we propose a novel framework, named Semantics-aware Adaptive Knowledge Distillation Networks (SAKDN), to enhance action recognition in vision-sensor modality (videos) by adaptively transferring and distilling the knowledge from multiple wearable sensors.
1 code implementation • CVPR 2020 • Sibei Yang, Guanbin Li, Yizhou Yu
The linguistic structure of a referring expression provides a layout of reasoning over the visual contents, and it is often crucial to align and jointly understand the image and the referring expression.
1 code implementation • ECCV 2020 • Lingteng Qiu, Xuanye Zhang, Yan-ran Li, Guanbin Li, Xiao-Jun Wu, Zixiang Xiong, Xiaoguang Han, Shuguang Cui
Although occlusion widely exists in nature and remains a fundamental challenge for pose estimation, existing heatmap-based approaches suffer serious degradation on occlusions.
2 code implementations • 23 Mar 2020 • Lingbo Liu, Jiaqi Chen, Hefeng Wu, Tianshui Chen, Guanbin Li, Liang Lin
Crowd counting is an application-oriented task and its inference efficiency is crucial for real-world applications.
no code implementations • 22 Jan 2020 • Haofeng Li, Guanbin Li, BinBin Yang, Guanqi Chen, Liang Lin, Yizhou Yu
The proposed algorithm for the first time achieves competitive accuracy and high inference efficiency simultaneously with a single CPU thread.
1 code implementation • 18 Jan 2020 • Jie Wu, Guanbin Li, Si Liu, Liang Lin
Temporal language grounding in untrimmed videos is a newly raised task in video understanding.
2 code implementations • 14 Jan 2020 • Lingbo Liu, Jingwen Chen, Hefeng Wu, Jiajie Zhen, Guanbin Li, Liang Lin
To address this problem, we model a metro system as graphs with various topologies and propose a unified Physical-Virtual Collaboration Graph Network (PVCGN), which can effectively learn the complex ridership patterns from the tailor-designed graphs.
no code implementations • 18 Dec 2019 • Jihan Yang, Ruijia Xu, Ruiyu Li, Xiaojuan Qi, Xiaoyong Shen, Guanbin Li, Liang Lin
In contrast to adversarial alignment, we propose to explicitly train a domain-invariant classifier by generating and defending against pointwise feature space adversarial perturbations.
no code implementations • 23 Nov 2019 • Chaowei Fang, Guanbin Li, Xiaoguang Han, Yizhou Yu
It further recurrently exploits the reconstructed results and intermediate features of a sequence of preceding frames to improve the initial super-resolution of the current frame by modelling the coherence of structural facial features across frames.
no code implementations • 23 Nov 2019 • Chaowei Fang, Guanbin Li, Chengwei Pan, Yiming Li, Yizhou Yu
Recently, 3D volumetric organ segmentation has attracted much research interest in medical image analysis due to its significance in computer-aided diagnosis.
1 code implementation • 21 Nov 2019 • Riquan Chen, Tianshui Chen, Xiaolu Hui, Hefeng Wu, Guanbin Li, Liang Lin
In this work, we represent the semantic correlations in the form of a structured knowledge graph and integrate this graph into deep neural networks to promote few-shot learning by a novel Knowledge Graph Transfer Network (KGTN).
no code implementations • 25 Sep 2019 • Defa Zhu, Si Liu, Wentao Jiang, Guanbin Li, Tianyi Wu, Guodong Guo
Visual relationship recognition models are limited in the ability to generalize from finite seen predicates to unseen ones.
no code implementations • ICCV 2019 • Sibei Yang, Guanbin Li, Yizhou Yu
In this paper, we explore the problem of referring expression comprehension from the perspective of language-driven visual reasoning, and propose a dynamic graph attention network to perform multi-step reasoning by modeling both the relationships among the objects in the image and the linguistic structure of the expression.
2 code implementations • ICCV 2019 • Haofeng Li, Guanqi Chen, Guanbin Li, Yizhou Yu
In this paper, we develop a multi-task motion guided video salient object detection network, which learns to accomplish two sub-tasks using two sub-networks, one sub-network for salient object detection in still images and the other for motion saliency detection in optical flow images.
no code implementations • CVPR 2020 • Yue Liao, Si Liu, Guanbin Li, Fei Wang, Yanjie Chen, Chen Qian, Bo Li
RCCF reformulates the referring expression comprehension as a correlation filtering process.
2 code implementations • 2 Sep 2019 • Lingbo Liu, Jiajie Zhen, Guanbin Li, Geng Zhan, Zhaocheng He, Bowen Du, Liang Lin
Specifically, the first ConvLSTM unit takes normal traffic flow features as input and generates a hidden state at each time-step, which is further fed into the connected convolutional layer for spatial attention map inference.
no code implementations • ICCV 2019 • Zhanghui Kuang, Yiming Gao, Guanbin Li, Ping Luo, Yimin Chen, Liang Lin, Wayne Zhang
To address this issue, we propose a novel Graph Reasoning Network (GRNet) on a Similarity Pyramid, which learns similarities between a query and a gallery cloth by using both global and local representations in multiple scales.
Ranked #4 on Image Retrieval on DeepFashion - Consumer-to-shop (Rank-1 metric)
no code implementations • ICCV 2019 • Lingbo Liu, Zhilin Qiu, Guanbin Li, Shufan Liu, Wanli Ouyang, Liang Lin
Automatic estimation of the number of people in unconstrained crowded scenes is a challenging task and one major difficulty stems from the huge scale variation of people.
1 code implementation • ICCV 2019 • Pengxiang Yan, Guanbin Li, Yuan Xie, Zhen Li, Chuan Wang, Tianshui Chen, Liang Lin
Specifically, we present an effective video saliency detector that consists of a spatial refinement network and a spatiotemporal module.
Ranked #1 on Video Salient Object Detection on VOS-T (using extra training data)
no code implementations • ICCV 2019 • Yi He, Jiayuan Shi, Chuan Wang, Haibin Huang, Jiaming Liu, Guanbin Li, Risheng Liu, Jue Wang
In this paper we present a new data-driven method for robust skin detection from a single human portrait image.
1 code implementation • 8 Jul 2019 • Ziliang Chen, Zhanfu Yang, Xiaoxi Wang, Xiaodan Liang, Xiaopeng Yan, Guanbin Li, Liang Lin
A broad range of cross-$m$-domain generation research boils down to matching a joint distribution by deep generative models (DGMs).
1 code implementation • CVPR 2019 • Sibei Yang, Guanbin Li, Yizhou Yu
Unfortunately, existing work on grounding referring expressions fails to accurately extract multi-order relationships from the referring expression and associate them with the objects and their related contexts in the image.
no code implementations • 15 May 2019 • Lingbo Liu, Zhilin Qiu, Guanbin Li, Qing Wang, Wanli Ouyang, Liang Lin
Finally, a GCC module is applied to model the correlation between all regions by computing a global correlation feature as a weighted sum of all regional features, with the weights being calculated as the similarity between the corresponding region pairs.
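The weighted-sum idea in the sentence above can be illustrated with a minimal sketch. This is not the authors' GCC module: the function name `global_correlation_features` is hypothetical, and the choice of similarity (a dot product normalised with a softmax) is an assumption, since the excerpt does not say how the similarity is computed.

```python
import math


def global_correlation_features(regions):
    """For each region, return a similarity-weighted sum of all regional features.

    `regions` is a list of equal-length feature vectors (lists of floats).
    Assumption: similarity is a dot product normalised with a softmax;
    the source excerpt does not specify the similarity function.
    """
    def dot(a, b):
        return sum(x * y for x, y in zip(a, b))

    outputs = []
    for query in regions:
        # Similarity between this region and every region (itself included).
        scores = [dot(query, other) for other in regions]
        # Softmax with max-subtraction for numerical stability.
        peak = max(scores)
        exps = [math.exp(s - peak) for s in scores]
        total = sum(exps)
        weights = [e / total for e in exps]
        # Weighted sum of all regional features, one dimension at a time.
        outputs.append([
            sum(w * other[d] for w, other in zip(weights, regions))
            for d in range(len(query))
        ])
    return outputs
```

Because the weights for each region sum to one, every output vector is a convex combination of the regional features; regions that are more similar to the query region contribute more to its global correlation feature.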
no code implementations • 9 May 2019 • Haofeng Li, Guanbin Li, Yizhou Yu
To our knowledge, this paper is the first one that mounts successful adversarial attacks on salient object detection models and verifies that adversarial samples are effective on a wide range of existing methods.
no code implementations • 4 May 2019 • Yukai Shi, Guanbin Li, Qingxing Cao, Keze Wang, Liang Lin
Face hallucination is a domain-specific super-resolution problem that aims to generate a high-resolution (HR) face image from a low-resolution (LR) input.
no code implementations • 22 Apr 2019 • Guanbin Li, Xin Zhu, Yirui Zeng, Qing Wang, Liang Lin
Specifically, by analyzing the symbiosis and mutual exclusion of AUs in various facial expressions, we organize the facial AUs in the form of a structured knowledge graph and integrate a Gated Graph Neural Network (GGNN) into a multi-scale CNN framework to propagate node information through the graph for generating enhanced AU representations.
no code implementations • 1 Apr 2019 • Kan Wu, Guanbin Li, Haofeng Li, Jianjun Zhang, Yizhou Yu
As a concrete example, a database of over 1.2 million visual objects has been built using the proposed method, and has been successfully used in various data-driven image applications.
no code implementations • 11 Dec 2018 • Weikai Chen, Xiaoguang Han, Guanbin Li, Chao Chen, Jun Xing, Yajie Zhao, Hao Li
Three-dimensional object recognition has recently achieved great progress thanks to the development of effective point cloud-based learning frameworks, such as PointNet and its extensions.
no code implementations • 10 Dec 2018 • Lingbo Liu, Guanbin Li, Yuan Xie, Yizhou Yu, Qing Wang, Liang Lin
In this paper, we propose a novel cascaded backbone-branches fully convolutional neural network (BB-FCN) for rapidly and accurately localizing facial landmarks in unconstrained and cluttered settings.
no code implementations • 4 Dec 2018 • Xu Cai, Yang Wu, Guanbin Li, Ziliang Chen, Liang Lin
FRAME (Filters, Random fields, And Maximum Entropy) is an energy-based descriptive model that synthesizes visual realism by capturing mutual patterns from structural input signals.
3 code implementations • ICCV 2019 • Ruijia Xu, Guanbin Li, Jihan Yang, Liang Lin
Domain adaptation enables the learner to safely generalize into novel environments by mitigating domain shifts across distributions.
Ranked #7 on Domain Adaptation on ImageCLEF-DA
no code implementations • 30 Oct 2018 • Guanbin Li, Yukang Gan, Hejun Wu, Nong Xiao, Liang Lin
In this paper, we address this problem by developing a Cross-Modal Attentional Context (CMAC) learning framework, which enables the full exploitation of the context information from both RGB and depth data.
no code implementations • 10 Oct 2018 • Lili Huang, Jiefeng Peng, Ruimao Zhang, Guanbin Li, Liang Lin
Semantic image parsing, which refers to the process of decomposing images into semantic regions and constructing the structure representation of the input, has recently aroused widespread interest in the field of computer vision.
no code implementations • 1 Sep 2018 • Lingbo Liu, Ruimao Zhang, Jiefeng Peng, Guanbin Li, Bowen Du, Liang Lin
Traffic flow prediction is crucial for urban traffic management and public safety.
no code implementations • 4 Aug 2018 • Guanbin Li, Xiang He, Wei Zhang, Huiyou Chang, Le Dong, Liang Lin
Single image rain streaks removal has recently witnessed substantial progress due to the development of deep convolutional neural networks.
no code implementations • 2 Jul 2018 • Lingbo Liu, Hongjun Wang, Guanbin Li, Wanli Ouyang, Liang Lin
Crowd counting from unconstrained scene images is a crucial task in many real-world applications like urban surveillance and management, but it is greatly challenged by the camera's perspective that causes huge appearance variations in people's scales and rotations.
no code implementations • CVPR 2018 • Xian Wu, Guanbin Li, Qingxing Cao, Qingge Ji, Liang Lin
Automatically describing open-domain videos with natural language is attracting increasing interest in the field of artificial intelligence.
no code implementations • CVPR 2018 • Guanbin Li, Yuan Xie, Tianhao Wei, Keze Wang, Liang Lin
Image saliency detection has recently witnessed significant progress due to deep convolutional neural networks.
Ranked #2 on Video Salient Object Detection on DAVSOD-Difficult20 (using extra training data)
no code implementations • CVPR 2018 • Qingxing Cao, Xiaodan Liang, Bailing Li, Guanbin Li, Liang Lin
This network comprises two collaborative modules: i) an adversarial attention module to exploit the local visual evidence for each word parsed from the question; ii) a residual composition module to compose the previously mined evidence.
no code implementations • 30 Mar 2018 • Guanbin Li, Yizhou Yu
In this paper, we develop hybrid contrast-oriented deep neural networks to overcome the aforementioned limitations.
no code implementations • 17 Mar 2018 • Guanbin Li, Yuan Xie, Liang Lin
Our algorithm is based on alternately exploiting a graphical model and training a fully convolutional network for model updating.
no code implementations • 21 Dec 2017 • Haofeng Li, Guanbin Li, Liang Lin, Yizhou Yu
Our proposed GAN-based framework consists of a fully convolutional design for the generator which helps to better preserve spatial structures and a joint loss function with a revised perceptual loss to capture high-level semantics in the context.
no code implementations • 20 Dec 2017 • Tianshui Chen, Zhouxia Wang, Guanbin Li, Liang Lin
Recognizing multiple labels of images is a fundamental but challenging task in computer vision, and remarkable progress has been attained by localizing semantic-aware image regions and predicting their labels with deep convolutional neural networks.
no code implementations • ICCV 2017 • Zhouxia Wang, Tianshui Chen, Guanbin Li, Ruijia Xu, Liang Lin
This paper proposes a novel deep architecture to address multi-label image recognition, a fundamental and practical task towards general visual understanding.
no code implementations • CVPR 2017 • Qingxing Cao, Liang Lin, Yukai Shi, Xiaodan Liang, Guanbin Li
Face hallucination is a domain-specific super-resolution problem with the goal to generate high-resolution (HR) faces from low-resolution (LR) input images.
no code implementations • CVPR 2017 • Guanbin Li, Yuan Xie, Liang Lin, Yizhou Yu
Image saliency detection has recently witnessed rapid progress due to deep convolutional neural networks.
Ranked #15 on RGB Salient Object Detection on DUTS-TE (max F-measure metric)
2 code implementations • 7 Sep 2016 • Guanbin Li, Yizhou Yu
The penultimate layer of our neural network has been confirmed to be a discriminative high-level feature vector for saliency detection, which we call deep contrast feature.
no code implementations • CVPR 2016 • Guanbin Li, Yizhou Yu
Our deep network consists of two complementary components, a pixel-level fully convolutional stream and a segment-wise spatial pooling stream.
Ranked #19 on RGB Salient Object Detection on DUTS-TE (max F-measure metric)
no code implementations • CVPR 2015 • Guanbin Li, Yizhou Yu
Visual saliency is a fundamental problem in both cognitive and computational sciences, including computer vision.