no code implementations • 15 May 2024 • Jiaxing Yang, Lihe Zhang, Jiayu Sun, Huchuan Lu
In this paper, we propose Spatial Semantic Recurrent Mining (S²RM) to achieve high-quality cross-modality fusion.
1 code implementation • 2 May 2024 • Xiaoqi Zhao, Youwei Pang, Wei Ji, Baicheng Sheng, Jiaming Zuo, Lihe Zhang, Huchuan Lu
Unlike context-independent (CI) concepts such as humans, cars, and airplanes, context-dependent (CD) concepts, such as camouflaged objects and medical lesions, require a higher level of visual understanding.
2 code implementations • 11 Apr 2024 • Qian Yu, Xiaoqi Zhao, Youwei Pang, Lihe Zhang, Huchuan Lu
Dichotomous Image Segmentation (DIS) has recently emerged as a task aimed at high-precision object segmentation from high-resolution natural images.
Ranked #1 on Dichotomous Image Segmentation on DIS-VD
no code implementations • 28 Feb 2024 • Mengnan Zhao, Lihe Zhang, Yuqiu Kong, BaoCai Yin
To tackle this issue, we first use the feature activation differences between clean and adversarial examples to analyze the underlying causes of catastrophic overfitting (CO). Intriguingly, our findings reveal that CO can be attributed to the feature coverage induced by a few specific pathways.
no code implementations • 3 Feb 2024 • Mengnan Zhao, Lihe Zhang, Tianhang Zheng, Yuqiu Kong, BaoCai Yin
Large-scale diffusion models, known for their impressive image generation capabilities, have raised concerns among researchers regarding social impacts, such as the imitation of copyrighted artistic styles.
no code implementations • 9 Dec 2023 • Mengnan Zhao, Lihe Zhang, Yuqiu Kong, BaoCai Yin
It enhances the initial instance positions through weighted farthest point sampling and further refines the instance positions and proposals using aggregation averaging and center matching.
1 code implementation • 5 Dec 2023 • Xiaoqi Zhao, Youwei Pang, Zhenyu Chen, Qian Yu, Lihe Zhang, Hanqi Liu, Jiaming Zuo, Huchuan Lu
We conduct a comprehensive study on a new task named power battery detection (PBD), which aims to localize the endpoints of the dense cathode and anode plates in X-ray images to evaluate the quality of power batteries.
no code implementations • 19 Nov 2023 • Youwei Pang, Xiaoqi Zhao, Jiaming Zuo, Lihe Zhang, Huchuan Lu
With the proposed dataset and baseline, we hope that this new task with more practical value can further expand the research on open-vocabulary dense prediction tasks.
1 code implementation • 31 Oct 2023 • Youwei Pang, Xiaoqi Zhao, Tian-Zhu Xiang, Lihe Zhang, Huchuan Lu
Apart from the high intrinsic similarity between camouflaged objects and their background, objects are usually diverse in scale, fuzzy in appearance, and even severely occluded.
Ranked #1 on Camouflaged Object Segmentation on Camouflaged Animal Dataset (using extra training data)
1 code implementation • ICCV 2023 • Fang Liu, Yuhao Liu, Yuqiu Kong, Ke Xu, Lihe Zhang, BaoCai Yin, Gerhard Hancke, Rynson Lau
Hence, we propose a novel weakly-supervised RIS framework to formulate the target localization problem as a classification process to differentiate between positive and negative text expressions.
1 code implementation • ICCV 2023 • Mengnan Zhao, Lihe Zhang, Yuqiu Kong, BaoCai Yin
To address this, we analyze the training process of prior fast adversarial training (FAT) work and observe that catastrophic overfitting is accompanied by the appearance of loss convergence outliers.
1 code implementation • 23 Jul 2023 • Youwei Pang, Xiaoqi Zhao, Lihe Zhang, Huchuan Lu
Specifically, unlike existing methods that over-specialize in a single task or a subset of tasks, ComPtr starts from the more general concept of bi-source dense prediction.
Ranked #14 on Semantic Segmentation on NYU Depth v2
1 code implementation • 4 Jun 2023 • Shijie Chang, Zeqi Hao, Ben Kang, Xiaoqi Zhao, Jiawen Zhu, Zhenyu Chen, Lihe Zhang, Lu Zhang, Huchuan Lu
In this paper, we introduce the 3rd place solution for the PVUW2023 VSS track.
2 code implementations • 20 Mar 2023 • Xiaoqi Zhao, Hongpeng Jia, Youwei Pang, Long Lv, Feng Tian, Lihe Zhang, Weibing Sun, Huchuan Lu
Next, we expand the single-scale subtraction unit (SU) to the intra-layer multi-scale SU, which can provide the decoder with both pixel-level and structure-level difference information.
1 code implementation • 18 Mar 2023 • Xiaoqi Zhao, Shijie Chang, Youwei Pang, Jiaxing Yang, Lihe Zhang, Huchuan Lu
In the static object predictor, the RGB source is simultaneously converted to depth and static saliency sources.
2 code implementations • 18 Mar 2023 • Xiaoqi Zhao, Youwei Pang, Lihe Zhang, Huchuan Lu, Lei Zhang
They ignore two key problems when the encoder exchanges information with the decoder: one is the lack of an interference control mechanism between them, and the other is the failure to consider the disparity in the contributions of different encoder levels.
1 code implementation • ICCV 2023 • Jiayu Sun, Ke Xu, Youwei Pang, Lihe Zhang, Huchuan Lu, Gerhard Hancke, Rynson Lau
In this paper, we propose a novel method to detect shadows from raw images.
no code implementations • 30 Mar 2022 • Guang Feng, Lihe Zhang, Zhiwei Hu, Huchuan Lu
To address this task, we first design a two-stream encoder to extract CNN-based visual features and transformer-based linguistic features hierarchically, and a vision-language mutual guidance (VLMG) module is inserted into the encoder multiple times to promote the hierarchical and progressive fusion of multi-modal features.
Ranked #3 on Referring Expression Segmentation on J-HMDB
1 code implementation • 9 Mar 2022 • Xiaoqi Zhao, Youwei Pang, Lihe Zhang, Huchuan Lu
In this paper, we propose a novel multi-task and multi-modal filtered transformer (MMFT) network for RGB-D salient object detection (SOD).
no code implementations • 8 Mar 2022 • Jiaxing Yang, Lihe Zhang, Huchuan Lu
In this work, we propose Atrous Transformer (AtrousFormer) to solve the problem.
Ranked #25 on Lane Detection on CULane
1 code implementation • 4 Dec 2021 • Youwei Pang, Xiaoqi Zhao, Lihe Zhang, Huchuan Lu
Most existing bi-modal (RGB-D and RGB-T) salient object detection methods rely on the convolution operation and construct complex interwoven fusion structures to achieve cross-modal information integration.
1 code implementation • 17 Oct 2021 • Mengnan Zhao, Lihe Zhang, Yuqiu Kong, BaoCai Yin
Specifically, the transient learning network considers transient memories as a static knowledge graph, and the time-aware recurrent evolution network learns representations through a sequence of recurrent evolution units from long-short-term memories.
1 code implementation • 24 Sep 2021 • Jiayu Sun, Zhanghan Ke, Lihe Zhang, Huchuan Lu, Rynson W. H. Lau
In this work, we observe that instead of asking the user to explicitly provide a background image, we may recover it from the input video itself.
1 code implementation • 11 Aug 2021 • Xiaoqi Zhao, Youwei Pang, Jiaxing Yang, Lihe Zhang, Huchuan Lu
In this paper, we propose a novel multi-source fusion network for zero-shot video object segmentation.
Ranked #1 on Video Object Segmentation on FBMS (Jaccard (Mean) metric)
2 code implementations • 11 Aug 2021 • Xiaoqi Zhao, Lihe Zhang, Huchuan Lu
Keywords: Colorectal Cancer, Automatic Polyp Segmentation, Subtraction, LossNet.
no code implementations • CVPR 2021 • Guang Feng, Zhiwei Hu, Lihe Zhang, Huchuan Lu
In this work, we propose an encoder fusion network (EFN), which transforms the visual encoder into a multi-modal feature learning network, and uses language to refine the multi-modal features progressively.
1 code implementation • 29 Jan 2021 • Xiaoqi Zhao, Youwei Pang, Lihe Zhang, Huchuan Lu, Xiang Ruan
Existing CNN-based RGB-D salient object detection (SOD) networks all need to be pretrained on ImageNet to learn hierarchical features that provide a good initialization.
1 code implementation • CVPR 2020 • Youwei Pang, Xiaoqi Zhao, Lihe Zhang, Huchuan Lu
To obtain more efficient multi-scale features from the integrated features, the self-interaction modules are embedded in each decoder unit.
3 code implementations • ECCV 2020 • Xiaoqi Zhao, Youwei Pang, Lihe Zhang, Huchuan Lu, Lei Zhang
With the help of multilevel gate units, the valuable context information from the encoder can be optimally transmitted to the decoder.
Ranked #15 on Dichotomous Image Segmentation on DIS-TE4
1 code implementation • ECCV 2020 • Xiaoqi Zhao, Lihe Zhang, Youwei Pang, Huchuan Lu, Lei Zhang
In this work, we design a single-stream network that directly uses the depth map to guide early fusion and middle fusion between RGB and depth, which removes the need for a separate depth-stream feature encoder and yields a lightweight, real-time model.
Ranked #15 on Thermal Image Segmentation on RGB-T-Glass-Segmentation
1 code implementation • ECCV 2020 • Youwei Pang, Lihe Zhang, Xiaoqi Zhao, Huchuan Lu
The main goal of RGB-D salient object detection (SOD) is to better integrate and utilize cross-modal fusion information.
Ranked #5 on RGB-D Salient Object Detection on NJU2K
1 code implementation • 24 Dec 2019 • Yanxing Wang, Jianxing Hu, Junyong Lai, Yibo Li, Hongwei Jin, Lihe Zhang, Liangren Zhang, Zhenming Liu
Molecular fingerprints are the workhorse in ligand-based drug discovery.
1 code implementation • ICCV 2019 • Yu Zeng, Yunzhi Zhuge, Huchuan Lu, Lihe Zhang
SSNet consists of a segmentation network (SN) and a saliency aggregation module (SAM).
1 code implementation • CVPR 2019 • Yu Zeng, Yunzhi Zhuge, Huchuan Lu, Lihe Zhang, Mingyang Qian, Yizhou Yu
To this end, we propose a unified framework to train saliency detection models with diverse weak supervision sources.
1 code implementation • CVPR 2018 • Yu Zeng, Huchuan Lu, Lihe Zhang, Mengyang Feng, Ali Borji
The categories and appearance of salient objects vary from image to image; therefore, saliency detection is an image-specific task.
no code implementations • CVPR 2018 • Tiantian Wang, Lihe Zhang, Shuo Wang, Huchuan Lu, Gang Yang, Xiang Ruan, Ali Borji
Moreover, to effectively recover object boundaries, we propose a local Boundary Refinement Network (BRN) to adaptively learn the local contextual information for each spatial position.
Ranked #13 on RGB Salient Object Detection on DUTS-TE
1 code implementation • ICCV 2017 • Tiantian Wang, Ali Borji, Lihe Zhang, Pingping Zhang, Huchuan Lu
To remedy this problem, here we propose to augment feedforward neural networks with a novel pyramid pooling module and a multi-stage refinement mechanism for saliency detection.
Ranked #14 on RGB Salient Object Detection on DUTS-TE (max F-measure metric)
no code implementations • CVPR 2013 • Chuan Yang, Lihe Zhang, Huchuan Lu, Xiang Ruan, Ming-Hsuan Yang
The saliency of the image elements is defined based on their relevance to the given seeds or queries.
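
As a rough illustration of scoring elements by their relevance to query seeds, the sketch below uses a graph-based ranking of the form f = (D - alpha * W)^(-1) y. This is an assumption chosen for illustration, not a formulation confirmed by the snippet above. The function name `rank_by_relevance`, the Gaussian affinity, and the parameters `alpha` and `sigma` are all hypothetical.

```python
import numpy as np


def rank_by_relevance(features, query_idx, alpha=0.99, sigma=0.1):
    """Score every element by its graph relevance to the query elements.

    Hypothetical sketch: builds a fully connected graph with Gaussian
    affinities and solves (D - alpha * W) f = y, where y marks the queries.
    """
    features = np.asarray(features, dtype=float)
    n = features.shape[0]

    # Pairwise squared Euclidean distances between element features.
    diff = features[:, None, :] - features[None, :, :]
    dist2 = np.sum(diff ** 2, axis=-1)

    # Affinity matrix W (no self-loops) and degree matrix D.
    W = np.exp(-dist2 / (2.0 * sigma ** 2))
    np.fill_diagonal(W, 0.0)
    D = np.diag(W.sum(axis=1))

    # Indicator vector: 1 for query (seed) elements, 0 elsewhere.
    y = np.zeros(n)
    y[list(query_idx)] = 1.0

    # Unnormalized ranking scores; higher means more relevant to the queries.
    return np.linalg.solve(D - alpha * W, y)


# Usage: three 1-D "elements"; element 0 is the query.
scores = rank_by_relevance([[0.0], [0.1], [1.0]], query_idx=[0],
                           alpha=0.9, sigma=0.5)
print(scores)
```

In this toy setup the element closest to the query in feature space receives a higher score than the distant one. A real pipeline would typically use superpixels as elements and a sparse neighborhood graph rather than a dense one.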