1 code implementation • 25 Mar 2024 • Ye Li, Lingdong Kong, Hanjiang Hu, Xiaohao Xu, Xiaonan Huang
The robustness of driving perception systems under unprecedented conditions is crucial for safety-critical applications.
1 code implementation • 17 Mar 2024 • Xiaohao Xu, Yunkang Cao, Yongqi Chen, Weiming Shen, Xiaonan Huang
In addition, we unify the input representation of multi-modality into a 2D image format, enabling multi-modal anomaly detection and reasoning.
1 code implementation • 10 Mar 2024 • Huaxin Zhang, Xiang Wang, Xiaohao Xu, Xiaonan Huang, Chuchu Han, Yuehuan Wang, Changxin Gao, Shanjun Zhang, Nong Sang
In recent years, video anomaly detection has been extensively investigated in both unsupervised and weakly supervised settings to alleviate costly temporal labeling.
2 code implementations • 7 Mar 2024 • Xiang Li, Kai Qiu, Jinglu Wang, Xiaohao Xu, Rita Singh, Kashu Yamazaki, Hao Chen, Xiaonan Huang, Bhiksha Raj
Referring perception, which aims at grounding visual objects with multimodal referring guidance, is essential for bridging the gap between humans, who provide instructions, and the environment that intelligent systems perceive.
1 code implementation • 12 Feb 2024 • Xiaohao Xu, Tianyi Zhang, Sibo Wang, Xiang Li, Yongqi Chen, Ye Li, Bhiksha Raj, Matthew Johnson-Roberson, Xiaonan Huang
To this end, we propose a novel, customizable pipeline for noisy data synthesis, aimed at assessing the resilience of multi-modal SLAM models against various perturbations.
no code implementations • 29 Jan 2024 • Yunkang Cao, Xiaohao Xu, Jiangning Zhang, Yuqi Cheng, Xiaonan Huang, Guansong Pang, Weiming Shen
Visual Anomaly Detection (VAD) aims to pinpoint deviations from the concept of normality in visual data and is widely applied across diverse domains, e.g., industrial defect inspection and medical lesion detection.
1 code implementation • 16 Jan 2024 • Zhaoge Liu, Xiaohao Xu, Yunkang Cao, Weiming Shen
Knowledge distillation is the process of transferring knowledge from a more powerful large model (teacher) to a simpler counterpart (student).
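As a rough illustration of the teacher-to-student transfer described above, the following is a minimal sketch of the classical soft-target distillation loss (temperature-scaled softmax plus KL divergence). It is not the method of the listed paper; the function names, the temperature value, and the example logits are assumptions chosen for illustration.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax; a higher temperature gives softer targets."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL(teacher || student) on softened distributions (hypothetical helper).

    The T^2 factor is the usual rescaling that keeps gradient magnitudes
    comparable across temperatures.
    """
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    kl = sum(
        pt * math.log(pt / ps)
        for pt, ps in zip(p_teacher, p_student)
        if pt > 0.0
    )
    return kl * temperature ** 2


# Illustrative logits only: the student disagrees with the teacher on class 0.
teacher = [4.0, 1.0, 0.5]
student = [1.0, 1.2, 0.9]
loss = distillation_loss(student, teacher)
```

The loss is zero when the student reproduces the teacher's logits and grows as the two softened distributions diverge.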
1 code implementation • 25 Nov 2023 • Wenqiao Li, Xiaohao Xu, Yao Gu, Bozhong Zheng, Shenghua Gao, Yingna Wu
During testing, the point cloud repeatedly goes through the Mask Reconstruction Network, with each iteration's output becoming the next input.
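The test-time loop described above, in which each output is fed back as the next input, can be sketched as follows. The `reconstruct` callable is a hypothetical stand-in for the Mask Reconstruction Network, and the iteration count and toy point cloud are assumptions, not values from the paper.

```python
def iterative_reconstruction(points, reconstruct, num_iters=3):
    """Apply `reconstruct` repeatedly, feeding each output back as input."""
    current = points
    for _ in range(num_iters):
        current = reconstruct(current)
    return current


def toy_reconstruct(points):
    """Hypothetical stand-in for a trained network: halves each point's
    height above the z = 0 plane, pulling bumps toward a 'normal' surface."""
    return [(x, y, z * 0.5) for (x, y, z) in points]


# Toy point cloud: a flat surface with one protruding point.
cloud = [(0.0, 0.0, 0.0), (1.0, 0.0, 4.0)]
restored = iterative_reconstruction(cloud, toy_reconstruct, num_iters=3)
```

With this toy stand-in, the protruding point moves closer to the surface on every pass, which is the behavior the repeated reconstruction is meant to illustrate.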
no code implementations • 23 Nov 2023 • Xiaohao Xu
This work proposes a unified self-supervised pre-training framework for transferable multi-modal perception representation learning via masked multi-modal reconstruction in Neural Radiance Field (NeRF), namely NeRF-Supervised Masked AutoEncoder (NS-MAE).
1 code implementation • 5 Nov 2023 • Yunkang Cao, Xiaohao Xu, Chen Sun, Xiaonan Huang, Weiming Shen
This study explores the use of GPT-4V(ision), a powerful visual-linguistic model, to address anomaly detection tasks in a generic manner.
3 code implementations • 29 Sep 2023 • Xiang Li, Jinglu Wang, Xiaohao Xu, Xiulian Peng, Rita Singh, Yan Lu, Bhiksha Raj
We propose a semantic decomposition method based on product quantization, where the multi-source semantics can be decomposed and represented by several disentangled and noise-suppressed single-source semantics.
1 code implementation • 21 Sep 2023 • Shilin Yan, Xiaohao Xu, Renrui Zhang, Lingyi Hong, Wenchao Chen, Wenqiang Zhang, Wei Zhang
Our dataset poses new challenges in panoramic VOS and we hope that our PanoVOS can advance the development of panoramic segmentation/tracking.
1 code implementation • 24 Aug 2023 • Huaxin Zhang, Xiang Wang, Xiaohao Xu, Zhiwu Qing, Changxin Gao, Nong Sang
For snippet-level learning, we introduce an online-updated memory to store reliable snippet prototypes for each class.
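One way to picture an online-updated, per-class prototype memory is the sketch below. The update rule (an exponential moving average), the confidence threshold used to decide which snippets count as reliable, and all names are assumptions made for illustration; the snippet above does not specify them.

```python
class PrototypeMemory:
    """Keeps one running prototype vector per class (hypothetical design)."""

    def __init__(self, momentum=0.9, min_confidence=0.8):
        self.momentum = momentum              # weight kept from the old prototype
        self.min_confidence = min_confidence  # assumed reliability threshold
        self.prototypes = {}                  # class label -> feature vector

    def update(self, label, feature, confidence):
        """Blend a snippet feature into its class prototype if it is reliable.

        Returns True when the memory was updated, False when the snippet
        was skipped for low confidence.
        """
        if confidence < self.min_confidence:
            return False
        old = self.prototypes.get(label)
        if old is None:
            self.prototypes[label] = list(feature)
        else:
            m = self.momentum
            self.prototypes[label] = [
                m * o + (1.0 - m) * f for o, f in zip(old, feature)
            ]
        return True


memory = PrototypeMemory(momentum=0.5, min_confidence=0.8)
memory.update("open_door", [2.0, 0.0], confidence=0.9)   # initializes the prototype
memory.update("open_door", [0.0, 2.0], confidence=0.95)  # moving-average update
memory.update("open_door", [9.0, 9.0], confidence=0.1)   # skipped as unreliable
```

Skipping low-confidence snippets keeps noisy predictions from contaminating the stored prototypes, which is the role the word "reliable" plays in the description.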
Ranked #1 on Weakly Supervised Action Localization on BEOID
1 code implementation • 15 Jun 2023 • Yunkang Cao, Xiaohao Xu, Chen Sun, Yuqi Cheng, Liang Gao, Weiming Shen
This technical report introduces the winning solution of the team Segment Any Anomaly for the CVPR2023 Visual Anomaly and Novelty Detection (VAND) challenge.
2 code implementations • 18 May 2023 • Yunkang Cao, Xiaohao Xu, Chen Sun, Yuqi Cheng, Zongwei Du, Liang Gao, Weiming Shen
We present a novel framework, i.e., Segment Any Anomaly + (SAA+), for zero-shot anomaly segmentation with hybrid prompt regularization to improve the adaptability of modern foundation models.
Ranked #1 on Anomaly Detection on KSDD2
2 code implementations • 23 Mar 2023 • Yunkang Cao, Xiaohao Xu, Weiming Shen
The 3D and 2D modality features are aggregated to obtain the CPMF for PCD anomaly detection.
Ranked #1 on Depth Anomaly Detection and Segmentation on MVTEC 3D-AD (using extra training data)
3D Anomaly Detection and Segmentation • Depth Anomaly Detection and Segmentation
1 code implementation • IEEE Transactions on Industrial Informatics 2023 • Yunkang Cao, Xiaohao Xu, Zhaoge Liu, Weiming Shen
CDO introduces a margin optimization module and an overlap optimization module to optimize the two key factors determining the localization performance, i.e., the margin and the overlap between the discrepancy distributions (DDs) of normal and abnormal samples.
Ranked #1 on Anomaly Detection on MVTEC 3D-AD (using extra training data)
no code implementations • ICCV 2023 • Xiang Li, Jinglu Wang, Xiaohao Xu, Xiao Li, Bhiksha Raj, Yan Lu
Our model achieves state-of-the-art performance on the R-VOS benchmarks Ref-DAVIS17 and Ref-Youtube-VOS, as well as on our RRYTVOS dataset.
1 code implementation • 22 Jul 2022 • Xiaohao Xu, Zihao Du, Huaxin Zhang, Ruichao Zhang, Zihan Hong, Qin Huang, Bin Han
To study the effectiveness of our optimization algorithm, a dataset for mechanical maintenance tasks using FMG armbands with 16 sensors is collected.
no code implementations • 12 Jul 2022 • Xiang Li, Jinglu Wang, Xiaohao Xu, Bhiksha Raj, Yan Lu
We propose a robust context fusion network to tackle VIS in an online fashion, which predicts instance segmentation frame-by-frame with a few preceding frames.
1 code implementation • 4 Jul 2022 • Xiang Li, Jinglu Wang, Xiaohao Xu, Xiao Li, Bhiksha Raj, Yan Lu
Referring Video Object Segmentation (R-VOS) is a challenging task that aims to segment an object in a video based on a linguistic expression.
Ranked #11 on Referring Video Object Segmentation on Refer-YouTube-VOS
Referring Expression Segmentation • Referring Video Object Segmentation
1 code implementation • 2 Jul 2022 • Xiaohao Xu, Jinglu Wang, Xiang Ming, Yan Lu
We consolidate this conditional mask calibration process in a progressive manner, where the object representations and proto-masks evolve to be discriminative iteratively.
Ranked #1 on Visual Object Tracking on YouTube-VOS
1 code implementation • 6 Dec 2021 • Xiaohao Xu, Jinglu Wang, Xiao Li, Yan Lu
We introduce two modulators, a propagation modulator and a correction modulator, to separately perform channel-wise re-calibration on the target frame embeddings according to local temporal correlations and reliable references, respectively.
Ranked #3 on Video Object Segmentation on DAVIS 2017 (test-dev)