no code implementations • 2 May 2024 • Youquan Liu, Lingdong Kong, Xiaoyang Wu, Runnan Chen, Xin Li, Liang Pan, Ziwei Liu, Yuexin Ma
A unified and versatile LiDAR segmentation model with strong robustness and generalizability is desirable for safe autonomous driving perception.
1 code implementation • 21 Mar 2024 • Bohao Peng, Xiaoyang Wu, Li Jiang, Yukang Chen, Hengshuang Zhao, Zhuotao Tian, Jiaya Jia
This exploration led to the creation of Omni-Adaptive 3D CNNs (OA-CNNs), a family of networks that integrates a lightweight module to greatly enhance the adaptivity of sparse CNNs at minimal computational cost.
Ranked #5 on 3D Semantic Segmentation on SemanticKITTI (val mIoU metric)
1 code implementation • 14 Mar 2024 • Chengyao Wang, Li Jiang, Xiaoyang Wu, Zhuotao Tian, Bohao Peng, Hengshuang Zhao, Jiaya Jia
To address this issue, we propose GroupContrast, a novel approach that combines segment grouping and semantic-aware contrastive learning.
no code implementations • 23 Feb 2024 • Francis Engelmann, Ayca Takmaz, Jonas Schult, Elisabetta Fedele, Johanna Wald, Songyou Peng, Xi Wang, Or Litany, Siyu Tang, Federico Tombari, Marc Pollefeys, Leonidas Guibas, Hongbo Tian, Chunjie Wang, Xiaosheng Yan, Bingwen Wang, Xuanyang Zhang, Xiao Liu, Phuc Nguyen, Khoi Nguyen, Anh Tran, Cuong Pham, Zhening Huang, Xiaoyang Wu, Xi Chen, Hengshuang Zhao, Lei Zhu, Joan Lasenby
This report provides an overview of the challenge hosted at the OpenSUN3D Workshop on Open-Vocabulary 3D Scene Understanding held in conjunction with ICCV 2023.
3 code implementations • 15 Dec 2023 • Xiaoyang Wu, Li Jiang, Peng-Shuai Wang, Zhijian Liu, Xihui Liu, Yu Qiao, Wanli Ouyang, Tong He, Hengshuang Zhao
This paper does not aim to innovate within the attention mechanism.
Ranked #1 on 3D Semantic Segmentation on ScanNet++ (using extra training data)
no code implementations • 6 Dec 2023 • Yunhan Yang, Yukun Huang, Xiaoyang Wu, Yuan-Chen Guo, Song-Hai Zhang, Hengshuang Zhao, Tong He, Xihui Liu
However, due to the lack of information from multiple views, these works encounter difficulties in generating controllable novel views.
1 code implementation • 5 Dec 2023 • Zhangyang Qi, Ye Fang, Zeyi Sun, Xiaoyang Wu, Tong Wu, Jiaqi Wang, Dahua Lin, Hengshuang Zhao
Multimodal Large Language Models (MLLMs) have excelled in 2D image-text comprehension and image generation, but their understanding of the 3D world is notably deficient, limiting progress in 3D language understanding and generation.
1 code implementation • 12 Oct 2023 • Honghui Yang, Sha Zhang, Di Huang, Xiaoyang Wu, Haoyi Zhu, Tong He, Shixiang Tang, Hengshuang Zhao, Qibo Qiu, Binbin Lin, Xiaofei He, Wanli Ouyang
In the context of autonomous driving, the significance of effective feature learning is widely acknowledged.
1 code implementation • 12 Oct 2023 • Haoyi Zhu, Honghui Yang, Xiaoyang Wu, Di Huang, Sha Zhang, Xianglong He, Hengshuang Zhao, Chunhua Shen, Yu Qiao, Tong He, Wanli Ouyang
In this paper, we introduce a novel universal 3D pre-training framework designed to facilitate the acquisition of efficient 3D representation, thereby establishing a pathway to 3D foundational models.
Ranked #2 on Semantic Segmentation on ScanNet (using extra training data)
1 code implementation • 1 Sep 2023 • Zhening Huang, Xiaoyang Wu, Xi Chen, Hengshuang Zhao, Lei Zhu, Joan Lasenby
When integrated with powerful 2D open-world models such as ODISE and GroundingDINO, excellent results were observed on open-vocabulary instance segmentation.
3D Open-Vocabulary Instance Segmentation, 3D Open-Vocabulary Object Detection
1 code implementation • 18 Aug 2023 • Xiaoyang Wu, Zhuotao Tian, Xin Wen, Bohao Peng, Xihui Liu, Kaicheng Yu, Hengshuang Zhao
In contrast, such privilege has not yet fully benefited 3D deep learning, mainly due to the limited availability of large-scale 3D datasets.
Ranked #3 on 3D Semantic Segmentation on SemanticKITTI (val mIoU metric, using extra training data)
1 code implementation • CVPR 2023 • Jiahui Liu, Chirui Chang, Jianhui Liu, Xiaoyang Wu, Lan Ma, Xiaojuan Qi
Unlike the single-scan-based semantic segmentation task, this task requires distinguishing the motion states of points in addition to their semantic categories.
no code implementations • 27 Jun 2023 • Bohao Peng, Zhuotao Tian, Xiaoyang Wu, Chengyao Wang, Shu Liu, Jingyong Su, Jiaya Jia
We hope our work can benefit broader industrial applications in which novel classes with limited annotations must be identified reasonably well.
1 code implementation • 6 Jun 2023 • Yunhan Yang, Xiaoyang Wu, Tong He, Hengshuang Zhao, Xihui Liu
In this work, we propose SAM3D, a novel framework that is able to predict masks in 3D point clouds by leveraging the Segment-Anything Model (SAM) in RGB images without further training or finetuning.
no code implementations • 2 Jun 2023 • Zhangyang Qi, Jiaqi Wang, Xiaoyang Wu, Hengshuang Zhao
Multi-view 3D object detection is becoming popular in autonomous driving due to its high effectiveness and low cost.
1 code implementation • CVPR 2023 • Bohao Peng, Zhuotao Tian, Xiaoyang Wu, Chengyao Wang, Shu Liu, Jingyong Su, Jiaya Jia
Few-shot semantic segmentation (FSS) aims to build class-agnostic models that segment unseen classes from only a handful of annotations.
Ranked #7 on Few-Shot Semantic Segmentation on COCO-20i (1-shot)
1 code implementation • CVPR 2023 • Xiaoyang Wu, Xin Wen, Xihui Liu, Hengshuang Zhao
As a pioneering work, PointContrast conducts unsupervised 3D representation learning by applying contrastive learning to raw RGB-D frames, and it proves effective on various downstream tasks.
Ranked #10 on Semantic Segmentation on ScanNet (val mIoU metric, using extra training data)
no code implementations • 14 Mar 2023 • Zhening Huang, Xiaoyang Wu, Hengshuang Zhao, Lei Zhu, Shujun Wang, Georgios Hadjidemetriou, Ioannis Brilakis
For feature aggregation, it improves feature modeling by allowing the network to learn from both local points and neighboring geometry partitions, resulting in an enlarged data-tailored receptive field.
2 code implementations • CVPR 2023 • Zhisheng Zhong, Jiequan Cui, Yibo Yang, Xiaoyang Wu, Xiaojuan Qi, Xiangyu Zhang, Jiaya Jia
Based on our empirical and theoretical analysis, we point out that semantic segmentation naturally brings contextual correlation and imbalanced distribution among classes, which breaks the equiangular and maximally separated structure of neural collapse for both feature centers and classifiers.
2 code implementations • 11 Oct 2022 • Xiaoyang Wu, Yixing Lao, Li Jiang, Xihui Liu, Hengshuang Zhao
In this work, we analyze the limitations of the Point Transformer and propose our powerful and efficient Point Transformer V2 model with novel designs that overcome the limitations of previous work.
Ranked #1 on 3D Semantic Segmentation on nuScenes
no code implementations • 16 May 2018 • Xiaodan Song, Jiabao Yao, Lulu Zhou, Li Wang, Xiaoyang Wu, Di Xie, ShiLiang Pu
It aims to design a single CNN model with low redundancy to adapt to decoded frames with different qualities and ensure consistency.
Multimedia