2 code implementations • 9 Apr 2024 • Junkai Yan, Yipeng Gao, Qize Yang, Xihan Wei, Xuansong Xie, AnCong Wu, Wei-Shi Zheng
Text-to-3D generation, which synthesizes 3D assets according to an overall text description, has significantly progressed.
1 code implementation • 27 Jul 2022 • Wangmeng Xiang, Chao Li, Biao Wang, Xihan Wei, Xian-Sheng Hua, Lei Zhang
For 3D video-based tasks such as action recognition, however, directly applying spatiotemporal transformers to video data incurs heavy computation and memory costs, owing to the greatly increased number of patches and the quadratic complexity of self-attention.
Ranked #9 on Action Recognition on Diving-48
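To make the quadratic-cost argument in the snippet above concrete, here is a rough back-of-the-envelope sketch. It assumes one token per non-overlapping square patch per frame and counts only pairwise attention scores. The function names, the 224x224 resolution, the patch size of 16, and the 8-frame clip are illustrative assumptions, not values taken from the paper.

```python
def num_tokens(frames: int, height: int, width: int, patch: int) -> int:
    """Token count, assuming one token per non-overlapping patch per frame."""
    return frames * (height // patch) * (width // patch)


def attention_pairs(tokens: int) -> int:
    """Number of query-key pairs scored by full self-attention (quadratic)."""
    return tokens * tokens


# Hypothetical settings chosen only for illustration.
image_tokens = num_tokens(frames=1, height=224, width=224, patch=16)
video_tokens = num_tokens(frames=8, height=224, width=224, patch=16)

image_pairs = attention_pairs(image_tokens)
video_pairs = attention_pairs(video_tokens)

# 8x more tokens gives 64x more attention pairs.
print(image_tokens, video_tokens, video_pairs // image_pairs)
```

Under these assumptions, going from a single frame to an 8-frame clip multiplies the token count by 8 but the attention cost by 64, which is the burden the snippet refers to.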
1 code implementation • 15 Jun 2022 • Yuxuan Zhou, Wangmeng Xiang, Chao Li, Biao Wang, Xihan Wei, Lei Zhang, Margret Keuper, Xiansheng Hua
Unlike convolutional inductive biases, which are forced to focus exclusively on hard-coded local regions, our proposed SPs are learned by the model itself and take a variety of spatial relations into account.
Ranked #154 on Image Classification on oi
no code implementations • CVPR 2021 • Qize Yang, Xihan Wei, Biao Wang, Xian-Sheng Hua, Lei Zhang
Specifically, to alleviate the instability among the detection results in different iterations, we propose using non-maximum suppression to fuse the detection results from different iterations.
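The fusion step described above can be sketched as follows. This is a minimal illustration of standard greedy non-maximum suppression applied to detections pooled from several iterations, not the authors' implementation. The box format `(x1, y1, x2, y2)`, the `(box, score)` tuples, the function names, and the IoU threshold of 0.5 are all assumptions made for the example.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def fuse_with_nms(detections_per_iteration, iou_threshold=0.5):
    """Pool (box, score) detections from all iterations, then apply greedy NMS."""
    pooled = [d for dets in detections_per_iteration for d in dets]
    pooled.sort(key=lambda d: d[1], reverse=True)  # highest score first
    kept = []
    for box, score in pooled:
        # Keep a box only if it does not overlap too much with a kept one.
        if all(iou(box, k[0]) <= iou_threshold for k in kept):
            kept.append((box, score))
    return kept


# Hypothetical detections from two iterations.
iteration_1 = [((0, 0, 10, 10), 0.9)]
iteration_2 = [((1, 1, 11, 11), 0.8), ((50, 50, 60, 60), 0.7)]
fused = fuse_with_nms([iteration_1, iteration_2])
print(fused)
```

Here the two heavily overlapping boxes collapse into the higher-scoring one, while the box that appears in only one iteration is kept.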
no code implementations • 23 Jan 2020 • Canyu Le, Zhonggui Chen, Xihan Wei, Biao Wang, Lei Zhang
The goal of few-shot learning is to learn a model that can recognize novel classes from one or a few training examples.
no code implementations • 27 Aug 2019 • Canyu Le, Xihan Wei, Biao Wang, Lei Zhang, Zhonggui Chen
To overcome these two limitations, a deep learning model should be able not only to learn from a small amount of data, but also to incrementally learn new concepts from a data stream over time without forgetting previous knowledge.