1 code implementation • 4 Mar 2024 • Dmitry Tochilkin, David Pankratz, Zexiang Liu, Zixuan Huang, Adam Letts, Yangguang Li, Ding Liang, Christian Laforte, Varun Jampani, Yan-Pei Cao
This technical report introduces TripoSR, a 3D reconstruction model that leverages a transformer architecture for fast feed-forward 3D generation, producing a 3D mesh from a single image in under 0.5 seconds.
3D Generation • 3D Object Reconstruction From A Single Image • +2
no code implementations • 14 Dec 2023 • Zexiang Liu, Yangguang Li, Youtian Lin, Xin Yu, Sida Peng, Yan-Pei Cao, Xiaojuan Qi, Xiaoshui Huang, Ding Liang, Wanli Ouyang
Recent advances in text-to-3D generation have greatly improved the conversion of textual descriptions into imaginative 3D objects with sound geometry and fine texture.
1 code implementation • 14 Dec 2023 • Zi-Xin Zou, Zhipeng Yu, Yuan-Chen Guo, Yangguang Li, Ding Liang, Yan-Pei Cao, Song-Hai Zhang
Recent advancements in 3D reconstruction from single images have been driven by the evolution of generative models.
no code implementations • 11 Dec 2023 • Zehuan Huang, Hao Wen, Junting Dong, Yaohui Wang, Yangguang Li, Xinyuan Chen, Yan-Pei Cao, Ding Liang, Yu Qiao, Bo Dai, Lu Sheng
Generating multiview images from a single view facilitates the rapid generation of a 3D mesh conditioned on a single image.
no code implementations • 30 Oct 2023 • Xin Yu, Yuan-Chen Guo, Yangguang Li, Ding Liang, Song-Hai Zhang, Xiaojuan Qi
In this paper, we re-evaluate the role of classifier-free guidance in score distillation and discover a surprising finding: the guidance alone is enough for effective text-to-3D generation tasks.
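For context on the finding above: classifier-free guidance combines an unconditional and a text-conditional prediction with a guidance weight. The sketch below is a generic illustration of that standard combination, not code from the paper; all names are ours.

```python
import numpy as np

def classifier_free_guidance(eps_uncond, eps_cond, w):
    """Combine unconditional and conditional noise predictions.

    The guided prediction moves along the conditional direction with
    strength w; w = 0 recovers the purely unconditional model.
    """
    return eps_uncond + w * (eps_cond - eps_uncond)

# Toy example with placeholder predictions and a typical guidance scale.
eps_u = np.array([0.1, -0.2])
eps_c = np.array([0.3, 0.0])
guided = classifier_free_guidance(eps_u, eps_c, w=7.5)
```

The paper's observation is that this guidance term by itself can drive effective score distillation for text-to-3D generation.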
no code implementations • 29 Oct 2023 • Nan He, Hanyu Lai, Chenyang Zhao, Zirui Cheng, Junting Pan, Ruoyu Qin, Ruofan Lu, Rui Lu, Yunchen Zhang, Gangming Zhao, Zhaohui Hou, Zhiyuan Huang, Shaoqing Lu, Ding Liang, Mingjie Zhan
Based on TeacherLM-7.1B, we augmented 58 NLP datasets and taught various student models of different sizes from the OPT and BLOOM series in a multi-task setting.
no code implementations • 13 May 2023 • Haochen Tan, Han Wu, Wei Shao, Xinyun Zhang, Mingjie Zhan, Zhaohui Hou, Ding Liang, Linqi Song
Meetings typically involve multiple participants and lengthy conversations, resulting in redundant and trivial content.
1 code implementation • 9 May 2023 • Han Wu, Mingjie Zhan, Haochen Tan, Zhaohui Hou, Ding Liang, Linqi Song
Compared to news and chat summarization, the development of meeting summarization is greatly hindered by limited data.
no code implementations • 6 May 2023 • Ruijia Wu, Yuhang Wang, Huafeng Shi, Zhipeng Yu, Yichao Wu, Ding Liang
In this paper, we propose the Adversarial Decoupling Augmentation Framework (ADAF), addressing these issues by targeting the image-text fusion module to enhance the defensive performance of facial privacy protection algorithms.
no code implementations • ICCV 2023 • Zhipeng Yu, Jiaheng Liu, Haoyu Qin, Yichao Wu, Kun Hu, Jiayi Tian, Ding Liang
Knowledge distillation is an effective model compression method that improves the performance of a lightweight student model by transferring the knowledge of a well-performing teacher model; it has been widely adopted in many computer vision tasks, including face recognition (FR).
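As a generic illustration of the distillation setup this entry refers to (not the paper's specific method), a minimal logit-distillation loss in the style of Hinton et al. mixes a temperature-softened teacher/student KL term with the ordinary hard-label loss:

```python
import numpy as np

def softmax(z, T=1.0):
    z = z / T
    z = z - z.max(axis=1, keepdims=True)  # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def distillation_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.5):
    """Classic logit distillation: KL(teacher || student) at temperature T,
    scaled by T^2, mixed with cross-entropy on the hard labels."""
    p_t = softmax(teacher_logits, T)
    p_s = softmax(student_logits, T)
    soft = np.mean(np.sum(p_t * (np.log(p_t) - np.log(p_s)), axis=1)) * T * T
    hard = -np.mean(np.log(
        softmax(student_logits)[np.arange(len(labels)), labels]))
    return alpha * soft + (1 - alpha) * hard
```

Face recognition distillation methods typically build on this idea but transfer embedding-level rather than logit-level knowledge.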
no code implementations • 15 Sep 2022 • ChunYu Sun, Chenye Xu, Chengyuan Yao, Siyuan Liang, Yichao Wu, Ding Liang, Xianglong Liu, Aishan Liu
Adversarial training (AT) methods are effective against adversarial attacks, yet they introduce severe disparity of accuracy and robustness between different classes, known as the robust fairness problem.
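For context, adversarial training replaces clean training examples with worst-case perturbed ones. A minimal one-step (FGSM-style) sketch for a linear logistic model is shown below; it is purely illustrative and unrelated to the paper's fairness method.

```python
import numpy as np

def fgsm_adversarial_example(x, y, w, eps):
    """One-step FGSM for a linear logistic model f(x) = w.x.

    The gradient of the logistic loss log(1 + exp(-y * w.x)) w.r.t. x is
    -y * w * sigmoid(-y * w.x); FGSM moves x by eps along its sign.
    """
    margin = y * np.dot(w, x)
    grad = -y * w / (1.0 + np.exp(margin))  # sigmoid(-margin) = 1/(1+e^margin)
    return x + eps * np.sign(grad)

# Adversarial training then fits on x_adv instead of the clean x.
x = np.array([1.0, -0.5])
w = np.array([0.8, 0.3])
x_adv = fgsm_adversarial_example(x, y=1, w=w, eps=0.1)
```

The robust fairness problem arises because the accuracy/robustness cost of training on such perturbed examples is distributed unevenly across classes.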
no code implementations • 12 Sep 2022 • Yuhang Wang, Huafeng Shi, Rui Min, Ruijia Wu, Siyuan Liang, Yichao Wu, Ding Liang, Aishan Liu
Most detection methods are designed to verify whether a model is infected with presumed types of backdoor attacks, yet in practice the adversary is likely to generate diverse backdoor attacks that are unforeseen by defenders, challenging current detection strategies.
no code implementations • TIP 2022 • Peiqin Zhuang, Yu Guo, Zhipeng Yu, Luping Zhou, Lei Bai, Ding Liang, Zhiyong Wang, Yali Wang, Wanli Ouyang
To address this issue, we introduce a Motion Diversification and Selection (MoDS) module to generate diversified spatio-temporal motion features and then select the suitable motion representation dynamically for categorizing the input video.
Ranked #18 on Action Recognition on Something-Something V1
1 code implementation • 12 Jul 2022 • Gang Li, Xiang Li, Yujie Wang, Yichao Wu, Ding Liang, Shanshan Zhang
Specifically, we propose Inverse NMS Clustering (INC) and Rank Matching (RM) to instantiate the dense supervision, without relying on the widely used conventional sparse pseudo labels.
1 code implementation • 29 May 2022 • Han Wu, Haochen Tan, Mingjie Zhan, Gangming Zhao, Shaoqing Lu, Ding Liang, Linqi Song
Existing dialogue modeling methods have achieved promising performance on various dialogue tasks with the aid of Transformers and large-scale pre-trained language models.
no code implementations • 12 Apr 2022 • Jiaheng Liu, Haoyu Qin, Yichao Wu, Jinyang Guo, Ding Liang, Ke Xu
In this work, we observe that mutual relation knowledge between samples is also important for improving the discriminative ability of the student model's learned representation, and propose an effective face recognition distillation method, CoupleFace, which additionally introduces Mutual Relation Distillation (MRD) into the existing distillation framework.
1 code implementation • 30 Mar 2022 • Gang Li, Xiang Li, Yujie Wang, Yichao Wu, Ding Liang, Shanshan Zhang
Specifically, for pseudo labeling, existing works focus only on the classification score and fail to guarantee the localization precision of pseudo boxes; for consistency training, the widely adopted random-resize training considers only label-level consistency and misses feature-level consistency, which also plays an important role in ensuring scale invariance.
no code implementations • 9 Dec 2021 • Gang Li, Xiang Li, Yujie Wang, Shanshan Zhang, Yichao Wu, Ding Liang
Based on the two observations, we propose Rank Mimicking (RM) and Prediction-guided Feature Imitation (PFI) for distilling one-stage detectors, respectively.
no code implementations • 24 Nov 2021 • Yujie Wang, Junqin Huang, Mengya Gao, Yichao Wu, Zhenfei Yin, Ding Liang, Junjie Yan
Transferring to thousands of downstream tasks with little data, in a general way, is becoming a trend in the application of foundation models.
no code implementations • 16 Nov 2021 • Jing Shao, Siyu Chen, Yangguang Li, Kun Wang, Zhenfei Yin, Yinan He, Jianing Teng, Qinghong Sun, Mengya Gao, Jihao Liu, Gengshi Huang, Guanglu Song, Yichao Wu, Yuming Huang, Fenggang Liu, Huan Peng, Shuo Qin, Chengyu Wang, Yujie Wang, Conghui He, Ding Liang, Yu Liu, Fengwei Yu, Junjie Yan, Dahua Lin, Xiaogang Wang, Yu Qiao
Enormous waves of technological innovation over the past several years, marked by advances in AI, are profoundly reshaping industry and society.
8 code implementations • ICLR 2022 • Shoufa Chen, Enze Xie, Chongjian Ge, Runjian Chen, Ding Liang, Ping Luo
We build a family of models that surpass existing MLPs and even state-of-the-art Transformer-based models, e.g., Swin Transformer, while using fewer parameters and FLOPs.
Ranked #15 on Semantic Segmentation on DensePASS
16 code implementations • 25 Jun 2021 • Wenhai Wang, Enze Xie, Xiang Li, Deng-Ping Fan, Kaitao Song, Ding Liang, Tong Lu, Ping Luo, Ling Shao
We hope this work will facilitate state-of-the-art Transformer research in computer vision.
Ranked #23 on Object Detection on COCO-O
no code implementations • 10 May 2021 • Zilong Wang, Mingjie Zhan, Houxing Ren, Zhaohui Hou, Yuwei Wu, Xingyan Zhang, Ding Liang
Forms are a common type of document in real life and carry rich information through textual contents and the organizational structure.
1 code implementation • 2 May 2021 • Wenhai Wang, Enze Xie, Xiang Li, Xuebo Liu, Ding Liang, Zhibo Yang, Tong Lu, Chunhua Shen
By systematically comparing with existing scene text representations, we show that our kernel representation can not only describe arbitrarily-shaped text but also well distinguish adjacent text.
no code implementations • 2 Mar 2021 • Jiaheng Liu, Yudong Wu, Yichao Wu, Zhenmao Li, Chen Ken, Ding Liang, Junjie Yan
In this study, we make a key observation that the local context, represented by the similarities between the instance and its inter-class neighbors, plays an important role for face recognition (FR).
9 code implementations • ICCV 2021 • Wenhai Wang, Enze Xie, Xiang Li, Deng-Ping Fan, Kaitao Song, Ding Liang, Tong Lu, Ping Luo, Ling Shao
Unlike the recently proposed Transformer model (e.g., ViT), which is specially designed for image classification, we propose the Pyramid Vision Transformer (PVT), which overcomes the difficulties of porting Transformers to various dense prediction tasks.
Ranked #5 on Semantic Segmentation on SynPASS
2 code implementations • 21 Jan 2021 • Enze Xie, Wenjia Wang, Wenhai Wang, Peize Sun, Hang Xu, Ding Liang, Ping Luo
This work presents a new fine-grained transparent object segmentation dataset, termed Trans10K-v2, extending Trans10K-v1, the first large-scale transparent object segmentation dataset.
Ranked #3 on Semantic Segmentation on Trans10K
no code implementations • ICCV 2021 • Jiaheng Liu, Yudong Wu, Yichao Wu, Chuming Li, Xiaolin Hu, Ding Liang, Mengyu Wang
To estimate the LID of each face image in the verification process, we propose two types of LID Estimation (LIDE) methods, which are reference-based and learning-based estimation methods, respectively.
no code implementations • Findings of the Association for Computational Linguistics 2020 • Zilong Wang, Mingjie Zhan, Xuebo Liu, Ding Liang
The table detection and handcrafted features used in previous works cannot apply to all forms because of their format requirements.
2 code implementations • ECCV 2020 • Wenhai Wang, Xuebo Liu, Xiaozhong Ji, Enze Xie, Ding Liang, Zhibo Yang, Tong Lu, Chunhua Shen, Ping Luo
Unlike previous works that merely employed visual features for text detection, this work proposes a novel text spotter, named Ambiguity Eliminating Text Spotter (AE TextSpotter), which learns both visual and linguistic features to significantly reduce ambiguity in text detection.
4 code implementations • ECCV 2020 • Wenjia Wang, Enze Xie, Xuebo Liu, Wenhai Wang, Ding Liang, Chunhua Shen, Xiang Bai
For example, it outperforms LapSRN by over 5% and 8% on the recognition accuracy of ASTER and CRNN, respectively.
2 code implementations • CVPR 2020 • Enze Xie, Peize Sun, Xiaoge Song, Wenhai Wang, Ding Liang, Chunhua Shen, Ping Luo
In this paper, we introduce an anchor-box-free, single-shot instance segmentation method that is conceptually simple and fully convolutional, and that can be used as a mask prediction module for instance segmentation by easily embedding it into most off-the-shelf detection methods.
Ranked #100 on Instance Segmentation on COCO test-dev
1 code implementation • ICCV 2019 • Xiao Jin, Baoyun Peng, Yi-Chao Wu, Yu Liu, Jiaheng Liu, Ding Liang, Xiaolin Hu
However, we find that the representation of a converged heavy model is still a strong constraint for training a small student model, which leads to a high lower bound of congruence loss.
1 code implementation • 28 Mar 2019 • Jingchao Liu, Xuebo Liu, Jie Sheng, Ding Liang, Xin Li, Qingjie Liu
Scene text detection, an essential step of scene text recognition system, is to locate text instances in natural scene images automatically.
Ranked #1 on Scene Text Detection on ICDAR 2017 MLT
no code implementations • 28 Feb 2019 • Yingcheng Su, Shunfeng Zhou, Yi-Chao Wu, Tian Su, Ding Liang, Jiaheng Liu, Dixin Zheng, Yingxu Wang, Junjie Yan, Xiaolin Hu
Although deeper and larger neural networks have achieved better performance, the complex network structure and increasing computational cost cannot meet the demands of many resource-constrained applications.
7 code implementations • CVPR 2018 • Xuebo Liu, Ding Liang, Shi Yan, Dagui Chen, Yu Qiao, Junjie Yan
Incidental scene text spotting is considered one of the most difficult and valuable challenges in the document analysis community.
Ranked #4 on Scene Text Detection on ICDAR 2017 MLT
10 code implementations • 3 Feb 2015 • Yi Sun, Ding Liang, Xiaogang Wang, Xiaoou Tang
Very deep neural networks recently achieved great success on general object recognition because of their superb learning capacity.