1 code implementation • 4 Mar 2024 • Dmitry Tochilkin, David Pankratz, Zexiang Liu, Zixuan Huang, Adam Letts, Yangguang Li, Ding Liang, Christian Laforte, Varun Jampani, Yan-Pei Cao
This technical report introduces TripoSR, a 3D reconstruction model that leverages a transformer architecture for fast feed-forward 3D generation, producing a 3D mesh from a single image in under 0.5 seconds.
3D Generation • 3D Object Reconstruction From A Single Image • +2
no code implementations • 14 Dec 2023 • Zexiang Liu, Yangguang Li, Youtian Lin, Xin Yu, Sida Peng, Yan-Pei Cao, Xiaojuan Qi, Xiaoshui Huang, Ding Liang, Wanli Ouyang
Recent advances in text-to-3D generation have greatly improved the conversion of textual descriptions into imaginative 3D objects with sound geometry and fine texture.
1 code implementation • 14 Dec 2023 • Zi-Xin Zou, Zhipeng Yu, Yuan-Chen Guo, Yangguang Li, Ding Liang, Yan-Pei Cao, Song-Hai Zhang
Recent advancements in 3D reconstruction from single images have been driven by the evolution of generative models.
no code implementations • 11 Dec 2023 • Zehuan Huang, Hao Wen, Junting Dong, Yaohui Wang, Yangguang Li, Xinyuan Chen, Yan-Pei Cao, Ding Liang, Yu Qiao, Bo Dai, Lu Sheng
Generating multiview images from a single view facilitates the rapid generation of a 3D mesh conditioned on a single image.
no code implementations • 30 Oct 2023 • Xin Yu, Yuan-Chen Guo, Yangguang Li, Ding Liang, Song-Hai Zhang, Xiaojuan Qi
In this paper, we re-evaluate the role of classifier-free guidance in score distillation and discover a surprising finding: the guidance alone is enough for effective text-to-3D generation tasks.
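For context on the finding above: classifier-free guidance combines an unconditional and a text-conditional prediction with a guidance weight. The sketch below is a generic illustration of that standard combination, not code from the paper; all names are ours.

```python
import numpy as np

def classifier_free_guidance(eps_uncond, eps_cond, w):
    """Combine unconditional and conditional noise predictions.

    The guided prediction moves along the conditional direction with
    strength w; w = 0 recovers the purely unconditional model.
    """
    return eps_uncond + w * (eps_cond - eps_uncond)

# Toy example with placeholder predictions and a typical guidance scale.
eps_u = np.array([0.1, -0.2])
eps_c = np.array([0.3, 0.0])
guided = classifier_free_guidance(eps_u, eps_c, w=7.5)
```

The paper's observation is that this guidance term by itself can drive effective score distillation for text-to-3D generation.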
no code implementations • 29 Oct 2023 • Nan He, Hanyu Lai, Chenyang Zhao, Zirui Cheng, Junting Pan, Ruoyu Qin, Ruofan Lu, Rui Lu, Yunchen Zhang, Gangming Zhao, Zhaohui Hou, Zhiyuan Huang, Shaoqing Lu, Ding Liang, Mingjie Zhan
Based on TeacherLM-7.1B, we augmented 58 NLP datasets and taught various student models of different sizes from the OPT and BLOOM series in a multi-task setting.
no code implementations • 13 May 2023 • Haochen Tan, Han Wu, Wei Shao, Xinyun Zhang, Mingjie Zhan, Zhaohui Hou, Ding Liang, Linqi Song
Meetings typically involve multiple participants and lengthy conversations, resulting in redundant and trivial content.
1 code implementation • 9 May 2023 • Han Wu, Mingjie Zhan, Haochen Tan, Zhaohui Hou, Ding Liang, Linqi Song
Compared to news and chat summarization, the development of meeting summarization is greatly hindered by limited data.
no code implementations • 6 May 2023 • Ruijia Wu, Yuhang Wang, Huafeng Shi, Zhipeng Yu, Yichao Wu, Ding Liang
In this paper, we propose the Adversarial Decoupling Augmentation Framework (ADAF), addressing these issues by targeting the image-text fusion module to enhance the defensive performance of facial privacy protection algorithms.
no code implementations • ICCV 2023 • Zhipeng Yu, Jiaheng Liu, Haoyu Qin, Yichao Wu, Kun Hu, Jiayi Tian, Ding Liang
Knowledge distillation is an effective model compression method that improves the performance of a lightweight student model by transferring the knowledge of a well-performing teacher model; it has been widely adopted in many computer vision tasks, including face recognition (FR).
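As a generic illustration of the distillation setup this entry refers to (not the paper's specific method), a minimal logit-distillation loss in the style of Hinton et al. mixes a temperature-softened teacher/student KL term with the ordinary hard-label loss:

```python
import numpy as np

def softmax(z, T=1.0):
    z = z / T
    z = z - z.max(axis=1, keepdims=True)  # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def distillation_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.5):
    """Classic logit distillation: KL(teacher || student) at temperature T,
    scaled by T^2, mixed with cross-entropy on the hard labels."""
    p_t = softmax(teacher_logits, T)
    p_s = softmax(student_logits, T)
    soft = np.mean(np.sum(p_t * (np.log(p_t) - np.log(p_s)), axis=1)) * T * T
    hard = -np.mean(np.log(
        softmax(student_logits)[np.arange(len(labels)), labels]))
    return alpha * soft + (1 - alpha) * hard
```

Face recognition distillation methods typically build on this idea but transfer embedding-level rather than logit-level knowledge.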
no code implementations • 15 Sep 2022 • ChunYu Sun, Chenye Xu, Chengyuan Yao, Siyuan Liang, Yichao Wu, Ding Liang, Xianglong Liu, Aishan Liu
Adversarial training (AT) methods are effective against adversarial attacks, yet they introduce severe disparity of accuracy and robustness between different classes, known as the robust fairness problem.
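For context, adversarial training replaces clean training examples with worst-case perturbed ones. A minimal one-step (FGSM-style) sketch for a linear logistic model is shown below; it is purely illustrative and unrelated to the paper's fairness method.

```python
import numpy as np

def fgsm_adversarial_example(x, y, w, eps):
    """One-step FGSM for a linear logistic model f(x) = w.x.

    The gradient of the logistic loss log(1 + exp(-y * w.x)) w.r.t. x is
    -y * w * sigmoid(-y * w.x); FGSM moves x by eps along its sign.
    """
    margin = y * np.dot(w, x)
    grad = -y * w / (1.0 + np.exp(margin))  # sigmoid(-margin) = 1/(1+e^margin)
    return x + eps * np.sign(grad)

# Adversarial training then fits on x_adv instead of the clean x.
x = np.array([1.0, -0.5])
w = np.array([0.8, 0.3])
x_adv = fgsm_adversarial_example(x, y=1, w=w, eps=0.1)
```

The robust fairness problem arises because the accuracy/robustness cost of training on such perturbed examples is distributed unevenly across classes.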
no code implementations • 12 Sep 2022 • Yuhang Wang, Huafeng Shi, Rui Min, Ruijia Wu, Siyuan Liang, Yichao Wu, Ding Liang, Aishan Liu
Most detection methods are designed to verify whether a model is infected with presumed types of backdoor attacks, yet in practice the adversary is likely to generate diverse backdoor attacks that are unforeseen by defenders, challenging current detection strategies.
no code implementations • TIP 2022 • Peiqin Zhuang, Yu Guo, Zhipeng Yu, Luping Zhou, Lei Bai, Ding Liang, Zhiyong Wang, Yali Wang, Wanli Ouyang
To address this issue, we introduce a Motion Diversification and Selection (MoDS) module to generate diversified spatio-temporal motion features and then select the suitable motion representation dynamically for categorizing the input video.
Ranked #18 on Action Recognition on Something-Something V1
1 code implementation • 12 Jul 2022 • Gang Li, Xiang Li, Yujie Wang, Yichao Wu, Ding Liang, Shanshan Zhang
Specifically, we propose Inverse NMS Clustering (INC) and Rank Matching (RM) to instantiate the dense supervision, without relying on the widely used conventional sparse pseudo labels.
1 code implementation • 29 May 2022 • Han Wu, Haochen Tan, Mingjie Zhan, Gangming Zhao, Shaoqing Lu, Ding Liang, Linqi Song
Existing dialogue modeling methods have achieved promising performance on various dialogue tasks with the aid of Transformers and large-scale pre-trained language models.
no code implementations • 12 Apr 2022 • Jiaheng Liu, Haoyu Qin, Yichao Wu, Jinyang Guo, Ding Liang, Ke Xu
In this work, we observe that mutual relation knowledge between samples is also important for improving the discriminative ability of the student model's learned representation, and propose an effective face recognition distillation method, CoupleFace, which additionally introduces Mutual Relation Distillation (MRD) into the existing distillation framework.
1 code implementation • 30 Mar 2022 • Gang Li, Xiang Li, Yujie Wang, Yichao Wu, Ding Liang, Shanshan Zhang
Specifically, for pseudo labeling, existing works focus only on the classification score and fail to guarantee the localization precision of pseudo boxes; for consistency training, the widely adopted random-resize training considers only label-level consistency and misses feature-level consistency, which also plays an important role in ensuring scale invariance.
no code implementations • 9 Dec 2021 • Gang Li, Xiang Li, Yujie Wang, Shanshan Zhang, Yichao Wu, Ding Liang
Based on the two observations, we propose Rank Mimicking (RM) and Prediction-guided Feature Imitation (PFI) for distilling one-stage detectors, respectively.
no code implementations • 24 Nov 2021 • Yujie Wang, Junqin Huang, Mengya Gao, Yichao Wu, Zhenfei Yin, Ding Liang, Junjie Yan
Transferring to thousands of downstream tasks with little data, in a general way, is becoming a trend in the application of foundation models.
no code implementations • 16 Nov 2021 • Jing Shao, Siyu Chen, Yangguang Li, Kun Wang, Zhenfei Yin, Yinan He, Jianing Teng, Qinghong Sun, Mengya Gao, Jihao Liu, Gengshi Huang, Guanglu Song, Yichao Wu, Yuming Huang, Fenggang Liu, Huan Peng, Shuo Qin, Chengyu Wang, Yujie Wang, Conghui He, Ding Liang, Yu Liu, Fengwei Yu, Junjie Yan, Dahua Lin, Xiaogang Wang, Yu Qiao
Enormous waves of technological innovation over the past several years, marked by advances in AI, are profoundly reshaping industry and society.
8 code implementations • ICLR 2022 • Shoufa Chen, Enze Xie, Chongjian Ge, Runjian Chen, Ding Liang, Ping Luo
We build a family of models that surpass existing MLPs and even state-of-the-art Transformer-based models, e.g., Swin Transformer, while using fewer parameters and FLOPs.
Ranked #15 on Semantic Segmentation on DensePASS
16 code implementations • 25 Jun 2021 • Wenhai Wang, Enze Xie, Xiang Li, Deng-Ping Fan, Kaitao Song, Ding Liang, Tong Lu, Ping Luo, Ling Shao
We hope this work will facilitate state-of-the-art Transformer research in computer vision.
Ranked #23 on Object Detection on COCO-O
no code implementations • 10 May 2021 • Zilong Wang, Mingjie Zhan, Houxing Ren, Zhaohui Hou, Yuwei Wu, Xingyan Zhang, Ding Liang
Forms are a common type of document in real life and carry rich information through textual contents and the organizational structure.
1 code implementation • 2 May 2021 • Wenhai Wang, Enze Xie, Xiang Li, Xuebo Liu, Ding Liang, Zhibo Yang, Tong Lu, Chunhua Shen
By systematically comparing with existing scene text representations, we show that our kernel representation can not only describe arbitrarily-shaped text but also well distinguish adjacent text.
no code implementations • 2 Mar 2021 • Jiaheng Liu, Yudong Wu, Yichao Wu, Zhenmao Li, Chen Ken, Ding Liang, Junjie Yan
In this study, we make a key observation that the local context, represented by the similarities between the instance and its inter-class neighbors, plays an important role for face recognition (FR).
9 code implementations • ICCV 2021 • Wenhai Wang, Enze Xie, Xiang Li, Deng-Ping Fan, Kaitao Song, Ding Liang, Tong Lu, Ping Luo, Ling Shao
Unlike the recently proposed Transformer model (e.g., ViT), which is specially designed for image classification, we propose the Pyramid Vision Transformer (PVT), which overcomes the difficulties of porting Transformers to various dense prediction tasks.
Ranked #5 on Semantic Segmentation on SynPASS
2 code implementations • 21 Jan 2021 • Enze Xie, Wenjia Wang, Wenhai Wang, Peize Sun, Hang Xu, Ding Liang, Ping Luo
This work presents a new fine-grained transparent object segmentation dataset, termed Trans10K-v2, extending Trans10K-v1, the first large-scale transparent object segmentation dataset.
Ranked #3 on Semantic Segmentation on Trans10K
no code implementations • ICCV 2021 • Jiaheng Liu, Yudong Wu, Yichao Wu, Chuming Li, Xiaolin Hu, Ding Liang, Mengyu Wang
To estimate the LID of each face image in the verification process, we propose two types of LID Estimation (LIDE) methods, which are reference-based and learning-based estimation methods, respectively.
no code implementations • Findings of the Association for Computational Linguistics 2020 • Zilong Wang, Mingjie Zhan, Xuebo Liu, Ding Liang
The table detection and handcrafted features used in previous works cannot apply to all forms because of their format requirements.
2 code implementations • ECCV 2020 • Wenhai Wang, Xuebo Liu, Xiaozhong Ji, Enze Xie, Ding Liang, Zhibo Yang, Tong Lu, Chunhua Shen, Ping Luo
Unlike previous works that merely employed visual features for text detection, this work proposes a novel text spotter, named Ambiguity Eliminating Text Spotter (AE TextSpotter), which learns both visual and linguistic features to significantly reduce ambiguity in text detection.
4 code implementations • ECCV 2020 • Wenjia Wang, Enze Xie, Xuebo Liu, Wenhai Wang, Ding Liang, Chunhua Shen, Xiang Bai
For example, it outperforms LapSRN by over 5% and 8% on the recognition accuracy of ASTER and CRNN, respectively.
2 code implementations • CVPR 2020 • Enze Xie, Peize Sun, Xiaoge Song, Wenhai Wang, Ding Liang, Chunhua Shen, Ping Luo
In this paper, we introduce an anchor-box-free, single-shot instance segmentation method that is conceptually simple and fully convolutional, and that can be used as a mask prediction module for instance segmentation by easily embedding it into most off-the-shelf detection methods.
Ranked #100 on Instance Segmentation on COCO test-dev
1 code implementation • ICCV 2019 • Xiao Jin, Baoyun Peng, Yi-Chao Wu, Yu Liu, Jiaheng Liu, Ding Liang, Xiaolin Hu
However, we find that the representation of a converged heavy model is still a strong constraint for training a small student model, which leads to a high lower bound of congruence loss.
1 code implementation • 28 Mar 2019 • Jingchao Liu, Xuebo Liu, Jie Sheng, Ding Liang, Xin Li, Qingjie Liu
Scene text detection, an essential step of scene text recognition system, is to locate text instances in natural scene images automatically.
Ranked #1 on Scene Text Detection on ICDAR 2017 MLT
no code implementations • 28 Feb 2019 • Yingcheng Su, Shunfeng Zhou, Yi-Chao Wu, Tian Su, Ding Liang, Jiaheng Liu, Dixin Zheng, Yingxu Wang, Junjie Yan, Xiaolin Hu
Although deeper and larger neural networks have achieved better performance, the complex network structure and increasing computational cost cannot meet the demands of many resource-constrained applications.
7 code implementations • CVPR 2018 • Xuebo Liu, Ding Liang, Shi Yan, Dagui Chen, Yu Qiao, Junjie Yan
Incidental scene text spotting is considered one of the most difficult and valuable challenges in the document analysis community.
Ranked #4 on Scene Text Detection on ICDAR 2017 MLT
10 code implementations • 3 Feb 2015 • Yi Sun, Ding Liang, Xiaogang Wang, Xiaoou Tang
Very deep neural networks recently achieved great success on general object recognition because of their superb learning capacity.