1 code implementation • 11 Aug 2022 • Chuanguang Yang, Zhulin An, Helong Zhou, Linhang Cai, Xiang Zhi, Jiwen Wu, Yongjun Xu, Qian Zhang
MixSKD mutually distills feature maps and probability distributions between random pairs of original images and their mixup counterparts in a meaningful way.
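A minimal PyTorch sketch of the mixup-based mutual distillation idea (my illustration, not the authors' code; the `model` handle, Beta(1, 1) mixing, and temperature are assumptions):

```python
import torch
import torch.nn.functional as F

def mixskd_style_loss(model, x, temperature=4.0):
    """Sketch: distill between predictions on a random image pair and on
    their mixup image. One distillation direction shown for brevity."""
    lam = torch.distributions.Beta(1.0, 1.0).sample().item()
    idx = torch.randperm(x.size(0))          # random pairing within the batch
    x_mix = lam * x + (1.0 - lam) * x[idx]   # mixup image

    logits = model(x)                        # predictions on original images
    logits_mix = model(x_mix)                # predictions on mixup images

    # Target: the same mixup applied in probability space.
    p_pair = lam * F.softmax(logits / temperature, dim=1) \
             + (1.0 - lam) * F.softmax(logits[idx] / temperature, dim=1)
    log_p_mix = F.log_softmax(logits_mix / temperature, dim=1)

    # Align the mixup prediction with the mixed pair of original predictions.
    return F.kl_div(log_p_mix, p_pair.detach(),
                    reduction='batchmean') * temperature ** 2
```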
1 code implementation • AAAI 2022 • Linhang Cai, Zhulin An, Chuanguang Yang, Yangchun Yan, Yongjun Xu
In detail, the proposed PGMPF selectively suppresses the gradients of those "unimportant" parameters via a prior gradient mask generated by the pruning criterion during fine-tuning.
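A hedged PyTorch sketch of the prior-gradient-mask idea (the L2-norm criterion, `keep_ratio`, and helper names are assumptions, not the paper's exact choices):

```python
import torch
import torch.nn as nn

def prior_gradient_mask(weight: torch.Tensor, keep_ratio: float = 0.7) -> torch.Tensor:
    """Rank filters by L2 norm (one common pruning criterion) and keep
    the top `keep_ratio` fraction; everything here is illustrative."""
    scores = weight.flatten(1).norm(p=2, dim=1)   # one score per filter
    k = max(1, int(keep_ratio * scores.numel()))
    keep_idx = scores.topk(k).indices
    mask = torch.zeros_like(weight)
    mask[keep_idx] = 1.0
    return mask

def masked_finetune_step(model, masks, loss, optimizer):
    """One fine-tuning step in which gradients of 'unimportant' filters
    are suppressed by the precomputed prior mask."""
    optimizer.zero_grad()
    loss.backward()
    for name, module in model.named_modules():
        if isinstance(module, nn.Conv2d) and name in masks:
            module.weight.grad.mul_(masks[name])  # zero masked filters' grads
    optimizer.step()
```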
1 code implementation • 7 Sep 2021 • Chuanguang Yang, Zhulin An, Linhang Cai, Yongjun Xu
Each auxiliary branch is guided to learn the self-supervision augmented task and to distill this distribution from teacher to student.
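A rough PyTorch sketch of one auxiliary branch's objective (argument names, the temperature `T`, and the weight `alpha` are assumptions):

```python
import torch.nn.functional as F

def aux_branch_loss(student_aux_logits, teacher_aux_logits,
                    joint_labels, T=4.0, alpha=1.0):
    """Sketch: learn the self-supervision augmented task from ground
    truth and distill the teacher's augmented distribution."""
    # Supervised loss on the joint (class x transformation) label space.
    ce = F.cross_entropy(student_aux_logits, joint_labels)
    # KL distillation of the teacher's soft augmented distribution.
    kd = F.kl_div(F.log_softmax(student_aux_logits / T, dim=1),
                  F.softmax(teacher_aux_logits / T, dim=1),
                  reduction='batchmean') * T ** 2
    return ce + alpha * kd
```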
1 code implementation • 29 Jul 2021 • Chuanguang Yang, Zhulin An, Linhang Cai, Yongjun Xu
We therefore adopt an alternative self-supervised augmented task to guide the network to learn the joint distribution of the original recognition task and the self-supervised auxiliary task (sketched after this entry).
Ranked #20 on Knowledge Distillation on ImageNet
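A minimal sketch of how such a joint class-and-transformation label space can be built with rotations (my construction in PyTorch; the paper's transformation set may differ):

```python
import torch

def joint_rotation_batch(x, y):
    """Rotate each image by {0, 90, 180, 270} degrees and index it in
    the joint label space: joint_label = class * 4 + rotation."""
    xs, ys = [], []
    for r in range(4):
        xs.append(torch.rot90(x, k=r, dims=(2, 3)))  # rotate the HxW plane
        ys.append(y * 4 + r)                          # joint label index
    # Returns a (4B, C, H, W) batch with labels in [0, 4 * num_classes).
    return torch.cat(xs), torch.cat(ys)
```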
1 code implementation • 26 Apr 2021 • Chuanguang Yang, Zhulin An, Linhang Cai, Yongjun Xu
We present a collaborative learning method called Mutual Contrastive Learning (MCL) for general visual representation learning.
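An illustrative PyTorch sketch of a cross-network contrastive term in the spirit of MCL (a simplified symmetric InfoNCE, not the paper's exact formulation; `tau` is an assumption):

```python
import torch
import torch.nn.functional as F

def mutual_infonce(z1, z2, tau=0.07):
    """Embeddings of the same image from two networks form positives;
    other images in the batch serve as negatives."""
    z1 = F.normalize(z1, dim=1)
    z2 = F.normalize(z2, dim=1)
    logits = z1 @ z2.t() / tau                 # cross-network similarities
    labels = torch.arange(z1.size(0), device=z1.device)
    # Symmetric: each network serves as anchor for the other.
    return 0.5 * (F.cross_entropy(logits, labels) +
                  F.cross_entropy(logits.t(), labels))
```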
no code implementations • 6 Nov 2020 • Linhang Cai, Zhulin An, Yongjun Xu
Filter pruning is widely used to reduce the computational cost of deep learning models, enabling the deployment of Deep Neural Networks (DNNs) on resource-limited devices.
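For illustration, a generic soft filter-pruning sketch in PyTorch (the L1-norm criterion and `prune_ratio` are assumptions; this is not necessarily this paper's method):

```python
import torch
import torch.nn as nn

def soft_prune_conv(conv: nn.Conv2d, prune_ratio: float = 0.3) -> None:
    """Zero the filters with the smallest L1 norms in place; because the
    weights remain in the graph, they can still be updated and recover."""
    with torch.no_grad():
        scores = conv.weight.abs().flatten(1).sum(dim=1)    # L1 norm per filter
        n_prune = int(prune_ratio * scores.numel())
        if n_prune > 0:
            idx = scores.topk(n_prune, largest=False).indices
            conv.weight[idx] = 0.0                          # soften, don't remove
```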
no code implementations • 19 Oct 2020 • Linhang Cai, Zhulin An, Chuanguang Yang, Yongjun Xu
Network pruning is widely used to compress Deep Neural Networks (DNNs).