1 code implementation • 17 Jul 2023 • Shiye Lei, Hao Chen, Sen Zhang, Bo Zhao, DaCheng Tao
With the rapid development of Artificial Intelligence Generated Content (AIGC), it has become common practice in many learning tasks to train or fine-tune large models on synthetic data because of data scarcity and privacy-leakage concerns.
1 code implementation • 13 Jan 2023 • Shiye Lei, DaCheng Tao
Dataset distillation, a dataset reduction method, addresses this problem by synthesizing a small typical dataset from substantial data and has attracted much attention from the deep learning community.
no code implementations • 3 Jun 2022 • Shiye Lei, Fengxiang He, Yancheng Yuan, DaCheng Tao
On the theoretical side, two lower bounds based on algorithm decision boundary (DB) variability are proposed; neither depends explicitly on the sample size.
no code implementations • 12 Dec 2021 • Shiye Lei, Zhuozhuo Tu, Leszek Rutkowski, Feng Zhou, Li Shen, Fengxiang He, DaCheng Tao
Bayesian neural networks (BNNs) have become a principal approach to alleviating overconfident predictions in deep learning, but they often scale poorly because of their large number of distribution parameters.
no code implementations • 7 Dec 2021 • Haowen Chen, Fengxiang He, Shiye Lei, DaCheng Tao
The bound scales with the spectral complexity, the dominant factor of which is the spectral norm product of weight matrices.
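As a rough illustration of the quantity named above, the following sketch computes the product of the spectral norms (largest singular values) of a list of weight matrices. It covers only that dominant factor, not the full bound, whose remaining terms are not given in this excerpt. The function names `spectral_norm` and `spectral_norm_product`, the power-iteration estimator, and the example matrices are assumptions made for illustration, not taken from the paper.

```python
import math


def spectral_norm(matrix, iterations=200):
    """Estimate the largest singular value of `matrix` (a list of rows).

    Uses power iteration on W^T W. The all-ones starting vector is an
    assumption; it fails only if it is orthogonal to the top singular vector.
    """
    rows, cols = len(matrix), len(matrix[0])
    v = [1.0 / math.sqrt(cols)] * cols  # unit-length starting vector
    sigma = 0.0
    for _ in range(iterations):
        # u = W v
        u = [sum(matrix[i][j] * v[j] for j in range(cols)) for i in range(rows)]
        # w = W^T u = (W^T W) v
        w = [sum(matrix[i][j] * u[i] for i in range(rows)) for j in range(cols)]
        norm_w = math.sqrt(sum(x * x for x in w))
        if norm_w == 0.0:
            return 0.0
        v = [x / norm_w for x in w]
        # For unit v, ||(W^T W) v|| converges to sigma_max squared.
        sigma = math.sqrt(norm_w)
    return sigma


def spectral_norm_product(weight_matrices):
    """Multiply the spectral norms of all layers' weight matrices."""
    return math.prod(spectral_norm(w) for w in weight_matrices)


# Hypothetical two-layer example: spectral norms are 3 and 2.
layers = [
    [[3.0, 0.0], [0.0, 1.0]],
    [[0.0, 2.0], [2.0, 0.0]],
]
product = spectral_norm_product(layers)
```

Because the product multiplies one factor per layer, it can grow or shrink exponentially with depth, which is one reason such a term tends to dominate the complexity measure.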
no code implementations • 29 Sep 2021 • Shiye Lei, Fengxiang He, Yancheng Yuan, DaCheng Tao
Two new notions, algorithm DB variability and $(\epsilon, \eta)$-data DB variability, are proposed to measure the decision boundary variability from the algorithm and data perspectives.
1 code implementation • 14 Jan 2021 • Fengxiang He, Shiye Lei, Jianmin Ji, DaCheng Tao
We then define an activation hash phase chart to represent the space spanned by model size, training time, training sample size, and the encoding properties, which is divided into three canonical regions: the under-expressive regime, the critically-expressive regime, and the sufficiently-expressive regime.
no code implementations • 15 Jun 2019 • Tian Wang, Shiye Lei, Youyou Jiang, Choi Chang, Hichem Snoussi, Guangcun Shan
Compared with the traditional Parameter Server architecture, our parallel architecture is found to be more efficient on the temporal action detection task with multiple GPUs, making it suitable for temporal action proposal generation, especially on large datasets of millions of videos.