Multi-Prize Lottery Ticket Hypothesis: Finding Generalizable and Efficient Binary Subnetworks in a Randomly Weighted Neural Network

ICLR 2021 · James Diffenderfer, Bhavya Kailkhura

Recently, \cite{frankle2018lottery} demonstrated that randomly-initialized dense networks contain subnetworks that, once found, can be trained to reach test accuracy comparable to that of the trained dense network. However, finding these high-performing trainable subnetworks is expensive, requiring an iterative process of training and pruning weights. In this paper, we propose (and prove) a stronger \emph{Multi-Prize Lottery Ticket Hypothesis}: a sufficiently over-parameterized neural network with random weights contains several subnetworks (winning tickets) that (a) have comparable accuracy to a dense target network with learned weights (prize 1), (b) do not require any further training to achieve prize 1 (prize 2), and (c) are robust to extreme forms of quantization (i.e., binary weights and/or activations) (prize 3). These multi-prize tickets enjoy a number of desirable properties, including drastically reduced memory size, faster test-time inference, and lower power consumption compared to their dense and full-precision counterparts. Furthermore, we propose an algorithm for finding multi-prize tickets and test it in a series of experiments on the CIFAR-10 and ImageNet datasets. Empirical results indicate that as models grow deeper and wider, untrained multi-prize tickets begin to reach test accuracy similar to (and sometimes even higher than) their significantly larger and full-precision counterparts that have been weight-trained. With minimal hyperparameter tuning, our binary-weight multi-prize tickets outperform the current state-of-the-art in binary neural networks.
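
To make the central idea concrete, below is a minimal PyTorch sketch of how a subnetwork can be selected from a layer with frozen random weights by training per-weight scores, with the surviving weights binarized to a signed per-layer scale. This is an illustrative, assumed construction in the spirit of score-based subnetwork search, not the authors' algorithm; the names `BinarySubnetLinear` and `GetSubnetMask`, the `sparsity` parameter, and the scaling choice are assumptions made for the example.

```python
# Hedged sketch: score-based subnetwork selection over frozen random weights,
# with the surviving weights binarized. Not the paper's exact algorithm.
import torch
import torch.nn as nn
import torch.nn.functional as F


class GetSubnetMask(torch.autograd.Function):
    """Top-k binary mask with a straight-through gradient estimator."""

    @staticmethod
    def forward(ctx, scores, sparsity):
        k = int((1.0 - sparsity) * scores.numel())
        flat = scores.flatten()
        mask = torch.zeros_like(flat)
        _, idx = flat.topk(k)
        mask[idx] = 1.0
        return mask.view_as(scores)

    @staticmethod
    def backward(ctx, grad_output):
        # Straight-through estimator: pass the gradient to the scores unchanged.
        return grad_output, None


class BinarySubnetLinear(nn.Module):
    """Linear layer: frozen random weights, learned mask, binarized survivors."""

    def __init__(self, in_features, out_features, sparsity=0.5):
        super().__init__()
        self.sparsity = sparsity
        # Random weights are frozen and never trained (prize 2).
        self.weight = nn.Parameter(
            torch.empty(out_features, in_features), requires_grad=False
        )
        nn.init.kaiming_normal_(self.weight)
        # Learned scores decide which random weights join the subnetwork.
        self.scores = nn.Parameter(torch.rand_like(self.weight))

    def forward(self, x):
        mask = GetSubnetMask.apply(self.scores.abs(), self.sparsity)
        # Binarize surviving weights to +/- a per-layer scale (prize 3).
        alpha = self.weight[mask.bool()].abs().mean()
        w = alpha * torch.sign(self.weight) * mask
        return F.linear(x, w)


# Example usage: only `scores` receives gradients during search.
layer = BinarySubnetLinear(784, 256, sparsity=0.5)
out = layer(torch.randn(32, 784))
```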
