Regularization

Weight Reset is an implicit regularization procedure that periodically resets a randomly selected portion of a layer's weights during training, drawing the replacement values from predefined probability distributions.

To describe the Weight Reset procedure, a simple formulation is introduced. Let $\mathcal{B}(p)$ be a multivariate Bernoulli distribution with parameter $p$, and let $\mathcal{D}$ be an arbitrary distribution used for initializing model weights. At specified intervals (after a fixed number of training iterations, except for the last one), a random portion of the weights $W = \{w^l\}$ of selected layers in the neural network is reset as follows: $$ \tilde{w}^l = w^l \cdot (1 - m) + \xi \cdot m, $$ where $\cdot$ denotes element-wise (Hadamard) multiplication, $w^l$ are the current weights of layer $l$, $\tilde{w}^l$ are the reset weights of that layer, $m \sim \mathcal{B}(p^l)$ is the resetting mask, $p^l$ is the resetting rate for layer $l$, and $\xi \sim \mathcal{D}$ are new random weights.
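A minimal sketch of a single reset step in PyTorch, assuming the resetting rate $p^l$ is a per-layer scalar and that $\mathcal{D}$ is Kaiming-uniform initialization; both choices, the function name `weights_reset_`, and the scheduling are illustrative assumptions, not the authors' reference implementation:

```python
import torch

def weights_reset_(weight: torch.Tensor, p: float, init_fn=None) -> torch.Tensor:
    """One in-place Weight Reset step: w <- w * (1 - m) + xi * m,
    with m ~ B(p) and xi drawn from the initialization distribution D."""
    # Resetting mask m ~ B(p): entries equal to 1 mark weights to be re-initialized.
    m = torch.bernoulli(torch.full_like(weight, p))
    # New random weights xi ~ D; Kaiming-uniform is used here as an illustrative default.
    xi = torch.empty_like(weight)
    if init_fn is None:
        torch.nn.init.kaiming_uniform_(xi)
    else:
        init_fn(xi)
    # Element-wise (Hadamard) combination of kept and re-initialized weights.
    weight.mul_(1.0 - m).add_(xi * m)
    return weight

# Example: reset ~5% of a linear layer's weights every few hundred iterations.
layer = torch.nn.Linear(128, 64)
with torch.no_grad():
    weights_reset_(layer.weight, p=0.05)
```

In practice such a step would be invoked periodically from the training loop for each selected layer, with the layer-specific rate $p^l$ passed as `p`.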

Experimental evidence indicates that Weight Reset can compete with, and in some cases surpass, traditional regularization techniques.

Given the observable effects of Weight Reset on an increasing number of weights in a model, there is a plausible hypothesis that the technique is connected to the Double Descent phenomenon.

Source: The Weights Reset Technique for Deep Neural Networks Implicit Regularization
