Evidence against implicitly recurrent computations in residual neural networks

1 Jan 2021 · Samuel Lippl, Benjamin Peters, Nikolaus Kriegeskorte

Recent work on residual neural networks (ResNets) has suggested that a ResNet's deep feedforward computation may be characterized as implicitly recurrent, in that it iteratively refines the same representation like a recurrent network. To test this hypothesis, we manipulate the degree of weight sharing across layers in ResNets using soft gradient coupling. This new method, which provides a form of recurrence regularization, can interpolate smoothly between an ordinary ResNet and a "recurrent" ResNet (i.e., one that uses identical weights across layers and thus could be physically implemented with a recurrent network computing the successive stages iteratively across time). We define three indices of recurrent iterative computation and show that a higher degree of gradient coupling promotes iterative convergent computation in ResNets. To measure the degree of weight sharing, we quantify the effective number of parameters of models along the continuum between nonrecurrent and recurrent. For a given effective number of parameters, recurrence regularization does not improve classification accuracy on three visual recognition tasks (MNIST, CIFAR-10, Digitclutter). ResNets thus may not benefit from more similar sets of weights across layers, suggesting that their power does not derive from implicitly recurrent computation.
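The abstract only names soft gradient coupling; the sketch below illustrates one way such a mechanism could work, assuming that "coupling" means blending each residual block's gradient with the mean gradient across blocks via a coefficient lam in [0, 1]. The block architecture, the function couple_gradients, and the parameter lam are illustrative assumptions, not the authors' implementation.

```python
# Hypothetical sketch of soft gradient coupling across residual blocks with
# identical parameter shapes. Assumption: coupling blends each block's gradient
# with the mean gradient over all blocks.
import torch
import torch.nn as nn

class ResBlock(nn.Module):
    def __init__(self, channels):
        super().__init__()
        self.conv = nn.Conv2d(channels, channels, 3, padding=1)

    def forward(self, x):
        return x + torch.relu(self.conv(x))

def couple_gradients(blocks, lam):
    """Blend each block's gradients with the cross-block mean gradient.

    lam = 0 leaves gradients untouched (ordinary ResNet);
    lam = 1 gives every block the same gradient, so blocks that start from
    shared weights remain identical (a "recurrent" ResNet).
    """
    # Iterate over corresponding parameters of all blocks in parallel.
    for params in zip(*(b.parameters() for b in blocks)):
        grads = [p.grad for p in params if p.grad is not None]
        if len(grads) != len(params):
            continue  # skip parameters that received no gradient
        mean_grad = torch.stack(grads).mean(dim=0)
        for p in params:
            p.grad = (1 - lam) * p.grad + lam * mean_grad

# Usage: call after loss.backward() and before optimizer.step().
blocks = nn.ModuleList([ResBlock(16) for _ in range(4)])
x = torch.randn(2, 16, 8, 8)
loss = blocks[3](blocks[2](blocks[1](blocks[0](x)))).mean()
loss.backward()
couple_gradients(blocks, lam=0.5)
```

With lam = 1 the blocks receive identical gradient updates, so a model initialized with shared weights across blocks behaves like the unrolled recurrent ResNet described in the abstract; intermediate values of lam trace the continuum between nonrecurrent and recurrent.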

