Search Results for author: Takashi Mori

Found 8 papers, 0 papers with code

Interplay between depth of neural networks and locality of target functions

no code implementations28 Jan 2022 Takashi Mori, Masahito Ueda

It has been recognized that heavily overparameterized deep neural networks (DNNs) exhibit surprisingly good generalization performance in various machine-learning tasks.

Learning Theory

Logarithmic landscape and power-law escape rate of SGD

no code implementations29 Sep 2021 Takashi Mori, Liu Ziyin, Kangqiao Liu, Masahito Ueda

Stochastic gradient descent (SGD) is subject to complicated multiplicative noise for the mean-square loss.

Power-law escape rate of SGD

no code implementations20 May 2021 Takashi Mori, Liu Ziyin, Kangqiao Liu, Masahito Ueda

Stochastic gradient descent (SGD) is subject to complicated multiplicative noise for the mean-square loss.
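For readers outside this line of work, here is a minimal numpy sketch (my own toy linear-regression setup, not taken from either paper) of what "multiplicative" minibatch noise means: on a mean-square loss, the strength of the SGD noise depends on the current parameters and shrinks together with the loss, instead of entering as a fixed additive term.

import numpy as np

rng = np.random.default_rng(0)
n, d = 1024, 10
X = rng.normal(size=(n, d))
w_true = rng.normal(size=d)
y = X @ w_true + 0.1 * rng.normal(size=n)

def per_sample_grads(w):
    # per-sample gradient of 0.5 * (x.w - y)^2 is (x.w - y) * x
    residual = X @ w - y                   # shape (n,)
    return residual[:, None] * X           # shape (n, d)

def noise_strength(w, batch_size=32, trials=200):
    # mean squared deviation of the minibatch gradient from the full gradient
    grads = per_sample_grads(w)
    g_full = grads.mean(axis=0)
    devs = [np.sum((grads[rng.choice(n, batch_size, replace=False)].mean(axis=0) - g_full) ** 2)
            for _ in range(trials)]
    return np.mean(devs)

print("noise strength far from the minimum:", noise_strength(np.zeros(d)))
print("noise strength near the minimum:   ", noise_strength(w_true))
# The noise strength shrinks with the loss: it multiplies the state of the
# optimizer rather than acting as a fixed additive term.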

Metastability associated with many-body explosion of eigenmode expansion coefficients

no code implementations11 Feb 2021 Takashi Mori

Metastable states in stochastic systems are often characterized by the presence of small eigenvalues in the generator of the stochastic dynamics.

Statistical Mechanics Disordered Systems and Neural Networks
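As a rough illustration of the abstract's opening statement, the following numpy sketch (an assumed double-well toy model, not the construction used in the paper) builds the generator of a one-dimensional Markov jump process and shows that, besides the zero eigenvalue of the stationary state, one eigenvalue is exponentially small; its inverse sets the lifetime of the metastable well.

import numpy as np

n_sites = 51
x = np.linspace(-2.0, 2.0, n_sites)
V = (x**2 - 1.0) ** 2              # double-well potential
beta = 6.0

# nearest-neighbour jump rates obeying detailed balance w.r.t. exp(-beta * V)
W = np.zeros((n_sites, n_sites))
for i in range(n_sites - 1):
    W[i + 1, i] = np.exp(-beta * max(V[i + 1] - V[i], 0.0))   # rate i -> i+1
    W[i, i + 1] = np.exp(-beta * max(V[i] - V[i + 1], 0.0))   # rate i+1 -> i
np.fill_diagonal(W, -W.sum(axis=0))   # columns sum to zero: a proper generator

eigvals = np.sort(np.linalg.eigvals(W).real)[::-1]
print("largest eigenvalues of the generator:", eigvals[:4])
# One eigenvalue is exactly zero (the stationary state) and the next one is
# exponentially small in beta; its inverse gives the long metastable lifetime.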

Strength of Minibatch Noise in SGD

no code implementations ICLR 2022 Liu Ziyin, Kangqiao Liu, Takashi Mori, Masahito Ueda

The noise in stochastic gradient descent (SGD), caused by minibatch sampling, is poorly understood despite its practical importance in deep learning.
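A quick way to see what "strength of minibatch noise" refers to is to measure the spread of minibatch gradients around the full-batch gradient. The sketch below (a toy linear-regression setup of my own, not the paper's analysis) shows the familiar roughly 1/B scaling of that strength with batch size B.

import numpy as np

rng = np.random.default_rng(1)
n, d = 2048, 20
X = rng.normal(size=(n, d))
y = X @ rng.normal(size=d) + rng.normal(size=n)
w = rng.normal(size=d)                       # an arbitrary point in parameter space

per_sample = (X @ w - y)[:, None] * X        # per-sample gradients of 0.5 * MSE
g_full = per_sample.mean(axis=0)

for B in (8, 32, 128):
    devs = [per_sample[rng.choice(n, B, replace=False)].mean(axis=0) - g_full
            for _ in range(500)]
    strength = np.mean([np.sum(dev ** 2) for dev in devs])
    print(f"batch size B = {B:3d}: trace of the noise covariance ~ {strength:.4f}")
# The measured strength falls roughly like 1/B (up to a finite-dataset correction).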

Rigorous Bounds on the Heating Rate in Thue-Morse Quasiperiodically and Randomly Driven Quantum Many-Body Systems

no code implementations18 Jan 2021 Takashi Mori, Hongzheng Zhao, Florian Mintert, Johannes Knolle, Roderich Moessner

The nonequilibrium quantum dynamics of closed many-body systems is a rich yet challenging field.

Statistical Mechanics Quantum Physics

Improved generalization by noise enhancement

no code implementations28 Sep 2020 Takashi Mori, Masahito Ueda

Recent studies have demonstrated that the noise in stochastic gradient descent (SGD) is closely related to generalization: larger SGD noise, if not too large, results in better generalization.
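One hedged illustration of noise enhancement (a generic construction that may not coincide with the paper's exact scheme): combine two independent minibatch gradients so that the expected update is unchanged while the noise covariance is amplified by a tunable factor alpha. The helper name and toy data below are my own.

import numpy as np

rng = np.random.default_rng(2)

def enhanced_gradient(per_sample_grads, batch_size, alpha, rng):
    # (1 + alpha) * g_B1 - alpha * g_B2 for two independent minibatches:
    # the expectation is the full-batch gradient for any alpha, while the
    # noise covariance grows with alpha, so alpha controls the noise strength.
    n = per_sample_grads.shape[0]
    g1 = per_sample_grads[rng.choice(n, batch_size, replace=False)].mean(axis=0)
    g2 = per_sample_grads[rng.choice(n, batch_size, replace=False)].mean(axis=0)
    return (1.0 + alpha) * g1 - alpha * g2

# toy quadratic problem to exercise the helper
n, d = 512, 5
X = rng.normal(size=(n, d))
y = X @ rng.normal(size=d)
w = np.zeros(d)
per_sample = (X @ w - y)[:, None] * X
print("plain minibatch gradient :", enhanced_gradient(per_sample, 32, 0.0, rng)[:3])
print("noise-enhanced gradient  :", enhanced_gradient(per_sample, 32, 2.0, rng)[:3])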

Is deeper better? It depends on locality of relevant features

no code implementations26 May 2020 Takashi Mori, Masahito Ueda

It is shown that the neural tangent kernel (NTK) does not correctly capture the depth dependence of the generalization performance, which indicates the importance of feature learning rather than lazy learning.

General Classification
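As an illustration of the lazy-versus-feature-learning distinction, the following sketch (my own toy two-layer network, not the paper's experiment) computes the empirical NTK Gram matrix K(x, x') = <grad_theta f(x), grad_theta f(x')> at initialization and after a short period of training; in the lazy (large-width) regime the kernel stays essentially fixed, whereas feature learning shows up as a sizeable change of the kernel.

import numpy as np

rng = np.random.default_rng(3)
d, m, n = 5, 1000, 20                      # input dim, hidden width, sample count
X = rng.normal(size=(n, d))
y = np.sin(X[:, 0])                        # a toy scalar target

W = rng.normal(size=(m, d))                # first-layer weights
a = rng.normal(size=m)                     # second-layer weights

def ntk_gram(W, a, X):
    # empirical NTK of f(x) = a . tanh(W x) / sqrt(m)
    h = np.tanh(X @ W.T)                                        # (n, m)
    grad_a = h / np.sqrt(m)                                     # (n, m)
    grad_W = ((a * (1.0 - h**2))[:, :, None] * X[:, None, :]) / np.sqrt(m)
    J = np.concatenate([grad_a, grad_W.reshape(len(X), -1)], axis=1)
    return J @ J.T

K0 = ntk_gram(W, a, X)
lr = 0.1
for _ in range(200):                       # full-batch gradient descent on 0.5 * MSE
    h = np.tanh(X @ W.T)
    err = (h @ a / np.sqrt(m) - y) / n
    a -= lr * (h.T @ err) / np.sqrt(m)
    W -= lr * ((a * (1.0 - h**2) * err[:, None]).T @ X) / np.sqrt(m)
Kt = ntk_gram(W, a, X)
print("relative change of the NTK during training:",
      np.linalg.norm(Kt - K0) / np.linalg.norm(K0))
# In the lazy (large-width) regime this number stays close to zero; a sizeable
# change of the kernel is the signature of feature learning.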
