no code implementations • 28 Dec 2020 • Stanislaw Jastrzebski, Devansh Arpit, Oliver Astrand, Giancarlo Kerg, Huan Wang, Caiming Xiong, Richard Socher, Kyunghyun Cho, Krzysztof Geras
The early phase of training a deep neural network has a dramatic effect on the local curvature of the loss function.
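The "local curvature" referred to here is commonly summarized by the largest eigenvalue of the loss Hessian. Below is a minimal sketch, assuming a PyTorch `model` and a scalar `loss`, of estimating that quantity with power iteration over Hessian-vector products; it is illustrative, not the paper's exact measurement protocol.

```python
# Minimal sketch: estimate the top Hessian eigenvalue of a loss w.r.t. model
# parameters via power iteration with Hessian-vector products (autograd).
import torch


def top_hessian_eigenvalue(loss, params, iters=20):
    """Approximate the largest eigenvalue of the Hessian of `loss` w.r.t. `params`."""
    grads = torch.autograd.grad(loss, params, create_graph=True)
    # Start from a random direction of unit norm.
    v = [torch.randn_like(p) for p in params]
    v_norm = torch.sqrt(sum((vi ** 2).sum() for vi in v))
    v = [vi / v_norm for vi in v]
    eigenvalue = 0.0
    for _ in range(iters):
        # Hessian-vector product: differentiate <grad, v> w.r.t. the parameters.
        dot = sum((g * vi).sum() for g, vi in zip(grads, v))
        hv = torch.autograd.grad(dot, params, retain_graph=True)
        hv_norm = torch.sqrt(sum((h ** 2).sum() for h in hv))
        eigenvalue = hv_norm.item()          # ||H v|| with unit v approximates |lambda_max|
        v = [h / (hv_norm + 1e-12) for h in hv]
    return eigenvalue
```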
1 code implementation • 28 Nov 2020 • Taro Makino, Stanislaw Jastrzebski, Witold Oleszkiewicz, Celin Chacko, Robin Ehrenpreis, Naziya Samreen, Chloe Chhor, Eric Kim, Jiyon Lee, Kristine Pysarenko, Beatriu Reig, Hildegard Toth, Divya Awal, Linda Du, Alice Kim, James Park, Daniel K. Sodickson, Laura Heacock, Linda Moy, Kyunghyun Cho, Krzysztof J. Geras
We compare the two with respect to their robustness to Gaussian low-pass filtering, performing a subgroup analysis on microcalcifications and soft tissue lesions.
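A minimal sketch of such a robustness probe is shown below: the same inputs are passed through a model at increasing levels of Gaussian blur and the outputs are compared. The `model` callable and the NumPy image batch are assumptions for illustration, not the paper's pipeline.

```python
# Minimal sketch: collect model outputs on inputs blurred with Gaussian
# low-pass filters of increasing strength, for later comparison against
# the unfiltered predictions.
import numpy as np
from scipy.ndimage import gaussian_filter


def predictions_under_lowpass(model, images, sigmas=(0.0, 1.0, 2.0, 4.0)):
    """Return model outputs for each Gaussian blur strength in `sigmas`."""
    outputs = {}
    for sigma in sigmas:
        if sigma == 0.0:
            blurred = images
        else:
            # Blur each image over its spatial axes; keep the batch axis intact.
            blurred = np.stack([gaussian_filter(img, sigma=sigma) for img in images])
        outputs[sigma] = model(blurred)
    return outputs
```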
no code implementations • WS 2020 • Diksha Meghwal, Katharina Kann, Iacer Calixto, Stanislaw Jastrzebski
Pretrained language models have obtained impressive results for a large set of natural language understanding tasks.
1 code implementation • 20 Jun 2020 • Tobiasz Cieplinski, Tomasz Danel, Sabina Podlewska, Stanislaw Jastrzebski
To close this gap, we propose a benchmark based on docking, a popular computational method for assessing molecule binding to a protein.
no code implementations • ICLR 2020 • Stanislaw Jastrzebski, Maciej Szymczak, Stanislav Fort, Devansh Arpit, Jacek Tabor, Kyunghyun Cho, Krzysztof Geras
We argue for the existence of the "break-even" point on this trajectory, beyond which the curvature of the loss surface and noise in the gradient are implicitly regularized by SGD.
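One of the two quantities mentioned here, noise in the gradient, can be summarized as the trace of the covariance of mini-batch gradients. The sketch below, assuming a PyTorch `model`, `loss_fn`, and data `loader`, shows one way to estimate it; it is a simplified illustration rather than the paper's exact estimator.

```python
# Minimal sketch: estimate gradient noise as the trace of the empirical
# covariance of mini-batch gradients, i.e. the mean squared deviation of
# per-batch gradients from their mean.
import torch


def gradient_noise_trace(model, loss_fn, loader, num_batches=32):
    params = [p for p in model.parameters() if p.requires_grad]
    grads = []
    for i, (x, y) in enumerate(loader):
        if i >= num_batches:
            break
        model.zero_grad()
        loss_fn(model(x), y).backward()
        grads.append(torch.cat([p.grad.flatten() for p in params]))
    g = torch.stack(grads)                      # shape: (num_batches, num_params)
    mean_g = g.mean(dim=0, keepdim=True)
    # Trace of the covariance = expected squared distance from the mean gradient.
    return ((g - mean_g) ** 2).sum(dim=1).mean().item()
```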
1 code implementation • NeurIPS 2019 • Stanislav Fort, Stanislaw Jastrzebski
There are many surprising and perhaps counter-intuitive properties of optimization of deep neural networks.
15 code implementations • 2 Feb 2019 • Neil Houlsby, Andrei Giurgiu, Stanislaw Jastrzebski, Bruna Morrone, Quentin de Laroussilhe, Andrea Gesmundo, Mona Attariyan, Sylvain Gelly
On GLUE, we attain within 0.4% of the performance of full fine-tuning, adding only 3.6% parameters per task.
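The small per-task parameter budget comes from adapter modules: narrow bottleneck layers with a residual connection, trained while the pretrained network stays frozen. A minimal sketch of such a module is below; the layer sizes are illustrative assumptions, not the paper's exact configuration.

```python
# Minimal sketch: a bottleneck adapter module with a residual connection,
# inserted inside a frozen pretrained network so only a few percent of
# parameters are trained per task.
import torch
import torch.nn as nn


class Adapter(nn.Module):
    def __init__(self, hidden_dim, bottleneck_dim=64):
        super().__init__()
        self.down = nn.Linear(hidden_dim, bottleneck_dim)   # project down
        self.up = nn.Linear(bottleneck_dim, hidden_dim)     # project back up
        self.act = nn.GELU()

    def forward(self, x):
        # The residual connection keeps the adapter close to identity at initialization.
        return x + self.up(self.act(self.down(x)))
```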
no code implementations • 28 Jan 2019 • Stanislav Fort, Paweł Krzysztof Nowak, Stanislaw Jastrzebski, Srini Narayanan
In particular, we study how stiffness depends on 1) class membership, 2) distance between data points in the input space, 3) training iteration, and 4) learning rate.
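Stiffness here refers to how the gradients induced by two individual examples relate to each other, typically via the sign or cosine of their inner product. Below is a minimal sketch of that measure for a pair of examples, assuming a PyTorch `model` and a per-example `loss_fn`; it is an illustration of the definition, not the paper's full analysis code.

```python
# Minimal sketch: stiffness between two examples as the cosine similarity
# (and its sign) of their per-example loss gradients.
import torch


def stiffness(model, loss_fn, x1, y1, x2, y2):
    params = [p for p in model.parameters() if p.requires_grad]

    def example_grad(x, y):
        # Add a batch dimension so the model and loss see a batch of size 1.
        loss = loss_fn(model(x.unsqueeze(0)), y.unsqueeze(0))
        grads = torch.autograd.grad(loss, params)
        return torch.cat([g.flatten() for g in grads])

    g1, g2 = example_grad(x1, y1), example_grad(x2, y2)
    cos = torch.dot(g1, g2) / (g1.norm() * g2.norm() + 1e-12)
    return cos.item(), torch.sign(cos).item()
```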