no code implementations • 26 Oct 2021 • Arsenii Kuznetsov, Alexander Grishin, Artem Tsypin, Arsenii Ashukha, Artur Kadurin, Dmitry Vetrov
Overestimation bias control techniques are used by the majority of high-performing off-policy reinforcement learning algorithms.
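A common mechanism for such control (used e.g. in TD3 and SAC, which this line of work builds on) is clipped double Q-learning: the TD target takes the minimum over two critic estimates. A minimal sketch; the function name is illustrative:

```python
import torch

def clipped_double_q_target(reward, done, next_q1, next_q2, gamma=0.99):
    """TD target using the minimum of two critic estimates (clipped
    double Q-learning, TD3-style) to curb overestimation bias."""
    next_q = torch.min(next_q1, next_q2)          # pessimistic estimate
    return reward + gamma * (1.0 - done) * next_q
```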
7 code implementations • 15 Sep 2021 • Roman Suvorov, Elizaveta Logacheva, Anton Mashikhin, Anastasia Remizova, Arsenii Ashukha, Aleksei Silvestrov, Naejin Kong, Harshith Goka, Kiwoong Park, Victor Lempitsky
We find that one of the main reasons for that is the lack of an effective receptive field in both the inpainting network and the loss function.
Ranked #3 on Seeing Beyond the Visible on KITTI360-EX
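The paper addresses this with fast Fourier convolutions, which operate in the frequency domain and therefore see the whole image at once. Below is a toy sketch of the spectral branch only; the real FFC block also has a local convolutional branch, normalization, and activations, so treat this as an illustration rather than the paper's module:

```python
import torch
import torch.nn as nn

class SpectralTransform(nn.Module):
    """Toy spectral block in the spirit of fast Fourier convolutions:
    a pointwise convolution applied in the frequency domain gives the
    layer a global (image-wide) receptive field in a single step."""
    def __init__(self, channels):
        super().__init__()
        # real and imaginary parts are stacked along the channel axis
        self.conv = nn.Conv2d(2 * channels, 2 * channels, kernel_size=1)

    def forward(self, x):
        b, c, h, w = x.shape
        freq = torch.fft.rfft2(x, norm="ortho")        # complex, (b, c, h, w//2+1)
        f = torch.cat([freq.real, freq.imag], dim=1)   # (b, 2c, h, w//2+1)
        f = self.conv(f)
        real, imag = f.chunk(2, dim=1)
        return torch.fft.irfft2(torch.complex(real, imag), s=(h, w), norm="ortho")
```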
no code implementations • 15 Jun 2021 • Arsenii Ashukha, Andrei Atanov, Dmitry Vetrov
Averaging predictions over a set of models -- an ensemble -- is widely used to improve predictive performance and uncertainty estimation of deep learning models.
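A minimal sketch of that averaging, assuming classifiers that output logits; averaging probabilities (rather than logits) is the standard choice for both accuracy and uncertainty estimation:

```python
import torch

@torch.no_grad()
def ensemble_predict(models, x):
    """Average class probabilities over an ensemble of models."""
    probs = torch.stack([m(x).softmax(dim=-1) for m in models])
    return probs.mean(dim=0)
```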
1 code implementation • 21 Feb 2020 • Dmitry Molchanov, Alexander Lyzhov, Yuliya Molchanova, Arsenii Ashukha, Dmitry Vetrov
Test-time data augmentation -- averaging the predictions of a machine learning model across multiple augmented samples of data -- is a widely used technique that improves predictive performance.
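A minimal sketch of plain test-time augmentation with a fixed stochastic transform (the paper itself studies how the augmentation policy should be chosen); `augment` is any random transform the caller supplies, e.g. random crop plus flip:

```python
import torch

@torch.no_grad()
def tta_predict(model, x, augment, n_samples=10):
    """Average class probabilities over n_samples stochastic
    augmentations of the same input batch."""
    probs = [model(augment(x)).softmax(dim=-1) for _ in range(n_samples)]
    return torch.stack(probs).mean(dim=0)
```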
2 code implementations • ICLR 2020 • Arsenii Ashukha, Alexander Lyzhov, Dmitry Molchanov, Dmitry Vetrov
Uncertainty estimation and ensembling methods go hand-in-hand.
3 code implementations • 1 May 2019 • Andrei Atanov, Alexandra Volokhova, Arsenii Ashukha, Ivan Sosnovik, Dmitry Vetrov
This paper proposes a semi-conditional normalizing flow model for semi-supervised learning.
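As a toy illustration of conditioning a flow on labels (far simpler than the paper's model), here is a one-layer affine flow whose parameters are selected by the class label; `mu` and `log_sigma` are hypothetical per-class arrays, and for unlabeled data one would marginalize over the label, which is the rough idea behind semi-conditional flows:

```python
import numpy as np

def conditional_affine_loglik(x, y, mu, log_sigma):
    """Log-density of x under a one-layer conditional affine flow:
    z = (x - mu[y]) * exp(-log_sigma[y]) with a standard normal base.
    Change of variables: log p(x|y) = log N(z; 0, 1) - log_sigma[y]."""
    z = (x - mu[y]) * np.exp(-log_sigma[y])
    log_base = -0.5 * (z ** 2 + np.log(2 * np.pi))
    return log_base - log_sigma[y]
```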
2 code implementations • ICLR 2019 • Andrei Atanov, Arsenii Ashukha, Kirill Struminsky, Dmitry Vetrov, Max Welling
Bayesian inference is known to provide a general framework for incorporating prior knowledge or specific properties into machine learning models via carefully choosing a prior distribution.
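The textbook instance of this is MAP estimation: adding the log-density of a weight prior to the training loss, with an isotropic Gaussian prior recovering L2 regularization. A minimal sketch (the paper itself studies richer, learned priors over convolutional kernels):

```python
import torch

def map_loss(nll, parameters, prior_std=0.1):
    """MAP objective: negative log-likelihood minus the log-density
    (up to a constant) of an isotropic Gaussian prior over weights.
    A Gaussian prior is exactly L2 regularization; more expressive
    priors encode more specific knowledge the same way."""
    log_prior = sum((-0.5 * (p / prior_std) ** 2).sum() for p in parameters)
    return nll - log_prior
```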
2 code implementations • ICLR 2019 • Kirill Neklyudov, Dmitry Molchanov, Arsenii Ashukha, Dmitry Vetrov
Ordinary stochastic neural networks mostly rely on the expected values of their weights to make predictions, whereas the induced noise serves mainly to capture uncertainty, prevent overfitting, and slightly boost performance through test-time averaging.
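A minimal sketch of a stochastic layer and of test-time averaging over weight samples; the parametrization is generic, not the paper's (the paper studies the regime where the mean carries little information and sampling is essential):

```python
import torch
import torch.nn as nn

class GaussianLinear(nn.Module):
    """Linear layer with Gaussian weights; predictions come from
    averaging over weight samples rather than plugging in the mean."""
    def __init__(self, d_in, d_out):
        super().__init__()
        self.mu = nn.Parameter(torch.randn(d_out, d_in) * 0.01)
        self.log_sigma = nn.Parameter(torch.full((d_out, d_in), -3.0))

    def forward(self, x):
        w = self.mu + torch.randn_like(self.mu) * self.log_sigma.exp()
        return x @ w.t()

@torch.no_grad()
def mc_predict(layer, x, n_samples=20):
    # Monte Carlo averaging over sampled weights
    return torch.stack([layer(x) for _ in range(n_samples)]).mean(0)
```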
no code implementations • 20 Feb 2018 • Max Kochurov, Timur Garipov, Dmitry Podoprikhin, Dmitry Molchanov, Arsenii Ashukha, Dmitry Vetrov
In industrial machine learning pipelines, data often arrive in parts.
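In a Bayesian treatment this is handled by chaining updates: the posterior after one part of the data serves as the prior for the next. A minimal conjugate-Gaussian sketch (estimating a single mean, not a neural network):

```python
import numpy as np

def gaussian_posterior_update(prior_mean, prior_var, part, noise_var=1.0):
    """Conjugate update for the mean of a Gaussian with known noise
    variance: precisions add, so data can be absorbed part by part."""
    n = len(part)
    post_var = 1.0 / (1.0 / prior_var + n / noise_var)
    post_mean = post_var * (prior_mean / prior_var + part.sum() / noise_var)
    return post_mean, post_var

# streaming over parts: the posterior of part t is the prior for part t+1
mean, var = 0.0, 10.0
for part in np.split(np.random.normal(2.0, 1.0, size=300), 3):
    mean, var = gaussian_posterior_update(mean, var, part)
```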
1 code implementation • 13 Feb 2018 • Andrei Atanov, Arsenii Ashukha, Dmitry Molchanov, Kirill Neklyudov, Dmitry Vetrov
In this work, we investigate the Batch Normalization technique and propose a probabilistic interpretation of it.
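As a loose illustration of what a probabilistic reading of BN can change at inference (sampling normalization statistics instead of plugging in fixed running averages), here is a sketch; the argument names and the specific sampling rule are assumptions, not the paper's method:

```python
import torch

@torch.no_grad()
def stochastic_bn_inference(x, mean_of_means, var_of_means,
                            running_var, eps=1e-5):
    """Illustrative inference rule: sample the batch mean from its
    approximate distribution across training batches, injecting BN's
    train-time noise at test time (names here are hypothetical)."""
    mu = mean_of_means + var_of_means.sqrt() * torch.randn_like(mean_of_means)
    return (x - mu) / (running_var + eps).sqrt()
```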
5 code implementations • NeurIPS 2017 • Kirill Neklyudov, Dmitry Molchanov, Arsenii Ashukha, Dmitry Vetrov
In this paper, we propose a new Bayesian model that takes into account the computational structure of neural networks and provides structured sparsity, e.g., removes neurons and/or convolutional channels in CNNs.
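A sketch of the resulting pruning rule: each neuron is gated by multiplicative noise, and units whose signal-to-noise ratio is low are dropped wholesale (a whole neuron or channel, not individual weights). The parametrization and threshold here are illustrative:

```python
import torch

def prune_neurons(mu, sigma, snr_threshold=1.0):
    """Structured sparsity from multiplicative noise: each neuron's
    output is scaled by noise with mean mu and std sigma; neurons whose
    signal-to-noise ratio falls below the threshold are removed."""
    snr = mu.abs() / sigma
    keep = snr >= snr_threshold          # boolean mask over whole units
    return keep
```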
15 code implementations • ICML 2017 • Dmitry Molchanov, Arsenii Ashukha, Dmitry Vetrov
We explore a recently proposed Variational Dropout technique that provided an elegant Bayesian interpretation to Gaussian Dropout.
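That interpretation leads to pruning weights with high inferred dropout rates: with weight mean mu and variance sigma^2, one thresholds log alpha = log sigma^2 - log mu^2, since alpha -> infinity means the weight is pure noise. A minimal sketch with a commonly used cutoff; treat the function itself as illustrative:

```python
import torch

def sparsify_by_log_alpha(mu, log_sigma2, threshold=3.0):
    """Zero out weights whose inferred dropout rate is too high:
    log alpha = log sigma^2 - log mu^2, pruned when above threshold."""
    log_alpha = log_sigma2 - torch.log(mu ** 2 + 1e-8)
    mask = (log_alpha < threshold).float()
    return mu * mask
```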