no code implementations • 17 Aug 2023 • Patrick Butlin, Robert Long, Eric Elmoznino, Yoshua Bengio, Jonathan Birch, Axel Constant, George Deane, Stephen M. Fleming, Chris Frith, Xu Ji, Ryota Kanai, Colin Klein, Grace Lindsay, Matthias Michel, Liad Mudrik, Megan A. K. Peters, Eric Schwitzgebel, Jonathan Simon, Rufin VanRullen
From these theories we derive "indicator properties" of consciousness, elucidated in computational terms that allow us to assess AI systems for these properties.
1 code implementation • 30 May 2023 • Kenji Kawaguchi, Zhun Deng, Xu Ji, Jiaoyang Huang
In this paper, we provide the first rigorous learning theory for justifying the benefit of information bottleneck in deep learning by mathematically relating information bottleneck to generalization errors.
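The theory concerns the information bottleneck objective, which trades off compression I(X;Z) against prediction I(Z;Y). Below is a minimal sketch of a variational IB loss in PyTorch; the tensor shapes and the weight `beta` are illustrative assumptions, not the paper's construction.

```python
# A minimal sketch of a variational information bottleneck (VIB) loss.
# `beta` and the Gaussian encoder parameterization are assumptions for
# illustration, not the paper's analysis.
import torch
import torch.nn.functional as F

def vib_loss(mu, logvar, logits, targets, beta=1e-3):
    """Cross-entropy bounds -I(Z;Y); the KL term upper-bounds I(X;Z)."""
    # A reparameterized sample z ~ N(mu, sigma^2) would feed the classifier
    # producing `logits`; here we only combine the two loss terms.
    ce = F.cross_entropy(logits, targets)          # prediction term
    kl = -0.5 * torch.mean(                        # compression term
        torch.sum(1 + logvar - mu.pow(2) - logvar.exp(), dim=1))
    return ce + beta * kl
```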
no code implementations • 13 Feb 2023 • Xu Ji, Eric Elmoznino, George Deane, Axel Constant, Guillaume Dumas, Guillaume Lajoie, Jonathan Simon, Yoshua Bengio
Conscious states (states that there is something it is like to be in) seem both rich, or full of detail, and ineffable, or hard to fully describe or recall.
no code implementations • 24 Oct 2022 • Dianbo Liu, Moksh Jain, Bonaventure Dossou, Qianli Shen, Salem Lahlou, Anirudh Goyal, Nikolay Malkin, Chris Emezue, Dinghuai Zhang, Nadhir Hassen, Xu Ji, Kenji Kawaguchi, Yoshua Bengio
These methods face two important challenges: (a) the posterior distribution over masks can be highly multi-modal which can be difficult to approximate with standard variational inference and (b) it is not trivial to fully utilize sample-dependent information and correlation among dropout masks to improve posterior estimation.
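To illustrate the setup, here is a minimal sketch of sample-dependent dropout, where mask probabilities are predicted from the input rather than fixed. `LearnedDropout` and `prob_net` are hypothetical names, and this naive Bernoulli sampler stands in for, and is far simpler than, the GFlowNet-based mask posterior the paper proposes.

```python
# A minimal sketch of sample-dependent dropout: per-unit keep probabilities
# are predicted from the activations. Illustrative only; not the paper's
# GFlowNet sampler.
import torch
import torch.nn as nn

class LearnedDropout(nn.Module):
    def __init__(self, dim):
        super().__init__()
        self.prob_net = nn.Sequential(nn.Linear(dim, dim), nn.Sigmoid())

    def forward(self, h):
        p = self.prob_net(h)                    # per-unit keep probabilities
        mask = torch.bernoulli(p)               # one sample of the mask
        return h * mask / p.clamp(min=1e-6)     # inverted-dropout rescaling
```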
1 code implementation • 2 Oct 2022 • Nikolay Malkin, Salem Lahlou, Tristan Deleu, Xu Ji, Edward Hu, Katie Everett, Dinghuai Zhang, Yoshua Bengio
This paper builds bridges between two families of probabilistic algorithms: (hierarchical) variational inference (VI), which is typically used to model distributions over continuous spaces, and generative flow networks (GFlowNets), which have been used for distributions over discrete structures such as graphs.
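One concrete point of contact is the trajectory balance (TB) objective for GFlowNets, which the paper relates to variational bounds. A minimal sketch, assuming each log-term has already been summed along a single sampled trajectory:

```python
# A minimal sketch of the trajectory balance (TB) objective. Inputs are
# assumed to be scalars/tensors of log-quantities accumulated over one
# sampled trajectory.
import torch

def trajectory_balance_loss(log_Z, sum_log_pf, sum_log_pb, log_reward):
    """TB drives log Z + sum log P_F to match log R(x) + sum log P_B."""
    return (log_Z + sum_log_pf - log_reward - sum_log_pb) ** 2
```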
no code implementations • 18 Sep 2022 • Bonaventure F. P. Dossou, Dianbo Liu, Xu Ji, Moksh Jain, Almer M. van der Sloot, Roger Palou, Michael Tyers, Yoshua Bengio
As antibiotic-resistant bacterial strains spread rapidly worldwide, the infections they cause are emerging as a global crisis, responsible for millions of deaths every year.
no code implementations • 2 Feb 2022 • Dianbo Liu, Alex Lamb, Xu Ji, Pascal Notsawo, Mike Mozer, Yoshua Bengio, Kenji Kawaguchi
Vector Quantization (VQ) is a method for discretizing latent representations and has become a major part of the deep learning toolkit.
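For reference, a minimal sketch of a standard VQ layer with a straight-through gradient estimator (VQ-VAE style); the codebook size and dimensionality are illustrative assumptions.

```python
# A minimal sketch of vector quantization with a straight-through gradient.
# Codebook size and dimension are illustrative.
import torch
import torch.nn as nn

class VectorQuantizer(nn.Module):
    def __init__(self, num_codes=512, dim=64):
        super().__init__()
        self.codebook = nn.Embedding(num_codes, dim)

    def forward(self, z):                         # z: (batch, dim)
        d = torch.cdist(z, self.codebook.weight)  # distances to all codes
        idx = d.argmin(dim=1)                     # nearest-code indices
        q = self.codebook(idx)                    # quantized vectors
        # Straight-through: forward pass uses q, gradients flow back to z.
        return z + (q - z).detach(), idx
```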
no code implementations • 6 Dec 2021 • Xu Ji, Lena Nehale-Ezzine, Maksym Korablyov
Compact data representations are one approach to improving the generalization of learned functions.
1 code implementation • 15 Jun 2021 • Xu Ji, Razvan Pascanu, Devon Hjelm, Balaji Lakshminarayanan, Andrea Vedaldi
Intuitively, one would expect the accuracy of a trained neural network's predictions on test samples to correlate with how densely those samples are surrounded by seen training samples in representation space.
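A minimal sketch of that intuition, scoring each test point by its mean distance to the k nearest training representations as a proxy for local density; the function name and the choice of k are illustrative, not the paper's metric.

```python
# A minimal sketch: proxy the density around each test representation by
# mean k-NN distance to training representations (smaller = denser).
import torch

def knn_density_score(test_feats, train_feats, k=10):
    d = torch.cdist(test_feats, train_feats)       # pairwise distances
    knn_d, _ = d.topk(k, dim=1, largest=False)     # k nearest training points
    return -knn_d.mean(dim=1)                      # higher = denser region
```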
1 code implementation • 22 Jun 2020 • Xu Ji, Joao Henriques, Tinne Tuytelaars, Andrea Vedaldi
Replay in neural networks involves training on sequential data with memorized samples, which counteracts forgetting of previous behavior caused by non-stationarity.
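A minimal sketch of the replay mechanism, using a reservoir-sampled buffer of past samples that can be mixed into new training batches; the capacity and sampling scheme are illustrative assumptions rather than the paper's protocol.

```python
# A minimal sketch of a replay buffer for continual learning, maintained by
# reservoir sampling so it holds a uniform subset of the stream seen so far.
import random

class ReplayBuffer:
    def __init__(self, capacity=1000):
        self.capacity, self.buffer, self.seen = capacity, [], 0

    def add(self, sample):
        self.seen += 1
        if len(self.buffer) < self.capacity:
            self.buffer.append(sample)
        else:
            j = random.randrange(self.seen)     # replace a random slot with
            if j < self.capacity:               # probability capacity/seen
                self.buffer[j] = sample

    def sample(self, n):
        return random.sample(self.buffer, min(n, len(self.buffer)))
```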
1 code implementation • CVPR 2020 • Sylvestre-Alvise Rebuffi, Ruth Fong, Xu Ji, Andrea Vedaldi
Saliency methods seek to explain the predictions of a model by producing an importance map over each input sample.
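The simplest member of this family is plain gradient saliency, sketched below; it is a generic baseline, not the method proposed in the paper.

```python
# A minimal sketch of gradient saliency: the magnitude of the gradient of
# the target logit with respect to the input pixels.
import torch

def gradient_saliency(model, x, target_class):
    x = x.clone().requires_grad_(True)        # x: (1, channels, H, W)
    score = model(x)[0, target_class]         # logit for the class of interest
    score.backward()
    return x.grad.abs().max(dim=1)[0]         # max over colour channels
```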
no code implementations • 19 Oct 2019 • Sylvestre-Alvise Rebuffi, Ruth Fong, Xu Ji, Hakan Bilen, Andrea Vedaldi
In this paper, we are instead interested in the locations of an image that contribute to the model's training.
no code implementations • 6 Nov 2018 • German I. Parisi, Xu Ji, Stefan Wermter
Lifelong learning capabilities are crucial for artificial autonomous agents operating on real-world data, which is typically non-stationary and temporally correlated.
6 code implementations • ICCV 2019 • Xu Ji, João F. Henriques, Andrea Vedaldi
The method is not specialised to computer vision and operates on any paired samples; in our experiments we use random transforms to obtain a pair from each image.
Ranked #1 on Unsupervised MNIST
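A minimal sketch consistent with the objective described above: maximize the mutual information between soft cluster assignments of the two samples in each pair (e.g. an image and its random transform).

```python
# A minimal sketch of the mutual-information clustering objective over
# paired soft cluster assignments.
import torch

def iic_loss(p1, p2, eps=1e-8):
    """p1, p2: (batch, num_clusters) softmax outputs for paired samples."""
    joint = p1.t() @ p2 / p1.size(0)          # joint assignment distribution
    joint = (joint + joint.t()) / 2           # symmetrize
    pi = joint.sum(dim=1, keepdim=True)       # marginals
    pj = joint.sum(dim=0, keepdim=True)
    mi = (joint * (torch.log(joint + eps)
                   - torch.log(pi + eps)
                   - torch.log(pj + eps))).sum()
    return -mi                                # minimize negative MI
```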