no code implementations • 12 Feb 2024 • Matthew V Macfarlane, Edan Toledo, Donal Byrne, Siddarth Singh, Paul Duckworth, Alexandre Laterre
SMX achieves a statistically significant performance improvement over AlphaZero and also proves effective as an improvement operator for a model-free policy, matching or exceeding top model-free methods across both continuous and discrete environments.
1 code implementation • NeurIPS 2023 • Felix Chalumeau, Shikha Surana, Clement Bonnet, Nathan Grinsztajn, Arnu Pretorius, Alexandre Laterre, Thomas D. Barrett
Combinatorial optimization underpins many real-world applications, and yet designing performant algorithms to solve these complex, typically NP-hard, problems remains a significant research challenge.
1 code implementation • 16 Jun 2023 • Clément Bonnet, Daniel Luo, Donal Byrne, Shikha Surana, Sasha Abramowitz, Paul Duckworth, Vincent Coyette, Laurence I. Midgley, Elshadai Tegegn, Tristan Kalloniatis, Omayma Mahjoub, Matthew Macfarlane, Andries P. Smit, Nathan Grinsztajn, Raphael Boige, Cemlyn N. Waters, Mohamed A. Mimouni, Ulrich A. Mbou Sob, Ruan de Kock, Siddarth Singh, Daniel Furelos-Blanco, Victor Le, Arnu Pretorius, Alexandre Laterre
Open-source reinforcement learning (RL) environments have played a crucial role in driving progress in the development of AI algorithms.
1 code implementation • 19 Nov 2022 • Clément Bonnet, Laurence Midgley, Alexandre Laterre
This bias arises because the critic, which is trained with the meta-learned discount factor, is also used for advantage estimation in the outer objective, which requires a different discount factor.
1 code implementation • 28 May 2022 • Christopher W. F. Parsonson, Alexandre Laterre, Thomas D. Barrett
By retrospectively deconstructing the search tree into multiple paths each contained within a sub-tree, we enable the agent to learn from shorter trajectories with more predictable next states.
1 code implementation • 27 May 2022 • Thomas D. Barrett, Christopher W. F. Parsonson, Alexandre Laterre
Compared to the nearest competitor, ECORD reduces the optimality gap by up to 73% on 500-vertex graphs with a decreased wall-clock time.
no code implementations • 30 Oct 2021 • Clément Bonnet, Paul Caron, Thomas Barrett, Ian Davies, Alexandre Laterre
Self-tuning algorithms that adapt the learning process online encourage more effective and robust learning.
no code implementations • 29 Sep 2021 • Thomas D Barrett, Christopher William Falke Parsonson, Alexandre Laterre
Compared to the nearest competitor, ECORD reduces the optimality gap by up to 73% on 500-vertex graphs with a decreased wall-clock time.
no code implementations • 1 Jan 2021 • Thomas Pierrot, Valentin Macé, Jean-Baptiste Sevestre, Louis Monier, Alexandre Laterre, Nicolas Perrin, Karim Beguir, Olivier Sigaud
Very large action spaces constitute a critical challenge for deep Reinforcement Learning (RL) algorithms.
no code implementations • 1 Jan 2021 • Arnu Pretorius, Scott Cameron, Andries Petrus Smit, Elan van Biljon, Lawrence Francis, Femi Azeez, Alexandre Laterre, Karim Beguir
Furthermore, the core utility of our imagination is deeply coupled with communication.
no code implementations • 3 Dec 2020 • Marcin J. Skwark, Nicolás López Carranza, Thomas Pierrot, Joe Phillips, Slim Said, Alexandre Laterre, Amine Kerkeni, Uğur Şahin, Karim Beguir
This suggests that combining leading protein design methods with modern deep reinforcement learning is a viable path for discovering a Covid-19 cure and may accelerate design of peptide-based therapeutics for other diseases.
no code implementations • 29 Nov 2020 • Louis Monier, Jakub Kmec, Alexandre Laterre, Thomas Pierrot, Valentin Courgeau, Olivier Sigaud, Karim Beguir
Offline Reinforcement Learning (RL) aims to turn large datasets into powerful decision-making engines without any online interactions with the environment.
1 code implementation • NeurIPS 2020 • Arnu Pretorius, Scott Cameron, Elan van Biljon, Tom Makkink, Shahil Mawjee, Jeremy du Plessis, Jonathan Shock, Alexandre Laterre, Karim Beguir
Multi-agent reinforcement learning has recently shown great promise as an approach to networked system control.
no code implementations • 27 Jul 2020 • Thomas Pierrot, Nicolas Perrin, Feryal Behbahani, Alexandre Laterre, Olivier Sigaud, Karim Beguir, Nando de Freitas
Third, the self-models are harnessed to learn recursive compositional programs with multiple levels of abstraction.
1 code implementation • NeurIPS 2019 • Thomas Pierrot, Guillaume Ligner, Scott Reed, Olivier Sigaud, Nicolas Perrin, Alexandre Laterre, David Kas, Karim Beguir, Nando de Freitas
AlphaZero contributes powerful neural network guided search algorithms, which we augment with recursion.
2 code implementations • 4 Jul 2018 • Alexandre Laterre, Yunguan Fu, Mohamed Khalil Jabri, Alain-Sam Cohen, David Kas, Karl Hajjar, Torbjorn S. Dahl, Amine Kerkeni, Karim Beguir
Results from applying the R2 algorithm to instances of two-dimensional and three-dimensional bin packing problems show that it outperforms generic Monte Carlo tree search, heuristic algorithms and integer programming solvers.