no code implementations • 23 Sep 2023 • Hector Kohler, Riad Akrour, Philippe Preux
We show in this paper that deep RL can fail even on simple toy tasks of this class.
1 code implementation • 22 Sep 2023 • Hector Kohler, Riad Akrour, Philippe Preux
Finding an optimal decision tree for a supervised learning task is a challenging combinatorial problem to solve at scale.
no code implementations • 11 Apr 2023 • Hector Kohler, Riad Akrour, Philippe Preux
A given supervised classification task is modeled as a Markov decision problem (MDP) and then augmented with additional actions that gather information about the features, equivalent to building a DT.
no code implementations • 16 Oct 2022 • Riccardo Della Vecchia, Alena Shilova, Philippe Preux, Riad Akrour
Compared to these learning frameworks, one of the major difficulties of RL is the absence of i.i.d.
no code implementations • 13 Nov 2020 • Riad Akrour, Asma Atamna, Jan Peters
We then propose an optimization algorithm that follows the gradient of the composition of the objective and the projection, and prove its convergence for linear objectives and arbitrary convex and Lipschitz domain-defining inequality constraints.
1 code implementation • 10 Jun 2020 • Riad Akrour, Davide Tateo, Jan Peters
Reinforcement learning (RL) has demonstrated its ability to solve high-dimensional tasks by leveraging non-linear function approximators.
no code implementations • 29 Jan 2020 • Samuele Tosatto, Riad Akrour, Jan Peters
The Nadaraya-Watson kernel estimator is among the most popular nonparametric regression techniques thanks to its simplicity.
no code implementations • 7 Feb 2019 • Joni Pajarinen, Hong Linh Thai, Riad Akrour, Jan Peters, Gerhard Neumann
Trust-region methods have yielded state-of-the-art results in policy search.
no code implementations • ICML 2017 • Riad Akrour, Dmitry Sorokin, Jan Peters, Gerhard Neumann
Bayesian optimization is renowned for its sample efficiency, but its application to higher-dimensional tasks is impeded by its focus on global optimization.
no code implementations • 29 Jun 2016 • Riad Akrour, Abbas Abdolmaleki, Hany Abdulsamad, Jan Peters, Gerhard Neumann
In order to show the monotonic improvement of our algorithm, we additionally conduct a theoretical analysis of our policy update scheme to derive a lower bound of the change in policy return between successive iterations.
no code implementations • 5 Aug 2012 • Riad Akrour, Marc Schoenauer, Michèle Sebag
This paper focuses on reinforcement learning (RL) with limited prior knowledge.