1 code implementation • 9 Jul 2021 • Sampo Kuutti, Saber Fallah, Richard Bowden
Trained against an ensemble of adversaries, the protagonist learns a significantly more robust control policy, which generalises to a variety of adversarial strategies.
1 code implementation • 9 Jul 2021 • Sampo Kuutti, Saber Fallah, Richard Bowden
By penalising the safe action distribution based on its similarity to the unsafe action distribution when training on the collision dataset, a more robust and safe control policy is obtained.
1 code implementation • 23 May 2021 • Marco Visca, Sampo Kuutti, Roger Powell, Yang Gao, Saber Fallah
Terrain traversability analysis plays a major role in ensuring safe robotic navigation in unstructured environments.
1 code implementation • 27 Mar 2021 • Shayan Taherian, Sampo Kuutti, Marco Visca, Saber Fallah
It is shown that a torque-vectoring controller with parameter tuning via reinforcement learning performs well across a range of driving environments, e.g., a wide range of friction conditions and different velocities, which highlights the advantages of reinforcement learning as an adaptive algorithm for parameter tuning.
1 code implementation • 17 Mar 2021 • Sampo Kuutti, Richard Bowden, Saber Fallah
We compare models with and without safety cages, as well as models with optimal and constrained model parameters, and show that the weak supervision consistently improves the safety of exploration, speed of convergence, and model performance.
1 code implementation • 27 Feb 2020 • Sampo Kuutti, Saber Fallah, Richard Bowden
As the networks used to obtain state-of-the-art results become increasingly deep and complex, the rules they have learned and how they operate become more challenging to understand.
no code implementations • 23 Dec 2019 • Sampo Kuutti, Richard Bowden, Yaochu Jin, Phil Barber, Saber Fallah
However, deep learning methods have shown great promise not only in providing excellent performance for complex and non-linear control problems, but also in generalising previously learned rules to new scenarios.