no code implementations • ICML 2020 • Adam Stooke, Joshua Achiam, Pieter Abbeel
This intuition leads to our introduction of PID control for the Lagrange multiplier in constrained RL, which we cast as a dynamical system.
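To illustrate the idea, here is a minimal sketch of a PID-style update for the Lagrange multiplier, where the constraint violation (measured cost minus a cost limit) plays the role of the controller's error signal. The gains, the cost limit, and the clipping choices below are illustrative assumptions, not the paper's exact formulation.

```python
# Minimal sketch (not the paper's exact algorithm): treat the constraint
# violation as the error signal of a PID controller and use the controller
# output as the Lagrange multiplier.
class PIDLagrangeMultiplier:
    def __init__(self, kp=0.1, ki=0.01, kd=0.1, cost_limit=25.0):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.cost_limit = cost_limit
        self.integral = 0.0
        self.prev_error = 0.0

    def update(self, episode_cost):
        error = episode_cost - self.cost_limit           # constraint violation
        self.integral = max(0.0, self.integral + error)  # keep integral term nonnegative
        derivative = max(0.0, error - self.prev_error)   # react only to rising cost (assumed)
        self.prev_error = error
        # The multiplier must stay nonnegative for a valid Lagrangian relaxation.
        return max(0.0, self.kp * error + self.ki * self.integral + self.kd * derivative)
```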
no code implementations • 25 Jul 2022 • Heidy Khlaaf, Pamela Mishkin, Joshua Achiam, Gretchen Krueger, Miles Brundage
Codex, a large language model (LLM) trained on a variety of codebases, exceeds the previous state of the art in its capacity to synthesize and generate code.
no code implementations • 8 Jul 2020 • Adam Stooke, Joshua Achiam, Pieter Abbeel
Lagrangian methods are widely used algorithms for constrained optimization problems, but their learning dynamics exhibit oscillations and overshoot which, when these methods are applied to safe reinforcement learning, lead to constraint-violating behavior during agent training.
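For contrast with the PID view above, the standard Lagrangian approach updates the multiplier by gradient ascent on the dual variable, which acts like integral-only control on the constraint violation. A minimal sketch (step size and cost limit are illustrative assumptions):

```python
# Plain dual gradient ascent on the multiplier: integral-only control on the
# constraint violation, the source of the oscillation and overshoot described above.
def dual_ascent_step(lmbda, episode_cost, cost_limit=25.0, lr=0.01):
    violation = episode_cost - cost_limit
    return max(0.0, lmbda + lr * violation)  # project back to lambda >= 0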
no code implementations • 21 Mar 2019 • Joshua Achiam, Ethan Knight, Pieter Abbeel
Deep Q-Learning (DQL), a family of temporal difference algorithms for control, employs three techniques collectively known as the "deadly triad" in reinforcement learning: bootstrapping, off-policy learning, and function approximation.
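All three ingredients are visible in a standard one-step Q-learning loss; a minimal sketch (network shapes and hyperparameters are illustrative assumptions):

```python
import torch

# Minimal sketch of a DQL-style update exhibiting all three ingredients:
# function approximation (a neural network q_net), bootstrapping (the target
# uses the network's own next-state estimate), and off-policy learning (the
# transition comes from a replay buffer, not the current policy).
def dql_loss(q_net, batch, gamma=0.99):
    s, a, r, s_next, done = batch  # sampled from a replay buffer (off-policy)
    q_sa = q_net(s).gather(1, a.unsqueeze(1)).squeeze(1)
    with torch.no_grad():
        bootstrap = q_net(s_next).max(dim=1).values  # bootstrapped target
        target = r + gamma * (1.0 - done) * bootstrap
    return torch.nn.functional.mse_loss(q_sa, target)
```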
no code implementations • 26 Jul 2018 • Joshua Achiam, Harrison Edwards, Dario Amodei, Pieter Abbeel
We explore methods for option discovery based on variational inference and make two algorithmic contributions.
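As background (not the paper's exact objective), variational option-discovery objectives commonly take the form of a lower bound on the mutual information between a sampled context $c$ and the trajectory $\tau$ it produces, using a learned decoder $q_\phi$ in place of the intractable posterior:

```latex
% Barber-Agakov-style variational lower bound on context-trajectory
% mutual information, with a learned decoder q_phi:
I(c; \tau) \;\ge\; \mathbb{E}_{c \sim p(c),\; \tau \sim \pi_\theta(\cdot \mid c)}
\big[ \log q_\phi(c \mid \tau) \big] + \mathcal{H}(c)
```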
13 code implementations • 8 Mar 2018 • Alex Nichol, Joshua Achiam, John Schulman
This paper considers meta-learning problems, where there is a distribution of tasks, and we would like to obtain an agent that performs well (i.e., learns quickly) when presented with a previously unseen task sampled from this distribution.
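A minimal sketch of one first-order approach to this setting (the inner-loop optimizer, step counts, and step sizes below are illustrative assumptions, not a definitive implementation): repeatedly sample a task, take a few gradient steps on it, and move the initialization toward the adapted parameters.

```python
import copy

# Sketch of a first-order meta-update: adapt a copy of the parameters on a
# sampled task, then interpolate the initialization toward the adapted weights.
def meta_train(params, sample_task, inner_step, n_iters=1000,
               inner_steps=5, meta_lr=0.1):
    for _ in range(n_iters):
        task = sample_task()                    # task ~ task distribution
        adapted = copy.deepcopy(params)
        for _ in range(inner_steps):
            adapted = inner_step(adapted, task) # e.g. one SGD step on the task loss
        # Move the initialization toward the task-adapted parameters.
        params = [p + meta_lr * (a - p) for p, a in zip(params, adapted)]
    return params
```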
9 code implementations • ICML 2017 • Joshua Achiam, David Held, Aviv Tamar, Pieter Abbeel
For many applications of reinforcement learning, it can be more convenient to specify both a reward function and constraints, rather than trying to design behavior through the reward function alone.
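The resulting problem is a constrained MDP: maximize expected return subject to bounds on expected costs. In standard notation (the symbols follow the usual constrained-RL convention rather than any one paper's):

```latex
% Constrained policy search: maximize return subject to cost constraints.
\max_{\pi} \; J(\pi) = \mathbb{E}_{\tau \sim \pi}\Big[ \textstyle\sum_t \gamma^t r(s_t, a_t) \Big]
\quad \text{s.t.} \quad
J_{C_i}(\pi) = \mathbb{E}_{\tau \sim \pi}\Big[ \textstyle\sum_t \gamma^t c_i(s_t, a_t) \Big] \le d_i,
\;\; i = 1, \dots, m
```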
no code implementations • 6 Mar 2017 • Joshua Achiam, Shankar Sastry
Exploration in complex domains is a key challenge in reinforcement learning, especially for tasks with very sparse rewards.
no code implementations • 29 Feb 2016 • Joshua Achiam
A key problem in reinforcement learning for control with general function approximators (such as deep neural networks and other nonlinear functions) is that, for many algorithms employed in practice, updates to the policy or $Q$-function may fail to improve performance, or worse, may actually cause the policy performance to degrade.