no code implementations • 5 Dec 2023 • Zhengyao Jiang, Yingchen Xu, Nolan Wagener, Yicheng Luo, Michael Janner, Edward Grefenstette, Tim Rocktäschel, Yuandong Tian
However, the extensive collection of human motion-capture data and the derived datasets of humanoid trajectories, such as MoCapAct, pave the way for tackling these challenges.
1 code implementation • 6 Jun 2023 • Linjie Xu, Zhengyao Jiang, Jinyu Wang, Lei Song, Jiang Bian
Offline reinforcement learning (RL) methods constrain the learned policy to stay close to the behavior policy, which stabilizes value learning and reduces the selection of out-of-distribution (OOD) actions at test time.
1 code implementation • 24 Mar 2023 • Yicheng Luo, Zhengyao Jiang, Samuel Cohen, Edward Grefenstette, Marc Peter Deisenroth
In this paper, we introduce Optimal Transport Reward labeling (OTR), an algorithm that assigns rewards to offline trajectories using only a few high-quality demonstrations.
1 code implementation • 22 Aug 2022 • Zhengyao Jiang, Tianjun Zhang, Michael Janner, Yueying Li, Tim Rocktäschel, Edward Grefenstette, Yuandong Tian
Planning-based reinforcement learning has shown strong performance on tasks with discrete or low-dimensional continuous action spaces.
1 code implementation • 31 May 2022 • Zhengyao Jiang, Tianjun Zhang, Robert Kirk, Tim Rocktäschel, Edward Grefenstette
In this paper, we treat the transition data of the MDP as a graph, and define a novel backup operator, Graph Backup, which exploits this graph structure for better value estimation.
1 code implementation • 8 Feb 2021 • Zhengyao Jiang, Pasquale Minervini, Minqi Jiang, Tim Rocktäschel
In this work, we show that we can incorporate relational inductive biases, encoded in the form of relational graphs, into agents.
1 code implementation • 24 Apr 2019 • Zhengyao Jiang, Shan Luo
Deep reinforcement learning (DRL) has achieved significant breakthroughs in various tasks.
27 code implementations • 30 Jun 2017 • Zhengyao Jiang, Dixing Xu, Jinjun Liang
They are, along with a number of recently reviewed or published portfolio-selection strategies, examined in three back-test experiments with a trading period of 30 minutes in a cryptocurrency market.
3 code implementations • 5 Dec 2016 • Zhengyao Jiang, Jinjun Liang
Portfolio management is the decision-making process of allocating funds across different financial investment products.