no code implementations • 15 Apr 2021 • Dennis Lee, Natasha Jaques, Chase Kew, Jiaxing Wu, Douglas Eck, Dale Schuurmans, Aleksandra Faust
We then train agents to minimize the difference between the attention weights that they apply to the environment at each timestep, and the attention of other agents.