no code implementations • 18 May 2021 • Andy Su, Difei Su, John M. Mulvey, H. Vincent Poor
We propose PoBRL, a novel reinforcement-learning-based framework for multi-document summarization.
1 code implementation • ICML 2020 • Andy Su, Jayden Ooi, Tyler Lu, Dale Schuurmans, Craig Boutilier
Delusional bias is a fundamental source of error in approximate Q-learning.