no code implementations • 27 May 2024 • Chinmaya Kausik, Kevin Tan, Ambuj Tewari
Offline latent bandit data can be used to learn a complex model for each latent state, so that an online agent need only infer the latent state to act optimally.
no code implementations • 5 Feb 2024 • Chinmaya Kausik, Mirco Mutti, Aldo Pacchiano, Ambuj Tewari
Both of these can be instrumental in speeding up learning and improving alignment.
no code implementations • 26 May 2023 • Chinmaya Kausik, Kashvi Srivastava, Rishi Sonthalia
Motivated by this, we study supervised denoising and noisy-input regression under distribution shift.
no code implementations • 29 Nov 2022 • Chinmaya Kausik, Yangyi Lu, Kevin Tan, Maggie Makar, Yixin Wang, Ambuj Tewari
Evaluating and optimizing policies in the presence of unobserved confounders is a problem of growing interest in offline reinforcement learning.
1 code implementation • 17 Nov 2022 • Chinmaya Kausik, Kevin Tan, Ambuj Tewari
We present an algorithm for learning mixtures of Markov chains and Markov decision processes (MDPs) from short unlabeled trajectories.