Search Results for author: Danil Provodin

Found 5 papers, 3 papers with code

Provably Efficient Exploration in Constrained Reinforcement Learning: Posterior Sampling Is All You Need

no code implementations • 27 Sep 2023 • Danil Provodin, Pratik Gajane, Mykola Pechenizkiy, Maurits Kaptein

We present a new algorithm based on posterior sampling for learning in constrained Markov decision processes (CMDP) in the infinite-horizon undiscounted setting.

Efficient Exploration

Bandits for Sponsored Search Auctions under Unknown Valuation Model: Case Study in E-Commerce Advertising

no code implementations • 31 Mar 2023 • Danil Provodin, Jérémie Joudioux, Eduard Duryev

This formulation assumes that the bidder's value is unknown, evolving arbitrarily, and observed only upon winning an auction.

The Impact of Batch Learning in Stochastic Linear Bandits

1 code implementation • 14 Feb 2022 • Danil Provodin, Pratik Gajane, Mykola Pechenizkiy, Maurits Kaptein

Our main theoretical results show that the impact of batch learning is a multiplicative factor of batch size relative to the regret of online behavior.
