no code implementations • 28 Dec 2023 • Taylan Kargin, Joudi Hajar, Vikrant Malik, Babak Hassibi
Our objective is to identify a control policy that minimizes the worst-case expected regret over an infinite horizon, considering all potential disturbance distributions within the ambiguity set.
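The objective described above can be written schematically as a min-max problem. The following is only a sketch: the symbols $\pi$ (control policy), $\mathcal{P}$ (ambiguity set), $w$ (disturbance sequence), and $R_T$ (regret over the first $T$ steps) are notation assumed here for illustration, and the time-averaged limit is one common way to pose an infinite-horizon cost, not necessarily the authors' exact formulation.

```latex
\inf_{\pi} \; \sup_{\mathbb{P} \in \mathcal{P}} \;
\limsup_{T \to \infty} \frac{1}{T} \,
\mathbb{E}_{w \sim \mathbb{P}} \bigl[ R_T(\pi, w) \bigr]
```

The inner supremum selects the least favorable disturbance distribution in the ambiguity set, and the outer infimum selects the policy that performs best against it.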
no code implementations • 27 Oct 2022 • Taylan Kargin, Fariborz Salehi, Babak Hassibi
The stochastic mirror descent (SMD) algorithm is a general class of training algorithms that includes the celebrated stochastic gradient descent (SGD) as a special case.
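To illustrate how SGD arises as a special case of SMD, here is a minimal sketch in plain Python. It assumes the potential $\psi(w) = \frac{1}{q}\lVert w \rVert_q^q$ with $q > 1$, which is one common choice and not necessarily the one studied in the paper. The function names (`mirror_map`, `smd_step`, and so on) and the toy data are hypothetical. With $q = 2$ the mirror map is the identity, so the update should coincide with a plain SGD step.

```python
import math

def mirror_map(w, q):
    # Gradient of psi(w) = (1/q) * ||w||_q^q, applied coordinate-wise.
    return [math.copysign(abs(x) ** (q - 1), x) for x in w]

def inverse_mirror_map(z, q):
    # Inverse of mirror_map, valid for q > 1.
    return [math.copysign(abs(v) ** (1.0 / (q - 1)), v) for v in z]

def smd_step(w, grad, lr, q=2.0):
    # Take the gradient step in the mirror (dual) domain, then map back.
    z = mirror_map(w, q)
    z = [zi - lr * gi for zi, gi in zip(z, grad)]
    return inverse_mirror_map(z, q)

def sgd_step(w, grad, lr):
    # Ordinary stochastic gradient descent step.
    return [wi - lr * gi for wi, gi in zip(w, grad)]

def squared_loss_grad(w, x, y):
    # Gradient of 0.5 * (x^T w - y)^2 for a single sample (x, y).
    residual = sum(wi * xi for wi, xi in zip(w, x)) - y
    return [residual * xi for xi in x]

# Toy single-sample example (values are illustrative assumptions).
w = [0.5, -1.0, 2.0]
x, y = [1.0, 2.0, -1.0], 0.3
grad = squared_loss_grad(w, x, y)

w_smd = smd_step(w, grad, lr=0.1, q=2.0)  # q = 2: reduces to SGD
w_sgd = sgd_step(w, grad, lr=0.1)
w_q3 = smd_step(w, grad, lr=0.1, q=3.0)   # q = 3: a different update
```

Other choices of `q` change the geometry of the update while reusing the same stochastic gradient, which is why SMD is described as a class of algorithms rather than a single method.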
no code implementations • 17 Jun 2022 • Taylan Kargin, Sahin Lale, Kamyar Azizzadenesheli, Anima Anandkumar, Babak Hassibi
By carefully prescribing an early exploration strategy and a policy update rule, we show that Thompson Sampling (TS) achieves order-optimal regret in adaptive control of multidimensional stabilizable linear quadratic regulators (LQRs).