no code implementations • 21 Jan 2024 • Yihong Guo, Hao Liu, Yisong Yue, Anqi Liu
Central to our methodology is the application of robust regression, a distributionally robust technique tailored here to improve the estimation of the conditional reward distribution from logging data.