no code implementations • 6 Feb 2024 • Mengfan Xu, Diego Klabjan
We study a robust multi-agent multi-armed bandit problem where multiple clients or participants are distributed on a fully decentralized blockchain, with the possibility of some being malicious.
no code implementations • 15 Aug 2023 • Mengfan Xu, Diego Klabjan
The multi-armed bandit problem motivates methods with provable upper bounds on regret, and the corresponding lower bounds have also been extensively studied in this context.
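As a concrete illustration of a bandit method with a provable regret upper bound, here is a minimal sketch of classical UCB1 (not the method of this paper); the Bernoulli arm means, horizon, and seed are illustrative assumptions.

```python
import math
import random

def ucb1(means, horizon, seed=0):
    """Minimal UCB1 sketch: pull the arm maximizing the empirical mean
    plus an exploration bonus; its regret grows only logarithmically
    in the horizon. `means` are Bernoulli arm parameters."""
    rng = random.Random(seed)
    k = len(means)
    counts = [0] * k
    sums = [0.0] * k
    total_reward = 0.0
    for t in range(1, horizon + 1):
        if t <= k:
            arm = t - 1  # play each arm once first
        else:
            arm = max(range(k), key=lambda a: sums[a] / counts[a]
                      + math.sqrt(2 * math.log(t) / counts[a]))
        reward = 1.0 if rng.random() < means[arm] else 0.0  # Bernoulli draw
        counts[arm] += 1
        sums[arm] += reward
        total_reward += reward
    # realized regret against always playing the best arm
    return horizon * max(means) - total_reward
```

The regret returned here stays far below the linear regret of a uniformly random policy, consistent with the logarithmic upper bound.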
no code implementations • 1 Dec 2022 • Mengfan Xu, Diego Klabjan
We study Pareto optimality in the multi-objective multi-armed bandit problem by formulating its adversarial variant and defining Pareto regrets that apply to both stochastic and adversarial settings.
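In the multi-objective setting each arm yields a vector of rewards, and an arm is Pareto optimal if no other arm weakly improves on it in every objective with a strict improvement in at least one. A minimal sketch of that dominance check (the function name and inputs are hypothetical, not from the paper):

```python
def pareto_front(means):
    """Return indices of arms whose mean reward vectors are not
    dominated by any other arm's vector."""
    def dominates(u, v):
        # u dominates v: u >= v in every objective, u > v in at least one
        return (all(a >= b for a, b in zip(u, v))
                and any(a > b for a, b in zip(u, v)))
    return [i for i, v in enumerate(means)
            if not any(dominates(u, v)
                       for j, u in enumerate(means) if j != i)]
```

Arms on this front are the ones a Pareto-regret notion treats as optimal; all others incur positive regret.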
no code implementations • 21 Mar 2022 • Shu Wan, Chen Zheng, Zhonggen Sun, Mengfan Xu, Xiaoqing Yang, Hongtu Zhu, Jiecheng Guo
We show the effectiveness of GCF by deriving the asymptotic property of the estimator and comparing it to popular uplift modeling methods on both synthetic and real-world datasets.
no code implementations • 29 Sep 2021 • Shu Wan, Chen Zheng, Zhonggen Sun, Mengfan Xu, Xiaoqing Yang, Jiecheng Guo, Hongtu Zhu
Heterogeneous treatment effect (HTE) estimation with continuous treatment is essential in multiple disciplines, such as the online marketplace and pharmaceutical industry.
no code implementations • 20 Sep 2020 • Mengfan Xu, Diego Klabjan
In EXP4-RL, we extend EXP4.P from the bandit setting to reinforcement learning to incentivize exploration by multiple agents, including one high-performing agent, for both efficiency and excellence.
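For context, EXP4.P builds on EXP4, which maintains exponential weights over experts, mixes their advice into a sampling distribution over arms, and updates weights with importance-weighted reward estimates (EXP4.P additionally adds a confidence term for high-probability bounds). A minimal sketch of plain EXP4, not of EXP4-RL itself; the input shapes and parameters are illustrative assumptions:

```python
import math
import random

def exp4(advice, rewards, gamma=0.1, seed=0):
    """Minimal EXP4 sketch. `advice[t][e]` is expert e's probability
    vector over arms at round t; `rewards[t]` holds each arm's (hidden)
    reward. Returns the total reward collected."""
    rng = random.Random(seed)
    n_experts = len(advice[0])
    k = len(rewards[0])
    w = [1.0] * n_experts  # exponential weights over experts
    total = 0.0
    for t in range(len(rewards)):
        ws = sum(w)
        # mix expert advice, with a uniform exploration floor gamma/k
        p = [(1 - gamma) * sum(w[e] * advice[t][e][a]
                               for e in range(n_experts)) / ws + gamma / k
             for a in range(k)]
        arm = rng.choices(range(k), weights=p)[0]
        r = rewards[t][arm]
        total += r
        xhat = r / p[arm]  # importance-weighted reward estimate
        for e in range(n_experts):
            w[e] *= math.exp(gamma * advice[t][e][arm] * xhat / k)
    return total
```

With one expert that always recommends the rewarding arm and one that never does, the weights concentrate on the good expert and the collected reward approaches the optimum.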