no code implementations • 12 Jun 2021 • MohammadJavad Azizi, Sheldon M Ross, Zhengyu Zhang
We propose to use the classical "vector at a time" (VT) rule, which samples each remaining arm once in each round.
Multi-Armed Bandits
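As a rough illustration of the "vector at a time" sampling pattern mentioned above, the following Python sketch pulls every remaining arm once per round. The excerpt does not say how arms are removed, so the elimination rule is passed in as a hypothetical callback (`should_drop`). The function name, parameters, and the gap-based rule in the usage note are assumptions made for this example, not the authors' procedure.

```python
def vector_at_a_time(sample, n_arms, should_drop, max_rounds=10_000):
    """Sample each remaining arm once per round (VT-style sampling).

    sample(arm) returns one reward for that arm.
    should_drop(total, best_total, round_no) is a hypothetical elimination
    rule supplied by the caller; the source excerpt does not specify one.
    Returns (remaining_arms, rounds_used).
    """
    remaining = list(range(n_arms))
    totals = [0.0] * n_arms
    round_no = 0
    for round_no in range(1, max_rounds + 1):
        # One "vector" of observations: a single pull of every remaining arm.
        for arm in remaining:
            totals[arm] += sample(arm)
        # Apply the caller's (assumed) rule relative to the current leader.
        best = max(totals[a] for a in remaining)
        remaining = [a for a in remaining
                     if not should_drop(totals[a], best, round_no)]
        if len(remaining) == 1:
            break
    return remaining, round_no
```

For example, with a rule that drops an arm once its cumulative reward trails the leader by 3 or more, `vector_at_a_time(lambda arm: 1.0 if arm == 2 else 0.0, 3, lambda total, best, r: total <= best - 3)` keeps only arm 2. A rule of this kind never drops the current leader, so at least one arm always remains.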