Search Results for author: Jie Bian

Found 2 papers, 0 papers with code

Indexed Minimum Empirical Divergence-Based Algorithms for Linear Bandits

no code implementations • 24 May 2024 • Jie Bian, Vincent Y. F. Tan

The Indexed Minimum Empirical Divergence (IMED) algorithm is a highly effective approach that offers a stronger theoretical guarantee of asymptotic optimality than the Kullback-Leibler Upper Confidence Bound (KL-UCB) algorithm for the multi-armed bandit problem.

Multi-Armed Bandits, Thompson Sampling

Maillard Sampling: Boltzmann Exploration Done Optimally

no code implementations • 5 Nov 2021 • Jie Bian, Kwang-Sung Jun

This lesser-known algorithm, which we call Maillard sampling (MS), computes the probability of choosing each arm in closed form, which is not true of Thompson sampling, a bandit algorithm widely adopted in industry.

Counterfactual, Thompson Sampling
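
To illustrate what a closed-form arm-selection probability can look like, here is a minimal sketch. It assumes sub-Gaussian rewards with a known scale sigma and weights each arm by exp(-N_a * gap_a^2 / (2 * sigma^2)), where N_a is the pull count and gap_a is the empirical gap to the best empirical mean. The function name maillard_probs and its arguments are hypothetical, chosen for this example, and are not taken from the listing or from any released code.

```python
import numpy as np

def maillard_probs(means, counts, sigma=1.0):
    """Closed-form arm probabilities (illustrative sketch, assumed form).

    means  : empirical mean reward of each arm
    counts : number of times each arm has been pulled (all >= 1)
    sigma  : assumed sub-Gaussian noise scale
    """
    means = np.asarray(means, dtype=float)
    counts = np.asarray(counts, dtype=float)
    # Empirical gap of each arm to the best empirical mean (0 for the leader).
    gaps = means.max() - means
    # Unnormalized weight: arms with large, well-sampled gaps get little mass.
    weights = np.exp(-counts * gaps**2 / (2.0 * sigma**2))
    # Normalize so the weights form a probability distribution over arms.
    return weights / weights.sum()

probs = maillard_probs(means=[1.0, 0.0], counts=[2, 2])
# Draw the next arm directly from the explicit distribution.
rng = np.random.default_rng(0)
arm = rng.choice(len(probs), p=probs)
```

Because the probabilities are available explicitly, they can be logged alongside each decision, which is what makes this style of sampling convenient for counterfactual (off-policy) evaluation.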
