Thompson Sampling on Symmetric Alpha-Stable Bandits

Authors: Abhimanyu Dubey, Alex "Sandy" Pentland

IJCAI 2019

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We prove finite-time regret bounds for both algorithms, and demonstrate through a series of experiments the stronger performance of Thompson Sampling in this setting."
Researcher Affiliation | Academia | "Abhimanyu Dubey and Alex 'Sandy' Pentland, Massachusetts Institute of Technology, {dubeya, pentland}@mit.edu"
Pseudocode | Yes | Algorithm 1 (Chambers-Mallows-Stuck Generation), Algorithm 2 (α-Thompson Sampling), Algorithm 3 (Robust α-Thompson Sampling)
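The Chambers-Mallows-Stuck transform listed as Algorithm 1 is a standard method for drawing symmetric α-stable samples, and its symmetric (β = 0) case can be sketched as follows. The function name and interface below are illustrative, not taken from the paper:

```python
import math
import random

def sample_sas(alpha, scale=1.0, loc=0.0, rng=random):
    """Draw one symmetric alpha-stable sample via the
    Chambers-Mallows-Stuck transform (symmetric case, beta = 0)."""
    # V ~ Uniform(-pi/2, pi/2), W ~ Exponential(1)
    v = rng.uniform(-math.pi / 2, math.pi / 2)
    w = rng.expovariate(1.0)
    if alpha == 1.0:
        # alpha = 1 with beta = 0 reduces to the Cauchy distribution
        x = math.tan(v)
    else:
        x = (math.sin(alpha * v) / math.cos(v) ** (1.0 / alpha)) * \
            (math.cos((1.0 - alpha) * v) / w) ** ((1.0 - alpha) / alpha)
    return loc + scale * x
```

Because the distribution is symmetric, the location parameter `loc` is also the median, which gives a quick sanity check on the sampler even though the mean is undefined for α ≤ 1.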
Open Source Code | No | No link to an open-source implementation is provided.
Open Datasets | No | The paper conducts simulations and generates data for its experiments, rather than using a publicly available dataset with a specific link or citation.
Dataset Splits | No | The paper conducts simulations over a fixed number of iterations (T = 5000 or T = 15K) and evaluates regret over time, but it does not specify explicit training, validation, or test dataset splits.
Hardware Specification | No | The paper does not specify the hardware used to run its experiments.
Software Dependencies | No | The paper does not provide specific software names with version numbers, such as programming languages, libraries, or frameworks used for implementation.
Experiment Setup | Yes | "We run 100 MAB experiments each for all 5 benchmarks for α = 1.8 and α = 1.3, and K = 50 arms, and for each arm, the mean is drawn from [0, 2000] randomly for each experiment, and σ = 2500. Each experiment is run for T = 5000 iterations, and we report the regret averaged over time, i.e. R(t)/t at any time t."
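The reported protocol (K = 50 arms, arm means drawn uniformly from [0, 2000], σ = 2500, T = 5000 rounds, time-averaged regret R(t)/t) can be sketched as a simulation harness. The sketch below substitutes a generic Gaussian-posterior Thompson Sampler and Gaussian reward noise for the paper's α-TS and α-stable rewards; the prior parameters and function name are assumptions for illustration, not the authors' setup:

```python
import math
import random

def run_experiment(K=50, T=5000, sigma=2500.0, seed=0):
    """Sketch of the reported MAB protocol: K arms with means drawn
    uniformly from [0, 2000]; returns the time-averaged regret R(t)/t
    for t = 1..T. A generic Gaussian-posterior Thompson Sampler stands
    in for the paper's alpha-TS (an assumption, not their algorithm)."""
    rng = random.Random(seed)
    means = [rng.uniform(0.0, 2000.0) for _ in range(K)]
    best = max(means)
    counts = [0] * K
    sums = [0.0] * K
    regret = 0.0
    avg_regret = []
    for t in range(1, T + 1):
        # Thompson step: draw one posterior sample per arm
        # (assumed prior mean 1000; posterior std shrinks with pulls)
        samples = []
        for k in range(K):
            n = counts[k]
            post_mean = (sums[k] + 1000.0) / (n + 1)
            post_std = sigma / math.sqrt(n + 1)
            samples.append(rng.gauss(post_mean, post_std))
        k = max(range(K), key=lambda i: samples[i])
        # Gaussian noise here; the paper uses alpha-stable rewards
        reward = means[k] + sigma * rng.gauss(0.0, 1.0)
        counts[k] += 1
        sums[k] += reward
        regret += best - means[k]
        avg_regret.append(regret / t)
    return avg_regret
```

Averaging this curve over 100 independent seeds would mirror the paper's reporting of R(t)/t across repeated experiments.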