Anytime optimal algorithms in stochastic multi-armed bandits

Authors: Rémy Degenne, Vianney Perchet

ICML 2016

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | The performances of this algorithm (as well as another one motivated by the conjectured optimal bound) are evaluated empirically. We show experimentally that the new algorithm presents a clear improvement upon MOSS used with a doubling trick. This section is dedicated to the experimental comparison of the algorithms introduced, to which we add a new algorithm for which we do not provide theoretical analysis but that seems promising in the experiments.
Researcher Affiliation | Academia | Rémy Degenne (REMY.DEGENNE@MATH.UNIV-PARIS-DIDEROT.FR), LPMA, Université Paris Diderot; Vianney Perchet (VIANNEY.PERCHET@NORMALESUP.ORG), CREST, ENSAE
Pseudocode | Yes | Algorithm 1 MOSS-anytime. 1: Input: α > 0. 2: Pull each arm once. 3: For 1 ≤ k ≤ K, set s_k = 1. 4: for t ≥ 1 do 5: Pull arm k that maximizes 6: X^(k)_{s_k} + max(0, log(t / (K s_k))) 7: Update the number of pulls: s_k ← s_k + 1. 8: end for
Open Source Code | No | The paper does not provide any explicit statements or links regarding the availability of open-source code for the described methodology.
Open Datasets | No | The paper states it uses 'synthetic data' and specifies the reward variables are 'Gaussian with variance σ² = 1/2'. However, it does not provide any specific access information (link, DOI, repository, or formal citation) for a publicly available dataset.
Dataset Splits | No | The paper describes experiments being 'averaged over 100 runs' or 'averaged over 800 runs' on synthetic data, but it does not specify explicit train/validation/test dataset splits with percentages, sample counts, or references to predefined splits.
Hardware Specification | No | The paper does not provide any specific hardware details such as GPU/CPU models, memory, or detailed computer specifications used for running the experiments.
Software Dependencies | No | The paper does not provide specific software dependencies with version numbers (e.g., Python version, specific library versions, or solver versions).
Experiment Setup | Yes | In both experiments, α was taken equal to 0.1.
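
The loop of Algorithm 1 (MOSS-anytime) and the synthetic setup quoted above (Gaussian rewards with variance 1/2, α = 0.1) can be sketched in Python as follows. This is a minimal illustration, not the authors' code, which is not released. The function names are hypothetical. Three points are assumptions rather than facts from the page: X^(k)_{s_k} is read as the empirical mean of arm k after s_k pulls; the exploration bonus is given a MOSS-style shape, sqrt((1 + α)/2 · max(0, log(t / (K s_k))) / s_k), because only the max(0, log(...)) term is legible in the extracted pseudocode; and t counts every pull, including the K initial ones.

```python
import math
import random


def moss_anytime_index(mean, pulls, t, n_arms, alpha):
    """Index of one arm at round t (assumed MOSS-style form, see the text above)."""
    # Only max(0, log(t / (K * s_k))) is legible in the extracted pseudocode.
    # The square root, the division by s_k and the (1 + alpha) / 2 scale are
    # ASSUMPTIONS made for this sketch.
    log_term = max(0.0, math.log(t / (n_arms * pulls)))
    return mean + math.sqrt((1.0 + alpha) / 2.0 * log_term / pulls)


def run_moss_anytime(arm_means, horizon, alpha=0.1, sigma2=0.5, seed=0):
    """Simulate one run on Gaussian arms; return (pull counts, pseudo-regret)."""
    rng = random.Random(seed)
    n_arms = len(arm_means)
    sd = math.sqrt(sigma2)
    best = max(arm_means)

    # Steps 2-3: pull each arm once, so s_k = 1 for every arm.
    sums = [rng.gauss(mu, sd) for mu in arm_means]
    pulls = [1] * n_arms
    regret = sum(best - mu for mu in arm_means)

    # Steps 4-8: no horizon enters the index, so the loop can stop at any time.
    # ASSUMPTION: t counts all pulls, including the n_arms initial ones.
    for t in range(n_arms + 1, horizon + 1):
        k = max(
            range(n_arms),
            key=lambda i: moss_anytime_index(
                sums[i] / pulls[i], pulls[i], t, n_arms, alpha
            ),
        )
        sums[k] += rng.gauss(arm_means[k], sd)
        pulls[k] += 1  # step 7: s_k <- s_k + 1
        regret += best - arm_means[k]
    return pulls, regret


if __name__ == "__main__":
    pulls, regret = run_moss_anytime([0.0, 1.0], horizon=500)
    print(pulls, regret)
```

To mirror the averaging described in the Dataset Splits row, one could call `run_moss_anytime` with different `seed` values (for example 100 of them) and average the returned pseudo-regret; the arm means used here are placeholders, since the page does not list the ones used in the paper.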