Anytime optimal algorithms in stochastic multi-armed bandits
Authors: Rémy Degenne, Vianney Perchet
ICML 2016
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | The performances of this algorithm (as well as another one motivated by the conjectured optimal bound) are evaluated empirically. We show experimentally that the new algorithm presents a clear improvement upon MOSS used with a doubling trick. This section is dedicated to the experimental comparison of the algorithms introduced, to which we add a new algorithm for which we do not provide theoretical analysis but that seems promising in the experiments. |
| Researcher Affiliation | Academia | Rémy Degenne REMY.DEGENNE@MATH.UNIV-PARIS-DIDEROT.FR LPMA, Université Paris Diderot; Vianney Perchet VIANNEY.PERCHET@NORMALESUP.ORG CREST, ENSAE |
| Pseudocode | Yes | Algorithm 1 MOSS-anytime. 1: Input: α > 0. 2: Pull each arm once. 3: For 1 ≤ k ≤ K, set s_k = 1. 4: for t ≥ 1 do 5: Pull arm k that maximizes 6: X^(k)_{s_k} + max(0, log(t / (K s_k))) 7: Update the number of pulls: s_k ← s_k + 1. 8: end for |
| Open Source Code | No | The paper does not provide any explicit statements or links regarding the availability of open-source code for the described methodology. |
| Open Datasets | No | The paper states it uses 'synthetic data' and specifies the reward variables are 'Gaussian with variance σ² = 1/2'. However, it does not provide any specific access information (link, DOI, repository, or formal citation) for a publicly available dataset. |
| Dataset Splits | No | The paper describes experiments being 'averaged over 100 runs' or 'averaged over 800 runs' on synthetic data, but it does not specify explicit train/validation/test dataset splits with percentages, sample counts, or references to predefined splits. |
| Hardware Specification | No | The paper does not provide any specific hardware details such as GPU/CPU models, memory, or detailed computer specifications used for running the experiments. |
| Software Dependencies | No | The paper does not provide specific software dependencies with version numbers (e.g., Python version, specific library versions, or solver versions). |
| Experiment Setup | Yes | In both experiments, α was taken equal to 0.1. |
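
The Pseudocode and Experiment Setup rows can be illustrated with a minimal sketch, assuming Gaussian rewards with variance σ² = 1/2 and α = 0.1 as reported in the table. The exact exploration bonus is an assumption: the square-root scaling `sqrt(σ²(1+α) · max(0, log(t/(K·s_k))) / s_k)` is a MOSS-style choice that the extracted pseudocode does not confirm, and the function name `moss_anytime`, its parameters, and the arm means are hypothetical. This is not the authors' implementation.

```python
import math
import random


def moss_anytime(means, horizon, alpha=0.1, sigma2=0.5, seed=0):
    """Run a MOSS-anytime-style policy on Gaussian arms.

    Returns the per-arm pull counts and the cumulative pseudo-regret.
    """
    rng = random.Random(seed)
    K = len(means)
    sigma = math.sqrt(sigma2)

    # Steps 2-3 of the pseudocode: pull each arm once, so s_k = 1 for all k.
    sums = [rng.gauss(mu, sigma) for mu in means]
    pulls = [1] * K

    best = max(means)
    regret = sum(best - mu for mu in means)

    for t in range(K + 1, horizon + 1):

        def index(k):
            # ASSUMPTION: MOSS-style bonus; the exact scaling is not
            # recoverable from the table and is chosen for illustration.
            log_term = max(0.0, math.log(t / (K * pulls[k])))
            bonus = math.sqrt(sigma2 * (1 + alpha) * log_term / pulls[k])
            return sums[k] / pulls[k] + bonus

        # Step 5: pull the arm with the largest index.
        k = max(range(K), key=index)
        sums[k] += rng.gauss(means[k], sigma)

        # Step 7: update the number of pulls.
        pulls[k] += 1
        regret += best - means[k]

    return pulls, regret
```

A call such as `moss_anytime([0.0, 1.0], horizon=2000)` returns the pull counts and the pseudo-regret for a hypothetical two-armed instance. Note that no horizon enters the index itself, only the current round `t`, which is what makes the policy anytime; `horizon` here only decides when the simulation stops.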