Fast Asymptotically Optimal Algorithms for Non-Parametric Stochastic Bandits

Authors: Dorian Baudry, Fabien Pesquerel, Rémy Degenne, Odalric-Ambrym Maillard

NeurIPS 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Finally, in Section 4 we perform numerical simulations that confirm the benefits of our novel algorithms in terms of computation time, and show their strong empirical performance.
Researcher Affiliation | Academia | Dorian Baudry (École Polytechnique, CREST, Palaiseau, France, dorian.baudry@ensae.fr); Fabien Pesquerel (Univ. Lille, Inria, CNRS, Centrale Lille, UMR 9189 - CRIStAL, F-59000 Lille, France, fabien.pesquerel@inria.fr); Rémy Degenne (Univ. Lille, Inria, CNRS, Centrale Lille, UMR 9189 - CRIStAL, F-59000 Lille, France, remy.degenne@inria.fr); Odalric-Ambrym Maillard (Univ. Lille, Inria, CNRS, Centrale Lille, UMR 9189 - CRIStAL, F-59000 Lille, France, odalric.maillard@inria.fr)
Pseudocode | Yes | We provide a condensed implementation of OIMED in Algorithm 1, and the detailed implementation of OMED in Appendix A.2 (Algorithm 6).
Open Source Code | Yes | Our code is available in the supplementary material of the paper.
Open Datasets | Yes | The dataset is available in the supplementary material of the paper.
Dataset Splits | No | The paper discusses experimental evaluations but does not explicitly provide details on training, validation, or test dataset splits, percentages, or cross-validation setups.
Hardware Specification | No | The paper mentions a 'Python implementation' when reporting run times but does not specify the hardware (e.g., CPU or GPU models, memory) used for the experiments.
Software Dependencies | No | The paper mentions a 'Python implementation' and refers to 'Soft-Bayes' as a portfolio selection algorithm, but it does not provide specific version numbers for Python, Soft-Bayes, or any other software libraries.
Experiment Setup | Yes | We illustrate the stability of OIMED on three bandit settings: the DSSAT bandit problem and the Bernoulli problem that were introduced in the main Section 4, and a Beta bandit problem where all the means are centered around 0.5 and the same as in the Bernoulli bandit. ... We will replace this original η by η_n = r q / (4n), where r will range from 0.01 to 100.
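
For readers who want to reproduce the stability study described in the last row, the sketch below sets up matched-mean Bernoulli and Beta bandit instances and sweeps the scaling factor r of the learning-rate schedule. It is a minimal illustration, not the paper's code: the arm means, the Beta concentration parameter, the placeholder constant q = 1.0, and the helper names (bernoulli_arm, beta_arm, eta_schedule) are assumptions; only the Bernoulli/Beta matched-means construction, the schedule η_n = r q / (4n), and the r range 0.01 to 100 come from the quoted setup. The OIMED algorithm itself is not reproduced here.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative arm means (assumption: the paper's exact means are not listed in the quote).
means = np.array([0.45, 0.50, 0.55, 0.60])

def bernoulli_arm(mu):
    """Bernoulli arm with mean mu."""
    return lambda: rng.binomial(1, mu)

def beta_arm(mu, concentration=4.0):
    """Beta arm with mean mu: Beta(c*mu, c*(1-mu)) has mean mu for any c > 0."""
    return lambda: rng.beta(concentration * mu, concentration * (1.0 - mu))

# Two of the three settings: a Bernoulli bandit and a Beta bandit with the same means.
bernoulli_bandit = [bernoulli_arm(mu) for mu in means]
beta_bandit = [beta_arm(mu) for mu in means]

# Learning-rate schedule eta_n = r * q / (4 n), as quoted; q is a problem constant
# defined in the paper, set to a placeholder value of 1.0 here.
def eta_schedule(n, r, q=1.0):
    return r * q / (4.0 * n)

# Sweep the scaling factor r over the range reported in the stability study.
r_grid = np.logspace(np.log10(0.01), np.log10(100.0), num=9)
for r in r_grid:
    etas = [eta_schedule(n, r) for n in range(1, 6)]
    print(f"r={r:8.3f}  first eta values: {np.round(etas, 4)}")
```

In this sketch the Beta arms are parameterized so that their means coincide with the Bernoulli arms, which mirrors the quoted requirement that the Beta problem use the same means as the Bernoulli problem; each r value in the grid would then be run through the full bandit algorithm to assess sensitivity to the learning rate.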