Minimal Exploration in Structured Stochastic Bandits

Authors: Richard Combes, Stefan Magureanu, Alexandre Proutiere

NeurIPS 2017

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	We illustrate the efficiency of OSSB using numerical experiments in the case of the linear bandit problem and show that OSSB outperforms existing algorithms, including Thompson sampling.
Researcher Affiliation	Academia	Richard Combes (Centrale-Supelec / L2S, richard.combes@supelec.fr); Stefan Magureanu (KTH, EE School / ACL, magur@kth.se); Alexandre Proutiere (KTH, EE School / ACL, alepro@kth.se)
Pseudocode	Yes	Algorithm 1 OSSB(ε,γ)
Open Source Code	No	The paper does not include any explicit statement or link indicating that the source code for the described methodology is publicly available.
Open Datasets	No	The paper describes a synthetic experimental setup where parameters were generated uniformly at random, rather than using a pre-existing, publicly available dataset with concrete access information (e.g., URL, DOI, specific citation to an established benchmark).
Dataset Splits	No	The paper describes numerical experiments using synthetically generated parameters and mentions averaging over multiple trials, but it does not specify explicit training, validation, or test dataset splits, or cross-validation methods.
Hardware Specification	No	The paper does not provide any specific hardware details such as CPU/GPU models, memory, or cloud computing specifications used for running the experiments.
Software Dependencies	No	The paper mentions baselines (e.g., Thompson Sampling, GLM-UCB) but does not provide specific version numbers for any software, libraries, or dependencies used in the experiments.
Experiment Setup	Yes	In our implementation of OSSB, we use γ = ε = 0 since γ is typically chosen 0 in the literature (see [18]) and the performance of the algorithm does not appear sensitive to the choice of ε. As baselines we select the extension of Thompson Sampling presented in [4] (using v_t = R √(0.5 d ln(t/δ)), we chose δ = 0.1, R = 1), GLM-UCB (using ρ(t) = √(0.5 ln(t))), an extension of UCB [16] and the algorithm presented in [31].
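
The two baseline tuning formulas quoted in the Experiment Setup row can be evaluated directly. The following is a minimal sketch, not the authors' code: the function names `thompson_scale` and `glm_ucb_rho` are hypothetical, `d` is assumed to be the dimension of the linear bandit parameter, and `t` is assumed to be the round index (t ≥ 1). The defaults δ = 0.1 and R = 1 are the values stated in the row.

```python
import math


def thompson_scale(t, d, delta=0.1, R=1.0):
    """Thompson Sampling scale v_t = R * sqrt(0.5 * d * ln(t / delta)).

    Assumes t >= 1 and 0 < delta <= 1, so the logarithm is non-negative.
    `d` is assumed to be the parameter dimension (not stated in the table).
    """
    return R * math.sqrt(0.5 * d * math.log(t / delta))


def glm_ucb_rho(t):
    """GLM-UCB exploration term rho(t) = sqrt(0.5 * ln(t)), for t >= 1."""
    return math.sqrt(0.5 * math.log(t))


if __name__ == "__main__":
    # Print both quantities at a few illustrative rounds (d = 2 is arbitrary).
    for t in (1, 10, 100, 1000):
        print(t, thompson_scale(t, d=2), glm_ucb_rho(t))
```

Both quantities grow only logarithmically in t, so the baselines' exploration terms widen slowly over the course of a run.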