Best Arm Identification in Spectral Bandits

Authors: Tomáš Kocák, Aurélien Garivier

IJCAI 2020 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable Result LLM Response
Research Type Experimental We furthermore illustrate by numerical experiments both the strategy s efficiency and the impact of the smoothness constraint on the sample complexity.Empirical evaluation. Using the same experiment setting as in Section 3.1 we have evaluated Spectral Ta S for 10 different values of R ranging from true R = µTLµ = 0.01 to R = 0.1 and we plotted theoretical stopping time (red line) as well as average and standard deviation of 20 runs of Spectral Ta S algorithm (blue line). Figure 1 shows that the algorithm can utilize the spectral constraint and improve stopping time significantly whenever the value of R is close to the real smoothness of the problem, as suggested by theory.
Researcher Affiliation Academia Tom aˇs Koc ak , Aur elien Garivier Ecole Normale Sup erieur de Lyon tomas.kocak@gmail.com, aurelien.garivier@ens-lyon.fr
Pseudocode Yes Algorithm 1 Spectral Ta S
Open Source Code No The paper does not contain any explicit statements about making the source code available or provide a link to a code repository.
Open Datasets No The paper describes a simplistic scenario for numerical experiments with a synthetic mean vector µ = (0.9, 0.5, 0.6)T and a graph structure, but does not use or provide concrete access information for a publicly available dataset.
Dataset Splits No The paper describes numerical experiments and '20 runs of Spectral Ta S algorithm' but does not specify train/validation/test dataset splits.
Hardware Specification No The paper does not provide any specific hardware details such as GPU or CPU models used for running the experiments.
Software Dependencies No The paper does not provide specific software dependencies or their version numbers.
Experiment Setup Yes Input and initialization: L : graph Laplacian δ : confidence parameter R : upper bound on the smoothness of µ; the choice εs = (K2 + s) 1/2/2 ensures that the number Na(t) of draws of arm a converges almost-surely to ω a(µ) as t goes to infinity; the choice β(t, δ) = log(2t(K 1)/δ) and Aτ+1 = arg maxa [K] ˆµa(τ) yields a probability of failure Pν (Aτ+1 / a (µ)) δ.