Best Arm Identification in Spectral Bandits
Authors: Tomáš Kocák, Aurélien Garivier
IJCAI 2020 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We furthermore illustrate by numerical experiments both the strategy s efficiency and the impact of the smoothness constraint on the sample complexity.Empirical evaluation. Using the same experiment setting as in Section 3.1 we have evaluated Spectral Ta S for 10 different values of R ranging from true R = µTLµ = 0.01 to R = 0.1 and we plotted theoretical stopping time (red line) as well as average and standard deviation of 20 runs of Spectral Ta S algorithm (blue line). Figure 1 shows that the algorithm can utilize the spectral constraint and improve stopping time significantly whenever the value of R is close to the real smoothness of the problem, as suggested by theory. |
| Researcher Affiliation | Academia | Tom aˇs Koc ak , Aur elien Garivier Ecole Normale Sup erieur de Lyon tomas.kocak@gmail.com, aurelien.garivier@ens-lyon.fr |
| Pseudocode | Yes | Algorithm 1 Spectral Ta S |
| Open Source Code | No | The paper does not contain any explicit statements about making the source code available or provide a link to a code repository. |
| Open Datasets | No | The paper describes a simplistic scenario for numerical experiments with a synthetic mean vector µ = (0.9, 0.5, 0.6)T and a graph structure, but does not use or provide concrete access information for a publicly available dataset. |
| Dataset Splits | No | The paper describes numerical experiments and '20 runs of Spectral Ta S algorithm' but does not specify train/validation/test dataset splits. |
| Hardware Specification | No | The paper does not provide any specific hardware details such as GPU or CPU models used for running the experiments. |
| Software Dependencies | No | The paper does not provide specific software dependencies or their version numbers. |
| Experiment Setup | Yes | Input and initialization: L : graph Laplacian δ : confidence parameter R : upper bound on the smoothness of µ; the choice εs = (K2 + s) 1/2/2 ensures that the number Na(t) of draws of arm a converges almost-surely to ω a(µ) as t goes to infinity; the choice β(t, δ) = log(2t(K 1)/δ) and Aτ+1 = arg maxa [K] ˆµa(τ) yields a probability of failure Pν (Aτ+1 / a (µ)) δ. |