Best Arm Identification in Linear Bandits with Linear Dimension Dependency

Authors: Chao Tao, Saúl Blanco, Yuan Zhou

ICML 2018

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We finally empirically evaluate our algorithms with extensive synthetic and real-world data-sets, and compare the sample complexity and run time with other state-of-the-art algorithms in Section 5. We test our algorithms Y-ELIMTIL1(X, X, δ) and ALBA(X, δ), and compare them with the state-of-the-art algorithms XY-Allocation (Figure 2 of (Soare et al., 2014)) and XY-Adaptive (Figure 3 of (Soare et al., 2014)). We test the algorithms using both synthetic (similar to that in (Soare et al., 2014), and random data) and real-world data. We fix the confidence parameter δ = 0.05, and report the total number of samples and time used by each algorithm and their empirical failure probabilities (i.e. failing to identify the best arm). For each setting, the reported numbers are averaged over 100 runs.
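The evaluation protocol described in that excerpt (fixed δ, per-run sample counts, wall-clock time, and empirical failure probability averaged over 100 runs) could be mirrored by a small harness like the following. This is a hypothetical sketch, not the authors' code: `evaluate`, `make_instance`, and the `(guess, n_pulls)` return convention are all assumed names and interfaces.

```python
import time
import numpy as np

def evaluate(algorithm, make_instance, n_runs=100, delta=0.05, seed=0):
    """Hypothetical harness mirroring the paper's protocol: run a best-arm
    identification algorithm n_runs times on fresh instances, then report
    mean sample count, mean wall-clock time, and empirical failure rate."""
    rng = np.random.default_rng(seed)
    samples, times, failures = [], [], 0
    for _ in range(n_runs):
        # Arms X, hidden parameter theta, and the true best-arm index.
        X, theta, best = make_instance(rng)
        t0 = time.perf_counter()
        guess, n_pulls = algorithm(X, theta, delta, rng)
        times.append(time.perf_counter() - t0)
        samples.append(n_pulls)
        failures += (guess != best)  # failure = wrong arm identified
    return np.mean(samples), np.mean(times), failures / n_runs
```

A trivial usage example: an oracle "algorithm" that always answers correctly should show an empirical failure probability of exactly zero under this harness.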
Researcher Affiliation | Academia | (1) Department of Computer Science, Indiana University at Bloomington, Indiana, USA; (2) Institute for Theoretical Computer Science, Shanghai University of Finance and Economics, Shanghai, China; (3) Department of Industrial and Enterprise Systems Engineering, University of Illinois at Urbana-Champaign, Illinois, USA.
Pseudocode | Yes | Algorithm 1 VECTOREST(S, n); Algorithm 2 ELIMTILp(S, δ); Algorithm 3 Y-ELIMTILp(S, T, δ); Algorithm 4 Adaptive Linear Best Arm, ALBA(X, δ)
Open Source Code | No | The paper does not provide any statement or link indicating that the source code for the described methodology is publicly available.
Open Datasets | Yes | We now work with candidate arms generated from the advertisement placement dataset provided in (Lefortier et al., 2016).
Dataset Splits | No | The paper studies best arm identification in linear bandits, which involves sequential sampling rather than supervised learning with train/validation/test splits. The concept of a validation split is therefore not applicable, and no such splits are mentioned in the text.
Hardware Specification | No | The paper does not provide specific hardware details (e.g., CPU/GPU models, memory) used for running the experiments.
Software Dependencies | No | The paper states 'All algorithms are implemented in Python 3,' but does not provide specific version numbers for Python libraries, frameworks, or other software dependencies.
Experiment Setup | Yes | In the implementation of our algorithms, we use the entropic mirror descent method introduced in (Beck & Teboulle, 2003) to compute the optimal solution λ X defined in Proposition 6(a) (see Appendix G for details). In the XY-Adaptive algorithm, we set α = 1/10 following the choice made in (Soare et al., 2014). All algorithms are implemented in Python 3, and are tested without parallelization. We fix the confidence parameter δ = 0.05, and report the total number of samples and time used by each algorithm and their empirical failure probabilities (i.e. failing to identify the best arm). For each setting, the reported numbers are averaged over 100 runs.
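As a rough illustration of the entropic mirror descent step mentioned in that excerpt, the following is a minimal sketch, not the authors' implementation: it assumes the standard G-optimal design objective f(λ) = max_x xᵀA(λ)⁻¹x with A(λ) = Σᵢ λᵢ xᵢxᵢᵀ (the paper's λ*_X from Proposition 6(a) may differ), and uses exponentiated-gradient (entropic mirror descent) updates on the probability simplex with a gradient-normalized step size. The function name and step-size choice are assumptions for illustration.

```python
import numpy as np

def g_optimal_design(X, n_iters=300, eta=0.1):
    """Entropic mirror descent sketch for the G-optimal design objective
    f(lam) = max_x x^T A(lam)^{-1} x, where A(lam) = sum_i lam_i x_i x_i^T.
    X: (K, d) array of arm vectors spanning R^d. Returns the best allocation
    seen and its objective value."""
    K, d = X.shape
    lam = np.full(K, 1.0 / K)              # uniform start on the simplex
    best_lam, best_val = lam.copy(), np.inf
    for _ in range(n_iters):
        A = X.T @ (lam[:, None] * X)       # design matrix A(lam)
        Ainv = np.linalg.inv(A)
        vals = np.einsum('ij,jk,ik->i', X, Ainv, X)  # x_i^T A^{-1} x_i
        f = vals.max()
        if f < best_val:                   # track the best iterate so far
            best_val, best_lam = f, lam.copy()
        xstar = X[np.argmax(vals)]
        # Subgradient of the max: d/d lam_i [xstar^T A^{-1} xstar]
        #   = -(xstar^T A^{-1} x_i)^2
        grad = -(X @ (Ainv @ xstar)) ** 2
        step = grad / (np.abs(grad).max() + 1e-12)   # normalize step size
        lam = lam * np.exp(-eta * step)    # multiplicative (entropic) update
        lam = lam / lam.sum()              # re-normalize onto the simplex
    return best_lam, best_val
```

By the Kiefer-Wolfowitz equivalence theorem, the optimal value of this objective equals d whenever the arms span R^d, which gives a convenient sanity check on convergence.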