Best Arm Identification in Linear Bandits with Linear Dimension Dependency
Authors: Chao Tao, Saúl Blanco, Yuan Zhou
ICML 2018
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We finally empirically evaluate our algorithms with extensive synthetic and real-world datasets, and compare the sample complexity and run time with other state-of-the-art algorithms in Section 5. We test our algorithms Y-ELIMTIL1(X, X, δ) and ALBA(X, δ), and compare them with the state-of-the-art algorithms XY-Allocation (Figure 2 of (Soare et al., 2014)) and XY-Adaptive (Figure 3 of (Soare et al., 2014)). We test the algorithms using both synthetic data (similar to that in (Soare et al., 2014), as well as random data) and real-world data. We fix the confidence parameter δ = 0.05, and report the total number of samples and time used by each algorithm and their empirical failure probabilities (i.e., the rate of failing to identify the best arm). For each setting, the reported numbers are averaged over 100 runs. |
| Researcher Affiliation | Academia | 1Department of Computer Science, Indiana University at Bloomington, Indiana, USA 2Institute for Theoretical Computer Science, Shanghai University of Finance and Economics, Shanghai, China 3Department of Industrial and Enterprise Systems Engineering, University of Illinois at Urbana-Champaign, Illinois, USA. |
| Pseudocode | Yes | Algorithm 1 VECTOREST(S, n), Algorithm 2 ELIMTILp(S, δ), Algorithm 3 Y-ELIMTILp(S, T, δ), Algorithm 4 Adaptive Linear Best Arm, ALBA(X, δ) |
| Open Source Code | No | The paper does not provide any statement or link indicating that the source code for the described methodology is publicly available. |
| Open Datasets | Yes | We now work with candidate arms generated from the advertisement placement dataset provided in (Lefortier et al., 2016). |
| Dataset Splits | No | The paper describes a best arm identification problem in linear bandits, which involves sequential sampling rather than traditional supervised learning with train/validation/test dataset splits. Therefore, the concept of a 'validation split' as typically defined for model training is not applicable or mentioned in the text. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., CPU/GPU models, memory) used for running the experiments. |
| Software Dependencies | No | The paper states 'All algorithms are implemented in Python 3,' but does not provide specific version numbers for Python libraries, frameworks, or other software dependencies. |
| Experiment Setup | Yes | In the implementation of our algorithms, we use the entropic mirror descent method introduced in (Beck & Teboulle, 2003) to compute the optimal solution λ X defined in Proposition 6(a) (see Appendix G for details). In the XY-Adaptive algorithm, we set α = 1/10 following the choice made in (Soare et al., 2014). All algorithms are implemented in Python 3, and are tested without parallelization. We fix the confidence parameter δ = 0.05, and report the total number of samples and time used by each algorithm and their empirical failure probabilities (i.e. to fail to identify the best arm). For each setting, the reported numbers are averaged over 100 runs. |