On Elimination Strategies for Bandit Fixed-Confidence Identification
Authors: Andrea Tirinzoni, Rémy Degenne
NeurIPS 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We confirm these benefits experimentally, where elimination improves significantly the computational complexity of adaptive methods on common tasks like best-arm identification in linear bandits. Experimentally, we compare several existing algorithms for three identification problems (BAI, Top-m, and thresholding bandits) on two bandit structures (linear and unstructured). |
| Researcher Affiliation | Collaboration | Andrea Tirinzoni, Meta AI, Paris, France (tirinzoni@fb.com); Rémy Degenne, Univ. Lille, Inria, CNRS, Centrale Lille, UMR 9189 CRIStAL, F-59000 Lille, France (remy.degenne@inria.fr) |
| Pseudocode | Yes | Algorithm 1 Track-and-Stop [4]: vanilla (left) and with selective elimination (right) |
| Open Source Code | Yes | Our code is available at https://github.com/AndreaTirinzoni/bandit-elimination. |
| Open Datasets | No | The paper uses randomly generated instances for its experiments, rather than a pre-existing public dataset. For example, in Appendix G, it states: 'All experiments use randomly generated instances. For linear bandits, we generate θ by drawing each component from a standard Gaussian and then normalizing it to be unit length...' |
| Dataset Splits | No | The paper does not describe traditional training, validation, and testing splits. Instead, it generates instances and averages results over multiple runs, as stated in Appendix G: 'We generated 100 instances and report average performance across them.' |
| Hardware Specification | No | The paper discusses computational complexity and provides runtime measurements, but it does not specify the hardware used for the experiments (e.g., specific GPU or CPU models, memory, or cloud instance types). The 'Checklist' section refers only vaguely to an 'internal cluster' without further details. |
| Software Dependencies | No | The paper does not provide specific software dependencies with version numbers (e.g., 'Python 3.8', 'PyTorch 1.9') that would be required for replication. It mentions baselines that are implementations of certain algorithms, but no specific software environment details. |
| Experiment Setup | Yes | All experiments use δ = 0.01 and are averaged over 100 runs. We ran experiments on two bandit structures: linear (where d < K) and unstructured (where K = d and the arms are the canonical basis of R^d). For each of them, we considered 3 pure exploration problems: BAI, Top-m, and online sign identification (OSI) [9, 27], also called thresholding bandits. We selected a larger linear instance with K = 50 and d = 20, randomly generated (see the protocol in Appendix G). |
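The quoted instance-generation protocol (θ drawn componentwise from a standard Gaussian and normalized to unit length; unstructured instances with K = d canonical-basis arms) can be sketched as follows. This is a minimal illustration, not the authors' code: the full arm-generation protocol for linear instances is in Appendix G of the paper, and the unit-norm Gaussian arm features used here are an assumption for illustration only.

```python
import numpy as np

def linear_instance(K=50, d=20, rng=None):
    """Sketch of a random linear bandit instance (K = 50, d = 20 in the paper)."""
    rng = np.random.default_rng() if rng is None else rng
    # Quoted protocol: draw each component of theta from a standard
    # Gaussian, then normalize theta to unit length.
    theta = rng.standard_normal(d)
    theta /= np.linalg.norm(theta)
    # ASSUMPTION: arm features as unit-norm Gaussian vectors; the paper's
    # exact arm-generation procedure is described in its Appendix G.
    arms = rng.standard_normal((K, d))
    arms /= np.linalg.norm(arms, axis=1, keepdims=True)
    return theta, arms

def unstructured_instance(d=20, rng=None):
    """Unstructured case: K = d and the arms are the canonical basis of R^d."""
    rng = np.random.default_rng() if rng is None else rng
    theta = rng.standard_normal(d)
    theta /= np.linalg.norm(theta)
    arms = np.eye(d)  # canonical basis vectors e_1, ..., e_d
    return theta, arms
```

With such instances, arm i's mean reward is the inner product `arms[i] @ theta`; in the unstructured case this reduces to `theta[i]`, recovering the classical K-armed setting.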