Non-Asymptotic Analysis of a UCB-based Top Two Algorithm
Authors: Marc Jourdan, Rémy Degenne
NeurIPS 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our main contribution is to propose the first non-asymptotic analysis of Top Two algorithms. Our experiments reveal that TTUCB performs on par with existing Top Two algorithms, which are only proven to be asymptotically β-optimal, even for large sets of arms. Numerically, we show that considering adaptive proportions compared to a fixed β = 1/2 yields a significant speed-up on hard instances, and a moderate improvement on random instances. We assess the performance on 5000 random Gaussian instances with K = 10 such that µ1 = 0.6 and µi ∼ U([0.2, 0.5]) for all i ≠ 1. The CPU running time is reported in Table 4, and the observed empirical errors before stopping are displayed in Figure 3 (Appendix G.2). |
| Researcher Affiliation | Academia | Marc Jourdan marc.jourdan@inria.fr Rémy Degenne remy.degenne@inria.fr 1 Univ. Lille, CNRS, Inria, Centrale Lille, UMR 9189 - CRIStAL, F-59000 Lille, France |
| Pseudocode | Yes | Algorithm 1 TTUCB |
| Open Source Code | No | The paper states 'Our code is implemented in Julia 1.7.2' and mentions a 'Readme.md file' with 'detailed julia instructions to reproduce our experiments', implying that code exists. It also points to a base library: 'The general structure of the code (and some functions) is taken from the tidnabbil library. This library was created by [12], see https://bitbucket.org/wmkoolen/tidnabbil.'. However, it provides no direct link to a repository for *their specific implementation*, nor does it explicitly state that *their* code is included in the supplementary materials. |
| Open Datasets | No | The paper describes generating '5000 random Gaussian instances' and '1-sparse scenario' data, citing [20] for the scenario. It also describes 'α = 0.3 scenarios' and 'α = 0.6 scenarios' by defining µi based on K. These are descriptions of how data instances are generated or parameterized, not references to pre-existing, publicly accessible datasets with specific links, DOIs, or citations that provide data files. |
| Dataset Splits | No | The paper focuses on online bandit algorithms where data is collected sequentially, and does not specify traditional fixed training, validation, or test dataset splits. No specific percentages or sample counts for these splits are mentioned. |
| Hardware Specification | No | The paper states: 'Experiments presented in this paper were carried out using the Grid'5000 testbed, supported by a scientific interest group hosted by Inria and including CNRS, RENATER and several Universities as well as other organizations'. While Grid'5000 is a computing infrastructure, no specific hardware components (e.g., CPU or GPU models, memory specifications) used within this testbed are provided. |
| Software Dependencies | No | The paper mentions: 'Our code is implemented in Julia 1.7.2, and the plots are generated with the StatsPlots.jl package.' While Julia has a version number (1.7.2), the StatsPlots.jl package is mentioned without a specific version number. This does not meet the criterion of providing specific version numbers for multiple key software components. |
| Experiment Setup | Yes | In the moderate regime (δ = 0.1), we assess the empirical performance of TTUCB with bonus gm and concentration parameters s = α = 1.2. Top Two algorithms and β-LUCB use β = 1/2. We initialize by sampling each arm once. The general structure of the code (and some functions) is taken from the tidnabbil library. To allow for a fair numerical comparison, LUCB and β-LUCB use √(2c(n − 1, δ)/Nn,i) as bonus. |
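The random-instance generation described in the table (5000 Gaussian instances with K = 10, µ1 = 0.6, and the remaining means drawn uniformly from [0.2, 0.5]) can be sketched as follows. This is a minimal illustration, not the authors' Julia code; the function name and use of Python's `random` module are my own.

```python
import random

def sample_instance(K=10, seed=None):
    """Sample one random Gaussian bandit instance as described in the paper:
    best-arm mean mu_1 = 0.6, remaining K-1 means drawn i.i.d. from U([0.2, 0.5])."""
    rng = random.Random(seed)
    return [0.6] + [rng.uniform(0.2, 0.5) for _ in range(K - 1)]

# 5000 instances, matching the scale of the paper's experiments
instances = [sample_instance(K=10, seed=s) for s in range(5000)]
```

Because every non-best mean lies in [0.2, 0.5], arm 1 is the unique best arm in each instance, with a gap of at least 0.1.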