Optimal Batched Best Arm Identification
Authors: Tianyuan Jin, Yu Yang, Jing Tang, Xiaokui Xiao, Pan Xu
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We also conduct numerical experiments to compare our proposed algorithms with the optimal sequential algorithm Track-and-Stop [17], and the batched algorithm Top-k δ-Elimination [22] on various problem instances. |
| Researcher Affiliation | Academia | Tianyuan Jin¹, Yu Yang³, Jing Tang², Xiaokui Xiao¹, Pan Xu³ — ¹National University of Singapore, ²The Hong Kong University of Science and Technology (Guangzhou), ³Duke University. {tianyuan,xkxiao}@nus.edu.sg, jingtang@ust.hk, {yu.yang,pan.xu}@duke.edu |
| Pseudocode | Yes | Algorithm 1: Three-Batch Best Arm Identification (Tri-BBAI) |
| Open Source Code | Yes | The implementation of this work can be found at https://github.com/panxulab/Optimal-Batched-Best-Arm-Identification |
| Open Datasets | No | For all experiments in this section, we set the number of arms n = 10, where each arm has Bernoulli reward distribution with mean µi for i ∈ [10]. More specifically, the mean rewards are generated by the following two cases. Uniform: The best arm has µ1 = 0.5, and the mean rewards of the rest of the arms follow uniform distribution over [0.2, 0.4], i.e., µi is uniformly generated from [0.2, 0.4] for i ∈ [n] \ {1}. Normal: The best arm has µ1 = 0.6, and the mean rewards of the rest of the arms are first generated from normal distribution N(0.2, 0.2) and then projected to the interval [0, 0.4]. |
| Dataset Splits | No | The paper describes how the reward distributions for the bandit arms are generated for experiments but does not specify any explicit training, validation, or test dataset splits (e.g., 80/10/10 split or specific sample counts). |
| Hardware Specification | Yes | We perform all computations in Python on R9 5900HX for all our experiments. |
| Software Dependencies | No | The paper mentions performing computations 'in Python' but does not specify the Python version or any other software dependencies with their respective version numbers (e.g., specific libraries or frameworks). |
| Experiment Setup | Yes | The hyperparameters of all methods are chosen as follows... For Tri-BBAI and Opt-BBAI, we set α = 1.0017, and ϵ = 0.01. We use the same β(t) function for Chernoff's stopping condition as in Track-and-Stop. Moreover, for the lengths of the batches, we set L1, L2 and L3 to be the value calculated by Theorem 3.1. |
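The instance generation quoted under "Open Datasets" above can be sketched in a few lines. This is an illustrative reconstruction, not the authors' code: the function name `make_instance` is invented, and since the paper's N(0.2, 0.2) does not say whether the second parameter is the variance or the standard deviation, the sketch assumes standard deviation.

```python
import numpy as np

def make_instance(case, n=10, rng=None):
    """Mean rewards for n Bernoulli arms, per the paper's two cases (illustrative sketch)."""
    rng = rng or np.random.default_rng()
    mu = np.empty(n)
    if case == "uniform":
        # Best arm has mean 0.5; remaining means drawn uniformly from [0.2, 0.4].
        mu[0] = 0.5
        mu[1:] = rng.uniform(0.2, 0.4, size=n - 1)
    elif case == "normal":
        # Best arm has mean 0.6; remaining means drawn from N(0.2, 0.2)
        # (0.2 taken as the std here, an assumption) and projected onto [0, 0.4].
        mu[0] = 0.6
        mu[1:] = np.clip(rng.normal(0.2, 0.2, size=n - 1), 0.0, 0.4)
    else:
        raise ValueError(f"unknown case: {case}")
    return mu
```

A single pull of arm i would then be a Bernoulli draw with success probability `mu[i]`, e.g. `rng.random() < mu[i]`.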