PAC Identification of Many Good Arms in Stochastic Multi-Armed Bandits
Authors: Arghya Roy Chaudhuri, Shivaram Kalyanakrishnan
ICML 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We present experimental results in Section 5, and conclude with a discussion in Section 6. |
| Researcher Affiliation | Academia | Department of Computer Science and Engineering, Indian Institute of Technology Bombay, Mumbai 400076, India. |
| Pseudocode | Yes | Algorithm 1 LUCB-k-m: Algorithm to select k (ϵ, m)-optimal arms; Algorithm 2 P3: Algorithm to solve Q-P; Algorithm 3 KQP-1: Algorithm to solve at most k-equiprobable (k, ρ) instances |
| Open Source Code | No | The paper does not provide a direct statement or link for the open-source code of the described methodology. |
| Open Datasets | No | We take five Bernoulli instances of sizes n = 10, 20, 50, 100, and 200, with the means linearly spaced between 0.999 and 0.001 (both inclusive), and sorted in descending order. No link or citation to a publicly available dataset is provided. |
| Dataset Splits | No | The paper describes bandit instances and sample complexities, but does not provide details on traditional training/validation/test dataset splits as it's not a supervised learning task. |
| Hardware Specification | No | The paper does not provide specific details on the hardware used for running the experiments. |
| Software Dependencies | No | The paper mentions using 'KL-divergence based confidence bounds' but does not specify any software packages or their version numbers. |
| Experiment Setup | Yes | Setting ϵ = 0.05, δ = 0.001, and m = 0.1 n, we run the experiments. Fixing A = I20, n = 20, m = 10, the (k, m, n) instances are given by varying k ∈ {1, 3, 5, 8, 10}. |
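
Since no code or dataset is released, the synthetic instances quoted in the table have to be regenerated by hand. The following is a minimal sketch of that setup, not the authors' implementation: the names `linspaced_means`, `pull`, `instances`, and `m_values` are hypothetical, and reading m = 0.1 n as integer division `n // 10` is an assumption.

```python
import random

# Parameters quoted in the Experiment Setup row.
EPSILON = 0.05
DELTA = 0.001
SIZES = [10, 20, 50, 100, 200]
K_VALUES = [1, 3, 5, 8, 10]  # values of k varied for the n = 20, m = 10 instance


def linspaced_means(n, high=0.999, low=0.001):
    """Return n Bernoulli means linearly spaced from high down to low (both inclusive)."""
    step = (high - low) / (n - 1)
    return [high - i * step for i in range(n)]


def pull(means, arm, rng):
    """Draw one Bernoulli reward (0 or 1) from the given arm."""
    return 1 if rng.random() < means[arm] else 0


# One instance per size; each list is already sorted in descending order.
instances = {n: linspaced_means(n) for n in SIZES}

# Assumption: m = 0.1 n is taken as the integer n // 10.
m_values = {n: n // 10 for n in SIZES}

if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed, chosen only for repeatability of this sketch
    means = instances[20]
    rewards = [pull(means, 0, rng) for _ in range(1000)]
    print("empirical mean of best arm:", sum(rewards) / len(rewards))
```

Any PAC algorithm under test would then draw rewards through `pull` and be scored by the number of samples it uses before stopping.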