PAC Identification of Many Good Arms in Stochastic Multi-Armed Bandits

Authors: Arghya Roy Chaudhuri, Shivaram Kalyanakrishnan

ICML 2019

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We present experimental results in Section 5, and conclude with a discussion in Section 6.
Researcher Affiliation | Academia | Department of Computer Science and Engineering, Indian Institute of Technology Bombay, Mumbai 400076, India.
Pseudocode | Yes | Algorithm 1 LUCB-k-m: Algorithm to select k (ϵ, m)-optimal arms; Algorithm 2 P3: Algorithm to solve Q-P; Algorithm 3 KQP-1: Algorithm to solve an at most k-equiprobable (k, ρ) instances
Open Source Code | No | The paper does not provide a direct statement or link for the open-source code of the described methodology.
Open Datasets | No | We take five Bernoulli instances of sizes n = 10, 20, 50, 100, and 200, with the means linearly spaced between 0.999 and 0.001 (both inclusive), and sorted in descending order. No link or citation to a publicly available dataset is provided.
Dataset Splits | No | The paper describes bandit instances and sample complexities, but does not provide details on traditional training/validation/test dataset splits, as it is not a supervised learning task.
Hardware Specification | No | The paper does not provide specific details on the hardware used for running the experiments.
Software Dependencies | No | The paper mentions using 'KL-divergence based confidence bounds' but does not specify any software packages or their version numbers.
Experiment Setup | Yes | Setting ϵ = 0.05, δ = 0.001, and m = 0.1n, we run the experiments. Fixing A = I20, n = 20, m = 10, (k, m, n) instances are given by and varying k ∈ {1, 3, 5, 8, 10}.
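
Because no code or dataset is released, the synthetic instances reported above have to be regenerated by hand. The following is a minimal sketch of that construction, not the authors' implementation: it builds the five Bernoulli instances with linearly spaced means and records the reported ϵ, δ, and m = 0.1n. The function names (`bernoulli_means`, `pull`, `build_instances`) are hypothetical, and rounding m with integer division is an assumption.

```python
import random

EPSILON = 0.05                  # tolerance ϵ from the reported setup
DELTA = 0.001                   # mistake probability δ from the reported setup
SIZES = [10, 20, 50, 100, 200]  # instance sizes n from the reported setup


def bernoulli_means(n, high=0.999, low=0.001):
    """Return n means linearly spaced from high down to low (both inclusive)."""
    step = (high - low) / (n - 1)
    return [high - i * step for i in range(n)]


def pull(means, arm, rng):
    """Draw one Bernoulli reward (0 or 1) from the given arm."""
    return 1 if rng.random() < means[arm] else 0


def build_instances(sizes=SIZES):
    """Map each size n to its means and m = 0.1 n.

    Integer division is an assumption; it is exact for the listed sizes.
    """
    return {n: {"means": bernoulli_means(n), "m": n // 10} for n in sizes}
```

A sampling algorithm such as LUCB-k-m would then interact with an instance only through `pull`, for example `pull(build_instances()[20]["means"], 0, random.Random(0))`. The KL-divergence based confidence bounds mentioned in the table are not sketched here.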