Asymptotically Optimal Quantile Pure Exploration for Infinite-Armed Bandits

Authors: Evelyn Xiao-Yue Gong, Mark Sellke

NeurIPS 2023 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable Result LLM Response
Research Type Theoretical Our goal is to efficiently select a single high quality arm whose average reward is, with probability 1 δ, within ε of being with the top η-fraction of arms. For fixed confidence, we give an algorithm with expected sample complexity O log(1/δ) log(1/η) ηε2 which matches a known lower bound up to the log(1/η) factor. In particular the δ-dependence is optimal and closes a quadratic gap. For fixed budget, we show the asymptotically optimal sample complexity as δ 0 is log(1/δ) log log(1/δ) 2/c. The value of c depends explicitly on the problem parameters (including the unknown arm distribution) through a certain Fisher information distance. Even the strictly super-linear dependence on log(1/δ) was not known and resolves a question of [GM20].
Researcher Affiliation Academia Xiao-Yue Gong Carnegie Mellon University exiaoyue@andrew.cmu.edu Mark Sellke Harvard University msellke@fas.harvard.edu
Pseudocode Yes Algorithm 1: Output ˆα G 1(1 η) ε/3 with probability 1 δ
Open Source Code No The paper is theoretical and does not mention releasing any source code for the described methodologies.
Open Datasets No The paper is theoretical and does not use or refer to publicly available datasets in the context of empirical experiments.
Dataset Splits No This is a theoretical paper and does not involve empirical experiments with training, validation, or test data splits.
Hardware Specification No This is a theoretical paper and does not describe specific hardware used for experiments.
Software Dependencies No This is a theoretical paper and does not list specific software dependencies with version numbers for implementation.
Experiment Setup No This is a theoretical paper and does not include details on experimental setup, hyperparameters, or training configurations.