Asymptotically Optimal Quantile Pure Exploration for Infinite-Armed Bandits
Authors: Evelyn Xiao-Yue Gong, Mark Sellke
NeurIPS 2023 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Theoretical | Our goal is to efficiently select a single high quality arm whose average reward is, with probability 1 δ, within ε of being with the top η-fraction of arms. For fixed confidence, we give an algorithm with expected sample complexity O log(1/δ) log(1/η) ηε2 which matches a known lower bound up to the log(1/η) factor. In particular the δ-dependence is optimal and closes a quadratic gap. For fixed budget, we show the asymptotically optimal sample complexity as δ 0 is log(1/δ) log log(1/δ) 2/c. The value of c depends explicitly on the problem parameters (including the unknown arm distribution) through a certain Fisher information distance. Even the strictly super-linear dependence on log(1/δ) was not known and resolves a question of [GM20]. |
| Researcher Affiliation | Academia | Xiao-Yue Gong Carnegie Mellon University exiaoyue@andrew.cmu.edu Mark Sellke Harvard University msellke@fas.harvard.edu |
| Pseudocode | Yes | Algorithm 1: Output ˆα G 1(1 η) ε/3 with probability 1 δ |
| Open Source Code | No | The paper is theoretical and does not mention releasing any source code for the described methodologies. |
| Open Datasets | No | The paper is theoretical and does not use or refer to publicly available datasets in the context of empirical experiments. |
| Dataset Splits | No | This is a theoretical paper and does not involve empirical experiments with training, validation, or test data splits. |
| Hardware Specification | No | This is a theoretical paper and does not describe specific hardware used for experiments. |
| Software Dependencies | No | This is a theoretical paper and does not list specific software dependencies with version numbers for implementation. |
| Experiment Setup | No | This is a theoretical paper and does not include details on experimental setup, hyperparameters, or training configurations. |