Pure Exploration with Multiple Correct Answers
Authors: Rémy Degenne, Wouter M. Koolen
NeurIPS 2019 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Theoretical | We determine the sample complexity of pure exploration bandit problems with multiple good answers. We derive a lower bound using a new game equilibrium argument. We show how continuity and convexity properties of single-answer problems ensure that the existing Track-and-Stop algorithm has asymptotically optimal sample complexity. However, that convexity is lost when going to the multiple-answer setting. We present a new algorithm which extends Track-and Stop to the multiple-answer case and has asymptotic sample complexity matching the lower bound. |
| Researcher Affiliation | Academia | Rémy Degenne Centrum Wiskunde & Informatica Science Park 123, Amsterdam, NL remy.degenne@cwi.nl Wouter M. Koolen Centrum Wiskunde & Informatica Science Park 123, Amsterdam, NL wmkoolen@cwi.nl |
| Pseudocode | Yes | Algorithm 1 Sticky Track-and-Stop. Input: δ > 0, strict total order on I. |
| Open Source Code | No | The paper does not provide an explicit statement or link to open-source code for the described methodology. |
| Open Datasets | No | The paper is theoretical and does not mention specific datasets or their public availability for training. |
| Dataset Splits | No | The paper is theoretical and does not mention dataset splits for training, validation, or testing. |
| Hardware Specification | No | The paper does not provide any specific hardware details used for running experiments. |
| Software Dependencies | No | The paper does not provide specific software dependencies with version numbers. |
| Experiment Setup | No | The paper focuses on theoretical algorithm design and analysis, and does not provide specific experimental setup details like hyperparameter values or training configurations. |