Non-Asymptotic Pure Exploration by Solving Games
Authors: Rémy Degenne, Wouter M. Koolen, Pierre Ménard
NeurIPS 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We validate our approach empirically in benchmark experiments at practical δ, and find that our algorithms are either competitive with Track-and-Stop (dense w*) or dominate it (sparse w*). |
| Researcher Affiliation | Academia | Rémy Degenne, Centrum Wiskunde & Informatica, Science Park 123, 1098 XG Amsterdam, remy.degenne@cwi.nl |
| Pseudocode | Yes | Algorithm 1 Pure exploration meta-algorithm. |
| Open Source Code | No | The paper does not provide any statement or link indicating the availability of open-source code for the described methodology. |
| Open Datasets | No | The paper describes experiments on 'Bernoulli bandit model' and 'Gaussian bandit model' with specified parameters, indicating a simulated environment rather than the use of a pre-existing publicly available dataset that would require a link or citation for access. |
| Dataset Splits | No | The paper operates within a multi-armed bandit framework where data is sampled sequentially, and therefore, does not discuss traditional training, validation, and test dataset splits as found in static dataset-based experiments. |
| Hardware Specification | No | The paper states 'The experiments were carried out on the Dutch national e-infrastructure with the support of SURF Cooperative,' but it does not provide specific hardware details such as GPU/CPU models, memory, or processor types. |
| Software Dependencies | No | The paper does not specify any software dependencies with version numbers required to replicate the experiments. |
| Experiment Setup | Yes | We use stylised stopping threshold β(δ, t) = ln((1 + ln t)/δ) and exploration bonus f(t) = ln t. Both are unlicensed by theory yet conservative in practise (the error frequency is way below δ). |
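
The two quantities quoted in the Experiment Setup row can be evaluated directly. The snippet below is a minimal sketch, not the authors' code (the paper provides none): it assumes the flattened threshold reads β(δ, t) = ln((1 + ln t)/δ), and the function names `stopping_threshold` and `exploration_bonus` are hypothetical, chosen here for illustration.

```python
import math


def stopping_threshold(delta: float, t: int) -> float:
    """Stylised threshold beta(delta, t) = ln((1 + ln t) / delta).

    Assumed reading of the formula quoted in the table; the function
    name is illustrative and does not come from the paper.
    """
    if t < 1:
        raise ValueError("t must be at least 1")
    if not 0.0 < delta < 1.0:
        raise ValueError("delta must lie strictly between 0 and 1")
    return math.log((1.0 + math.log(t)) / delta)


def exploration_bonus(t: int) -> float:
    """Exploration bonus f(t) = ln t, as quoted in the table."""
    if t < 1:
        raise ValueError("t must be at least 1")
    return math.log(t)


if __name__ == "__main__":
    # Print both quantities at a few round numbers for delta = 0.1.
    for t in (1, 10, 100, 1000):
        print(t, stopping_threshold(0.1, t), exploration_bonus(t))
```

At t = 1 the threshold reduces to ln(1/δ) and the bonus is zero. The threshold grows only doubly-logarithmically in t, which is why its dependence on δ dominates at practical sample sizes.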