Sequential Test for the Lowest Mean: From Thompson to Murphy Sampling

Authors: Emilie Kaufmann, Wouter M. Koolen, Aurélien Garivier

NeurIPS 2018

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | We complement our theoretical guarantees by experiments showing that MS works best in practice. Finally, numerical experiments reported in Section 6 demonstrate the efficiency of Murphy Sampling paired with our new stopping rule. |
| Researcher Affiliation | Academia | (1) CNRS & U. Lille, CRIStAL / SequeL Inria Lille, emilie.kaufmann@univ-lille.fr; (2) Centrum Wiskunde & Informatica, Amsterdam, wmkoolen@cwi.nl; (3) UMPA, École normale supérieure de Lyon, aurelien.garivier@ens-lyon.fr |
| Pseudocode | No | The paper describes algorithms such as Murphy Sampling (MS), LCB, and Thompson Sampling (TS) using textual descriptions and mathematical equations (e.g., Eq. (4), MS: sample $\theta_t \sim \Pi_{t-1}(\cdot \mid \mathcal{H}_<)$, then play $A_t = a^*(\theta_t)$), but it does not include formal pseudocode blocks or algorithms. |
| Open Source Code | No | The paper does not contain any statement about releasing open-source code for the methodology, nor does it provide any links to a code repository. |
| Open Datasets | No | The paper mentions experiments performed on "Gaussian bandits with variance 1", which is a synthetic setup, but it does not link to any publicly available dataset (e.g., a well-known benchmark) or explain how to access the synthetic data. |
| Dataset Splits | No | The paper does not provide specific details on dataset splits (e.g., training, validation, test percentages or counts) or refer to standard predefined splits for reproducibility. |
| Hardware Specification | No | The paper does not provide any specific details about the hardware (e.g., GPU models, CPU types, memory) used to run the experiments. It only states that numerical experiments were performed. |
| Software Dependencies | No | The paper does not specify any software dependencies with version numbers (e.g., programming languages, libraries, frameworks, or specific solvers). |
| Experiment Setup | Yes | We discuss the results of numerical experiments performed on Gaussian bandits with variance 1, using the threshold γ = 0. Thompson and Murphy sampling are run using a flat (improper) prior on R, which leads to a conjugate Gaussian posterior. |
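
Since the paper ships no pseudocode or code, the following is a minimal Python sketch of how the quoted Murphy Sampling rule could look in the setup described in the Experiment Setup row (unit-variance Gaussian arms, flat prior, threshold γ = 0). It is not the authors' implementation. Several points are assumptions: that H< denotes the hypothesis that the lowest mean is below γ, that a*(θ) is the arm with the smallest sampled mean, and that the conditioned posterior is sampled by rejection. The function names, the `max_tries` cap, and the fallback to an unconditioned draw are hypothetical, and the paper's stopping rule is not covered.

```python
import math
import random


def posterior_sample(sums, counts, rng):
    """Draw one mean per arm from the flat-prior posterior N(sum/n, 1/n).

    Assumes unit-variance Gaussian rewards and at least one pull per arm
    (the flat prior is improper, so the posterior needs n >= 1).
    """
    return [rng.gauss(s / n, 1.0 / math.sqrt(n)) for s, n in zip(sums, counts)]


def murphy_sampling_arm(sums, counts, gamma=0.0, rng=None, max_tries=10_000):
    """Pick the next arm with a Murphy-Sampling-style rule (sketch).

    Assumption: the posterior is conditioned on min(theta) < gamma by
    rejection sampling. If no draw is accepted within max_tries
    (hypothetical cap), the last unconditioned draw is used instead.
    """
    rng = rng or random.Random()
    theta = posterior_sample(sums, counts, rng)
    for _ in range(max_tries):
        if min(theta) < gamma:  # draw is consistent with the assumed H<
            break
        theta = posterior_sample(sums, counts, rng)
    # Assumed a*(theta): index of the smallest sampled mean.
    return min(range(len(theta)), key=theta.__getitem__)


if __name__ == "__main__":
    # Toy run on three arms with hypothetical true means.
    rng = random.Random(0)
    true_means = [0.5, -0.2, 1.0]
    rewards = [rng.gauss(m, 1.0) for m in true_means]  # one initial pull each
    sums, counts = list(rewards), [1, 1, 1]
    for _ in range(200):
        arm = murphy_sampling_arm(sums, counts, gamma=0.0, rng=rng)
        sums[arm] += rng.gauss(true_means[arm], 1.0)
        counts[arm] += 1
    print(counts)
```

Rejection sampling is simple but can become slow when the posterior puts little mass on the conditioning event, which is why the sketch caps the number of attempts.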