Replicable Bandits

Authors: Hossein Esfandiari, Alkis Kalavasis, Amin Karbasi, Andreas Krause, Vahab Mirrokni, Grigoris Velegkas

ICLR 2023

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | In this section, we provide some experimental evaluation of our proposed algorithms in the setting of multi-armed stochastic bandits. |
| Researcher Affiliation | Collaboration | Hossein Esfandiari (Google Research, esfandiari@google.com); Alkis Kalavasis (National Technical University of Athens, kalavasisalkis@mail.ntua.gr); Amin Karbasi (Yale University and Google Research, amin.karbasi@yale.edu); Andreas Krause (ETH Zurich, krausea@ethz.ch); Vahab Mirrokni (Google Research, mirrokni@google.com); Grigoris Velegkas (Yale University, grigoris.velegkas@yale.edu) |
| Pseudocode | Yes | Algorithm 1 Mean-Estimation Based Replicable Algorithm for Stochastic MAB (Theorem 3) (a hedged sketch of the underlying rounding trick follows the table) |
| Open Source Code | No | No statement explicitly provides access to source code for the methodology, and no repository link is given. |
| Open Datasets | No | We consider a multi-armed bandit setting with K = 6 arms that have Bernoulli rewards with bias (0.44, 0.47, 0.5, 0.53, 0.56, 0.59) |
| Dataset Splits | No | The paper describes a simulated multi-armed bandit environment over T rounds but does not specify training/validation/test splits of an existing dataset. |
| Hardware Specification | No | No specific hardware (e.g., GPU models, CPU types, or memory) used for the experiments is mentioned. |
| Software Dependencies | No | No specific software dependencies with version numbers are provided. |
| Experiment Setup | Yes | Experimental Setup. We consider a multi-armed bandit setting with K = 6 arms that have Bernoulli rewards with bias (0.44, 0.47, 0.5, 0.53, 0.56, 0.59), we run the algorithms for T = 45000 iterations and we execute both algorithms 20 different times. For our algorithm, we set its replicability parameter ρ = 0.3. (re-simulated in the second sketch below) |
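The Pseudocode row only names Algorithm 1 without reproducing it. For intuition, here is a minimal Python sketch of the randomized-rounding trick that replicable mean estimation is commonly built on: the empirical mean is snapped onto a grid whose offset is drawn from internal randomness shared across executions. Treating this as the subroutine behind the paper's Algorithm 1 is an assumption, and the function name and parameters below are illustrative choices rather than the paper's notation.

```python
import numpy as np

def replicable_mean(samples: np.ndarray, grid_width: float,
                    shared: np.random.Generator) -> float:
    """Estimate a mean replicably by rounding onto a randomly offset grid.

    `shared` carries the algorithm's internal randomness and must be seeded
    identically across executions, while `samples` may differ. Two executions
    with nearby empirical means then snap to the same grid point unless a
    grid boundary happens to fall between them, an event whose probability
    the grid width controls.
    """
    offset = shared.uniform(0.0, grid_width)   # one shared offset per call
    empirical = float(samples.mean())          # ordinary empirical mean
    return offset + grid_width * round((empirical - offset) / grid_width)
```

Fixing the seed behind `shared` while redrawing `samples` illustrates the guarantee: independent sample sets yield identical estimates with high probability.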
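The quoted setup pins the environment down completely (K = 6 Bernoulli arms with the stated biases, T = 45000 rounds, 20 executions, ρ = 0.3), so it can be re-simulated even though no code is released. The sketch below drives `replicable_mean` inside a phased arm-elimination loop; the phase schedule, grid widths, and elimination margin are our own guesses, not the paper's exact Algorithm 1.

```python
import numpy as np

# Environment from the quoted setup: K = 6 Bernoulli arms,
# biases (0.44, 0.47, 0.50, 0.53, 0.56, 0.59), horizon T = 45000,
# 20 executions, replicability parameter rho = 0.3.
BIASES = np.array([0.44, 0.47, 0.50, 0.53, 0.56, 0.59])
T, N_RUNS, RHO = 45_000, 20, 0.3

def run_once(env_seed: int, shared_seed: int) -> int:
    """One execution: phased uniform exploration plus replicable elimination."""
    env = np.random.default_rng(env_seed)  # per-execution reward noise
    active = list(range(len(BIASES)))
    t, phase = 0, 0
    while len(active) > 1:
        n = min(200 * 2 ** phase, (T - t) // len(active))  # pulls per arm
        if n <= 0:
            break  # sampling budget exhausted
        grid = 2.0 ** -(phase + 3)  # grid width shrinks each phase
        # Key the shared randomness by (phase, arm) so offsets stay aligned
        # across executions even if they eliminate arms in different orders.
        est = {a: replicable_mean(env.binomial(1, BIASES[a], size=n), grid,
                                  np.random.default_rng([shared_seed, phase, a]))
               for a in active}
        t += n * len(active)
        best = max(est.values())
        active = [a for a in active if est[a] >= best - grid]
        phase += 1
    return active[0]  # if the budget ran out, any survivor is near-best

picks = [run_once(env_seed=1_000 + r, shared_seed=0) for r in range(N_RUNS)]
```

With `shared_seed` held fixed and `env_seed` varying, most of the 20 executions should return the same arm despite independent reward draws; the fraction that disagree is what the replicability parameter ρ bounds.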