Replicable Bandits
Authors: Hossein Esfandiari, Alkis Kalavasis, Amin Karbasi, Andreas Krause, Vahab Mirrokni, Grigoris Velegkas
ICLR 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this section, we provide some experimental evaluation of our proposed algorithms in the setting of multi-armed stochastic bandits. |
| Researcher Affiliation | Collaboration | Hossein Esfandiari, Google Research, esfandiari@google.com; Alkis Kalavasis, National Technical University of Athens, kalavasisalkis@mail.ntua.gr; Amin Karbasi, Yale University & Google Research, amin.karbasi@yale.edu; Andreas Krause, ETH Zurich, krausea@ethz.ch; Vahab Mirrokni, Google Research, mirrokni@google.com; Grigoris Velegkas, Yale University, grigoris.velegkas@yale.edu |
| Pseudocode | Yes | Algorithm 1: Mean-Estimation Based Replicable Algorithm for Stochastic MAB (Theorem 3); a hedged code sketch in this spirit appears after the table. |
| Open Source Code | No | No statement explicitly providing access to source code for the methodology, nor a link to a repository. |
| Open Datasets | No | We consider a multi-armed bandit setting with K = 6 arms that have Bernoulli rewards with bias (0.44, 0.47, 0.5, 0.53, 0.56, 0.59); the rewards are simulated, so no external dataset is used or released. |
| Dataset Splits | No | The paper describes a simulated multi-armed bandit environment over T rounds, but does not specify training/validation/test dataset splits from an existing dataset. |
| Hardware Specification | No | No specific hardware (e.g., GPU models, CPU types, or memory) used for experiments is mentioned. |
| Software Dependencies | No | No specific software dependencies with version numbers are provided. |
| Experiment Setup | Yes | Experimental Setup. We consider a multi-armed bandit setting with K = 6 arms that have Bernoulli rewards with bias (0.44, 0.47, 0.5, 0.53, 0.56, 0.59); we run the algorithms for T = 45000 iterations and execute both algorithms 20 times. For our algorithm, we set its replicability parameter ρ = 0.3. (A usage sketch reproducing this setup follows the table.) |
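
For illustration, below is a minimal Python sketch of a mean-estimation based replicable bandit algorithm in the spirit of the paper's Algorithm 1. It is not the paper's exact procedure: the names `replicable_mean`, `replicable_elimination_mab`, and `pull`, the grid-width choice, the per-phase replicability budget, and the phase schedule are all illustrative assumptions. The key idea it demonstrates is rounding empirical means to a randomly offset grid so that two executions sharing internal randomness return identical estimates with high probability.

```python
import numpy as np

def replicable_mean(samples, eps, rho, rng):
    # Hypothetical replicable mean estimator: snap the empirical mean to a
    # randomly offset grid of width ~ eps/rho. Two executions that share the
    # internal randomness `rng` (and hence the offset) land in the same grid
    # cell, and thus return identical values, with probability >= 1 - rho.
    width = 2.0 * eps / rho
    offset = rng.uniform(0.0, width)
    m = float(np.mean(samples))
    return np.floor((m - offset) / width) * width + offset

def replicable_elimination_mab(pull, K, T, rho, seed=0):
    # Sketch of a phased arm-elimination scheme built on the estimator above
    # (an illustration, not the paper's exact pseudocode). `pull(arm, n)`
    # must return n i.i.d. rewards in [0, 1].
    rng = np.random.default_rng(seed)   # internal randomness shared across runs
    active = list(range(K))
    t, phase = 0, 1
    while t < T and len(active) > 1:
        eps = 2.0 ** (-phase)           # target accuracy for this phase
        n = int(np.ceil(np.log(4.0 * K * T) / (2.0 * eps ** 2)))  # Hoeffding
        if t + n * len(active) > T:     # not enough budget for a full phase
            break
        est = {a: replicable_mean(pull(a, n), eps, rho / (K * phase ** 2), rng)
               for a in active}
        t += n * len(active)
        best = max(est.values())
        active = [a for a in active if est[a] >= best - 2.0 * eps]
        phase += 1
    winner = active[0]                  # surviving arm (first, if several remain)
    while t < T:                        # commit remaining pulls to the survivor
        pull(winner, 1)
        t += 1
    return winner
```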
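
Continuing the sketch, a hypothetical driver mirroring the reported experiment setup (K = 6 Bernoulli arms with the stated biases, T = 45000 rounds, ρ = 0.3, 20 repetitions); the environment construction and the fixed internal seed are assumptions, not the authors' code.

```python
# The paper's environment: K = 6 Bernoulli arms, T = 45000 rounds,
# replicability parameter rho = 0.3, repeated 20 times.
biases = np.array([0.44, 0.47, 0.5, 0.53, 0.56, 0.59])
env_rng = np.random.default_rng()       # environment randomness is fresh per run

def pull(arm, n):
    # Draw n Bernoulli rewards for the given arm.
    return env_rng.binomial(1, biases[arm], size=n)

winners = [replicable_elimination_mab(pull, K=6, T=45_000, rho=0.3, seed=0)
           for _ in range(20)]
# With the internal seed fixed, the selected arm should typically coincide
# across the 20 runs despite fresh reward draws; arm 5 (bias 0.59) is optimal.
print(winners)
```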