Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Replicable Bandits
Authors: Hossein Esfandiari, Alkis Kalavasis, Amin Karbasi, Andreas Krause, Vahab Mirrokni, Grigoris Velegkas
ICLR 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this section, we provide some experimental evaluation of our proposed algorithms in the setting of multi-armed stochastic bandits. |
| Researcher Affiliation | Collaboration | Hossein Esfandiari (Google Research); Alkis Kalavasis (National Technical University of Athens); Amin Karbasi (Yale University, Google Research); Andreas Krause (ETH Zurich); Vahab Mirrokni (Google Research); Grigoris Velegkas (Yale University) |
| Pseudocode | Yes | Algorithm 1 Mean-Estimation Based Replicable Algorithm for Stochastic MAB (Theorem 3) |
| Open Source Code | No | No statement explicitly providing access to source code for the methodology, nor a link to a repository. |
| Open Datasets | No | We consider a multi-armed bandit setting with K = 6 arms that have Bernoulli rewards with bias (0.44, 0.47, 0.5, 0.53, 0.56, 0.59) |
| Dataset Splits | No | The paper describes a simulated multi-armed bandit environment over T rounds, but does not specify training/validation/test dataset splits from an existing dataset. |
| Hardware Specification | No | No specific hardware (e.g., GPU models, CPU types, or memory) used for experiments is mentioned. |
| Software Dependencies | No | No specific software dependencies with version numbers are provided. |
| Experiment Setup | Yes | Experimental Setup. We consider a multi-armed bandit setting with K = 6 arms that have Bernoulli rewards with bias (0.44, 0.47, 0.5, 0.53, 0.56, 0.59), we run the algorithms for T = 45000 iterations and we execute both algorithms 20 different times. For our algorithm, we set its replicability parameter ρ = 0.3. |
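
Since no source code is released, the quoted setup can be approximated with a small simulation. The sketch below is a minimal illustration, assuming a standard UCB1 baseline in place of the paper's Algorithm 1, which is not reproduced here. The arm biases, horizon, and number of runs are taken from the setup row above; the names `ARM_MEANS`, `HORIZON`, `NUM_RUNS`, and `run_ucb1` are hypothetical, and the replicability parameter ρ = 0.3 is not used because it belongs to the paper's algorithm only.

```python
import math
import random

# Values quoted from the Experiment Setup row above.
ARM_MEANS = (0.44, 0.47, 0.5, 0.53, 0.56, 0.59)  # Bernoulli bias of each of the K = 6 arms
HORIZON = 45_000                                 # T, number of rounds per execution
NUM_RUNS = 20                                    # number of independent executions


def run_ucb1(means, horizon, seed):
    """Run a standard UCB1 baseline on a Bernoulli bandit.

    This is NOT the paper's replicable algorithm; it only illustrates
    the simulated environment. Returns (pull counts, expected regret).
    """
    rng = random.Random(seed)  # seeded so a run can be repeated exactly
    k = len(means)
    counts = [0] * k           # number of pulls per arm
    sums = [0.0] * k           # total observed reward per arm
    best = max(means)
    regret = 0.0               # regret measured against the true arm means

    for t in range(1, horizon + 1):
        if t <= k:
            arm = t - 1        # pull every arm once to initialise the estimates
        else:
            # Choose the arm with the largest upper confidence bound.
            arm = max(
                range(k),
                key=lambda a: sums[a] / counts[a]
                + math.sqrt(2.0 * math.log(t) / counts[a]),
            )
        reward = 1.0 if rng.random() < means[arm] else 0.0  # Bernoulli draw
        counts[arm] += 1
        sums[arm] += reward
        regret += best - means[arm]

    return counts, regret
```

To mirror the quoted setup, one would call `run_ucb1(ARM_MEANS, HORIZON, seed)` for each `seed` in `range(NUM_RUNS)` and compare the resulting regret across runs. The pure-Python loop favors readability over speed.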