The Intrinsic Robustness of Stochastic Bandits to Strategic Manipulation
Authors: Zhe Feng, David Parkes, Haifeng Xu
ICML 2020 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this section, we provide the results of simulations to validate our theoretical results. ... We run each bandit algorithm for T = 10⁴ rounds, and this forms one trial. We repeat for 100 trials, and report the average results over these trials. |
| Researcher Affiliation | Academia | 1Harvard University, Cambridge, Massachusetts, USA 2University of Virginia, Charlottesville, Virginia, USA. |
| Pseudocode | No | The paper describes the bandit algorithms (UCB, ε-Greedy, and Thompson Sampling) textually and with mathematical formulas, but does not provide structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide any statement or link indicating that open-source code for the described methodology is available. |
| Open Datasets | No | The paper uses synthetic data generated from normal distributions with specified means and standard deviation (e.g., 'reward distributions N(µ₁, σ²), N(µ₂, σ²) and N(µ₃, σ²), respectively. We fix µ₁ = 5, µ₂ = 8, µ₃ = 10, and σ = 1'), but does not provide concrete access information to a publicly available or open dataset. |
| Dataset Splits | No | The paper describes a simulation setup for bandit algorithms with synthetic reward distributions over a fixed number of rounds and trials, but it does not specify any training, validation, or test dataset splits. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory amounts) used for running its experiments. |
| Software Dependencies | No | The paper does not provide specific ancillary software details, such as library names with version numbers, needed to replicate the experiments. |
| Experiment Setup | Yes | Throughout the simulations, we fix µ₁ = 5, µ₂ = 8, µ₃ = 10, and σ = 1. All the arms use the LSI strategy. We run each bandit algorithm for T = 10⁴ rounds, and this forms one trial. We repeat for 100 trials, and report the average results over these trials. In the ε-Greedy algorithm, we set εₜ = min{1, 4/t}. |
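
Since the paper releases no code, the sketch below is a minimal reconstruction of the simulation setup quoted in the table: three arms with rewards N(µᵢ, σ²), µ = (5, 8, 10), σ = 1, T = 10⁴ rounds per trial, averaged over 100 trials, with ε-Greedy using εₜ = min{1, 4/t}. It is an assumption-laden reproduction aid, not the authors' implementation: the arms' LSI manipulation strategy, the UCB and Thompson Sampling runs, and the random seed are omitted or chosen arbitrarily here.

```python
import numpy as np

# Hedged sketch of the paper's simulation environment (no strategic/LSI
# manipulation is modeled; only the unmanipulated epsilon-greedy baseline).
MU = np.array([5.0, 8.0, 10.0])   # arm means mu_1, mu_2, mu_3 from the paper
SIGMA = 1.0                        # reward standard deviation sigma = 1
T = 10_000                         # T = 10^4 rounds per trial
N_TRIALS = 100                     # trials to average over
rng = np.random.default_rng(0)     # hypothetical seed; not specified in the paper


def epsilon_greedy_trial() -> np.ndarray:
    """Run one epsilon-greedy trial and return the per-round expected regret."""
    n_arms = len(MU)
    counts = np.zeros(n_arms)
    means = np.zeros(n_arms)
    regret = np.zeros(T)
    for t in range(1, T + 1):
        eps = min(1.0, 4.0 / t)          # schedule quoted in the table: eps_t = min{1, 4/t}
        if rng.random() < eps:
            arm = int(rng.integers(n_arms))  # explore uniformly at random
        else:
            arm = int(np.argmax(means))      # exploit the current empirical means
        reward = rng.normal(MU[arm], SIGMA)
        counts[arm] += 1
        means[arm] += (reward - means[arm]) / counts[arm]  # incremental mean update
        regret[t - 1] = MU.max() - MU[arm]                  # expected per-round regret
    return regret


# Average the cumulative regret over 100 trials, as the paper reports trial averages.
avg_cum_regret = np.mean(
    [np.cumsum(epsilon_greedy_trial()) for _ in range(N_TRIALS)], axis=0
)
print(f"Average cumulative regret after T={T} rounds: {avg_cum_regret[-1]:.1f}")
```

Swapping in UCB or Thompson Sampling only requires replacing the arm-selection rule inside the loop; the environment, horizon, and trial-averaging stay as above.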