Distributed Linear Bandits under Communication Constraints

Authors: Sudeep Salgia, Qing Zhao

ICML 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | In this section, we provide empirical evidence that corroborates our theoretical findings. We compare our proposed PLS algorithm with three popular distributed linear bandit algorithms, namely, Distributed Elimination for Linear Bandits (DELB) (Wang et al., 2019), Federated Phased Elimination (Fed-PE) (Huang et al., 2021) and Distributed Batch Elimination Linear Upper Confidence Bound (DisBE-LUCB) (Amani et al., 2022).
Researcher Affiliation | Academia | Sudeep Salgia and Qing Zhao, Department of Electrical and Computer Engineering, Cornell University, Ithaca, NY, USA. Correspondence to: Sudeep Salgia <ss3827@cornell.edu>.
Pseudocode | Yes | The pseudo code for the norm estimation stage is given in Algorithms 1 and 2.
Open Source Code | No | The information is insufficient. The paper does not explicitly state that source code for its methodology is available or provide a link to a repository.
Open Datasets | No | The information is insufficient. The paper describes synthetic data generation for its experiments but does not provide access information or citations for any publicly available or open dataset.
Dataset Splits | No | The information is insufficient. The paper conducts simulations for a bandit problem and does not describe training, validation, or test dataset splits in the conventional sense.
Hardware Specification | No | The information is insufficient. The paper describes the experimental setup and parameters but does not provide any specific hardware details (e.g., GPU/CPU models, memory, cloud instances) used for running the simulations.
Software Dependencies | No | The information is insufficient. The paper does not specify any software dependencies, libraries, or their version numbers required for reproducibility of the experiments.
Experiment Setup | Yes | We consider a distributed linear bandit instance with d = 20, M = 10 agents which is run for a time horizon of T = 10^6 steps. The underlying mean reward vector is drawn uniformly from the surface of a unit ball. The rewards are corrupted with a zero mean Gaussian with unit variance.
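
For concreteness, the described synthetic instance can be sketched in a few lines of Python. This is a minimal illustration, not the authors' released code: it assumes the standard Gaussian-normalization construction for a uniform draw from the unit sphere, and the variable names and the per-agent action sampling at the end are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

d = 20       # dimension of the feature space (from the paper)
M = 10       # number of agents (from the paper)
T = 10**6    # time horizon in steps (from the paper)

# Mean reward vector drawn uniformly from the surface of the unit ball:
# normalize a standard Gaussian vector.
theta = rng.standard_normal(d)
theta /= np.linalg.norm(theta)

def reward(action, rng=rng):
    """Noisy linear reward: <theta, action> plus zero-mean, unit-variance Gaussian noise."""
    return float(action @ theta) + rng.standard_normal()

# Illustrative usage: each of the M agents observes one reward for a random unit-norm action.
actions = rng.standard_normal((M, d))
actions /= np.linalg.norm(actions, axis=1, keepdims=True)
observations = [reward(a) for a in actions]
```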