Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Byzantine-Resilient Decentralized Multi-Armed Bandits

Authors: Jingxuan Zhu, Alec Koppel, Alvaro Velasquez, Ji Liu

TMLR 2024 | Venue PDF | LLM Run Details

Reproducibility Variable Result LLM Response
Research Type Experimental Experiments corroborate the merits of this framework in practice. [...] experimentally demonstrate that the proposed methodology achieves lower regret compared with single-agent (non-cooperative) UCB. [...] 5 Numerical Evaluation: We conduct experiments; additional studies are in Appendix G. We consider a four-arm bandit problem whose arm distributions are Bernoulli with mean 0.5, 0.45, 0.4, 0.3, respectively. [...] Figures present sample means and shaded standard deviations over 50 runs.
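The evaluation compares against single-agent (non-cooperative) UCB on this four-arm Bernoulli instance. A minimal sketch of such a baseline (standard UCB1; the function name and structure are illustrative, not the paper's implementation, and the resilient decentralized algorithm itself is not reproduced):

```python
import math
import random

def ucb1_regret(means, T, seed=0):
    """Run single-agent UCB1 on Bernoulli arms; return cumulative pseudo-regret.

    Illustrative baseline only -- not the paper's Algorithm 2.
    """
    rng = random.Random(seed)
    K = len(means)
    counts = [0] * K
    sums = [0.0] * K
    best = max(means)
    regret = 0.0
    for t in range(T):
        if t < K:        # play each arm once to initialize
            a = t
        else:            # UCB1 index: empirical mean + sqrt(2 ln t / n)
            a = max(range(K),
                    key=lambda k: sums[k] / counts[k]
                    + math.sqrt(2.0 * math.log(t + 1) / counts[k]))
        reward = 1.0 if rng.random() < means[a] else 0.0
        counts[a] += 1
        sums[a] += reward
        regret += best - means[a]
    return regret
```

Pseudo-regret is measured against the best arm's true mean, matching the regret notion typically plotted in such evaluations.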
Researcher Affiliation Collaboration Jingxuan Zhu (Zhejiang Lab); Alec Koppel (J.P. Morgan AI Research); Alvaro Velasquez (University of Colorado Boulder); Ji Liu (Stony Brook University)
Pseudocode Yes Algorithm 1: Filter(i, k, t): Consistency and trimmed mean filters of agent i on arm k at time t [...] Algorithm 2: Resilient Decentralized UCB
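Algorithm 1 combines a consistency filter with a trimmed-mean filter. A minimal sketch of the trimmed-mean step alone, assuming symmetric trimming of the f smallest and f largest received values (the consistency check and the paper's exact Filter(i, k, t) logic are omitted):

```python
def trimmed_mean(values, f):
    """Average the received values after discarding the f smallest and
    f largest -- a standard Byzantine-robust aggregation step.

    Sketch only; not the paper's exact Filter(i, k, t).
    """
    if len(values) <= 2 * f:
        raise ValueError("need more than 2f values to survive trimming")
    v = sorted(values)
    kept = v[f:len(v) - f]
    return sum(kept) / len(kept)
```

With f = 2 Byzantine agents, any two extreme (possibly adversarial) values are discarded before averaging, which is why the setup requires strictly more than 2f reporting neighbors.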
Open Source Code No The paper does not contain any explicit statement about releasing source code or provide a link to a code repository. The OpenReview link provided is for peer review, not code.
Open Datasets No We consider a four-arm bandit problem whose arm distributions are Bernoulli with mean 0.5, 0.45, 0.4, 0.3, respectively. [...] For the simulation under the above application model, the reward distribution of each arm is set to be a Bernoulli distribution with a randomly generated mean.
Dataset Splits No The total time T is set as T = 10000. Figures present sample means and shaded standard deviations over 50 runs. The paper describes a simulation environment and number of runs, but not explicit training/test/validation splits of a fixed dataset.
Hardware Specification No The paper does not provide any specific details about the hardware (e.g., CPU, GPU models, memory) used for running the experiments.
Software Dependencies No The paper does not specify any software names with version numbers (e.g., Python, PyTorch, CUDA versions) that would be needed to replicate the experiment.
Experiment Setup Yes We consider a four-arm bandit problem whose arm distributions are Bernoulli with mean 0.5, 0.45, 0.4, 0.3, respectively. The Byzantine agent broadcasts 0.4, 0.5, 0.4, 0.3 as the corresponding reward information of the four arms to all the normal agents, and sample count n1,k(t) = ni,k(t) to all normal agents j, with i randomly selected from {2, 3, 4, 5}. The graph in our experiments follows a random structure in which each directed edge is activated independently with a common probability q. The total time T is set as T = 10000. [...] We set κi to a uniform value of 1.3, N = 10, f = 2, and q = 0.8.
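The reported configuration (N = 10 agents, f = 2 Byzantine agents, edge-activation probability q = 0.8, horizon T = 10000) can be scaffolded as below. Names and structure are illustrative assumptions; the resilient UCB logic itself is not reproduced:

```python
import numpy as np

MEANS = np.array([0.5, 0.45, 0.4, 0.3])  # arm means from the paper's setup
N, f, q, T = 10, 2, 0.8, 10_000          # agents, Byzantine bound, edge prob., horizon

def sample_graph(n: int, q: float, rng) -> np.ndarray:
    """Random communication graph for one round: each directed edge is
    independently activated with probability q; self-loops always present."""
    adj = rng.random((n, n)) < q
    np.fill_diagonal(adj, True)
    return adj

def pull(arm: int, rng) -> float:
    """Bernoulli reward drawn with the chosen arm's true mean."""
    return float(rng.random() < MEANS[arm])
```

Each round, every normal agent would pull an arm, observe a Bernoulli reward, and exchange (filtered) statistics with its in-neighbors in the freshly sampled graph.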