Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Byzantine-Resilient Decentralized Multi-Armed Bandits
Authors: Jingxuan Zhu, Alec Koppel, Alvaro Velasquez, Ji Liu
TMLR 2024 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments corroborate the merits of this framework in practice. [...] experimentally demonstrate that the proposed methodology achieves lower regret compared with single-agent (non-cooperative) UCB. [...] 5 Numerical Evaluation: We conduct experiments; additional studies are in Appendix G. We consider a four-arm bandit problem whose arm distributions are Bernoulli with mean 0.5, 0.45, 0.4, 0.3, respectively. [...] Figure present sample means and shaded standard deviations over 50 runs. |
| Researcher Affiliation | Collaboration | Jingxuan Zhu (Zhejiang Lab); Alec Koppel (J.P. Morgan AI Research); Alvaro Velasquez (University of Colorado Boulder); Ji Liu (Stony Brook University) |
| Pseudocode | Yes | Algorithm 1: Filter(i, k, t): Consistency and trimmed mean filters of agent i on arm k at time t [...] Algorithm 2: Resilient Decentralized UCB |
| Open Source Code | No | The paper does not contain any explicit statement about releasing source code or provide a link to a code repository. The OpenReview link provided is for peer review, not code. |
| Open Datasets | No | We consider a four-arm bandit problem whose arm distributions are Bernoulli with mean 0.5, 0.45, 0.4, 0.3, respectively. [...] For the simulation under the above application model, the reward distribution of each arm is set to be a Bernoulli distribution with a randomly generated mean. |
| Dataset Splits | No | The total time T is set as T = 10000. Figure present sample means and shaded standard deviations over 50 runs. The paper describes a simulation environment and the number of runs, but no explicit training/validation/test splits of a fixed dataset. |
| Hardware Specification | No | The paper does not provide any specific details about the hardware (e.g., CPU, GPU models, memory) used for running the experiments. |
| Software Dependencies | No | The paper does not specify any software names with version numbers (e.g., Python, PyTorch, CUDA versions) that would be needed to replicate the experiment. |
| Experiment Setup | Yes | We consider a four-arm bandit problem whose arm distributions are Bernoulli with mean 0.5, 0.45, 0.4, 0.3, respectively. The Byzantine agent broadcasts 0.4, 0.5, 0.4, 0.3 as the corresponding reward information of the four arms to all the normal agents, and sample count $n_{1,k}(t) = n_{i,k}(t)$ to all normal agents $j$ with $i$ being randomly selected in {2, 3, 4, 5}. The graph in our experiments follows a random structure where the probability that each directed edge is activated is a common value $q$. The total time T is set as T = 10000. [...] We set $\kappa_i$ to a uniform value of 1.3, N = 10, f = 2, and q = 0.8. |
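
Since no source code is released, the quoted experiment setup can be approximated with a short simulation. The sketch below is illustrative only and is not the paper's Algorithm 1 or Algorithm 2: it builds the four-arm Bernoulli environment, samples a random directed graph in which each edge is activated with probability q, and runs a standard single-agent UCB1 baseline. The function names (`sample_directed_graph`, `run_ucb1`), the UCB1 exploration bonus, and the seed handling are assumptions introduced here, not details taken from the paper.

```python
import math
import random

# Values quoted in the Experiment Setup row.
ARM_MEANS = [0.5, 0.45, 0.4, 0.3]  # Bernoulli means of the four arms
T = 10_000                         # total time horizon
N_AGENTS = 10                      # N
Q = 0.8                            # edge activation probability q


def sample_directed_graph(n, q, rng):
    """Hypothetical helper: return the set of directed edges (i, j), i != j,
    where each edge is activated independently with probability q."""
    return {
        (i, j)
        for i in range(n)
        for j in range(n)
        if i != j and rng.random() < q
    }


def run_ucb1(means, horizon, rng):
    """Standard single-agent UCB1 (an assumed baseline, not the paper's method).
    Returns the cumulative pseudo-regret and the per-arm pull counts."""
    k = len(means)
    counts = [0] * k
    sums = [0.0] * k
    best = max(means)
    regret = 0.0
    for t in range(1, horizon + 1):
        if t <= k:
            arm = t - 1  # pull each arm once to initialize
        else:
            arm = max(
                range(k),
                key=lambda a: sums[a] / counts[a]
                + math.sqrt(2.0 * math.log(t) / counts[a]),
            )
        reward = 1.0 if rng.random() < means[arm] else 0.0
        counts[arm] += 1
        sums[arm] += reward
        regret += best - means[arm]
    return regret, counts


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed, chosen here for repeatability
    edges = sample_directed_graph(N_AGENTS, Q, rng)
    regret, counts = run_ucb1(ARM_MEANS, T, rng)
    print(f"active edges: {len(edges)} of {N_AGENTS * (N_AGENTS - 1)}")
    print(f"UCB1 pseudo-regret after {T} steps: {regret:.1f}, pulls: {counts}")
```

Averaging `run_ucb1` over 50 independent seeds would mirror the number of runs reported in the table; the Byzantine broadcast and the filtering steps are deliberately left out because the table does not quote enough of them to reproduce faithfully.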