Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Cooperative Online Learning with Feedback Graphs

Authors: Nicolò Cesa-Bianchi, Tommaso Cesari, Riccardo Della Vecchia

TMLR 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experiments on synthetic data confirm our theoretical findings.
Researcher Affiliation | Academia | Nicolò Cesa-Bianchi EMAIL Politecnico di Milano & Università degli Studi di Milano, Milano, Italy; Tommaso Cesari EMAIL School of Electrical Engineering and Computer Science, University of Ottawa, Ottawa, Canada; Riccardo Della Vecchia EMAIL Inria, Université de Lille, CNRS, Centrale Lille, UMR 9189 CRIStAL
Pseudocode | Yes | Algorithm 1: Exp3-α² (Locally run by each agent v ∈ A)
Open Source Code | Yes | Our code is available at Della Vecchia (2024). Riccardo Della Vecchia. Cooperative online learning with feedback graphs. https://github.com/riccardodv/COOP-learning, 2024.
Open Datasets | No | Experiments on synthetic data confirm our theoretical findings. In our experiments, we fix the time horizon (T = 10,000), the number of arms (K = 20), and the number of agents (A = 20). ... The feedback graph F and the communication graph N are Erdős-Rényi random graphs of parameters p_N, p_F ∈ {0.2, 0.8}.
Dataset Splits | No | The paper uses synthetic data generated from specified parameters (time horizon, number of arms, number of agents, graph types, etc.) and does not describe traditional dataset splits (e.g., train/test/validation percentages or counts) for a pre-existing dataset. It mentions 20 repetitions of each experiment for statistical averaging.
Hardware Specification | Yes | Experiments were run on a local cluster of CPUs (Intel Xeon E5-2623 v3, 3.00GHz), parallelizing the code over four cores.
Software Dependencies | No | The paper does not provide specific software dependencies with version numbers.
Experiment Setup | Yes | In our experiments, we fix the time horizon (T = 10,000), the number of arms (K = 20), and the number of agents (A = 20). We also set the delay δN to 1. The loss of each action is a Bernoulli random variable of parameter 1/2, except for the optimal action, which has parameter 1/2 − √(K/T). The activation probabilities q(v) are the same for all agents v ∈ A, and range in the set {0.05, 0.5, 1}. This implies that Q ∈ {1, 10, 20}. The feedback graph F and the communication graph N are Erdős-Rényi random graphs of parameters p_N, p_F ∈ {0.2, 0.8}. For each choice of the parameters, the same realization of N and F was kept fixed in all the experiments, see Figure 1.
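
The synthetic setup quoted in the Experiment Setup entry can be sketched in a few lines of NumPy. This is an illustrative sketch only, not the authors' code (which lives in the repository cited above). It assumes that the optimal action's Bernoulli parameter reads 1/2 − √(K/T), that both graphs are undirected with no self-loops, that the optimal arm has index 0, and that one loss vector per round is shared by all agents. The helper names `erdos_renyi`, `loss_means`, and `sample_round` are hypothetical.

```python
import numpy as np

# Values taken from the quoted setup: horizon, number of arms, number of agents.
T, K, A = 10_000, 20, 20


def erdos_renyi(n, p, rng):
    """Boolean adjacency matrix of an undirected G(n, p) graph.

    Assumption: no self-loops; each unordered pair is an edge with probability p.
    """
    upper = np.triu(rng.random((n, n)) < p, k=1)  # sample each pair once
    return upper | upper.T  # symmetrize; the diagonal stays False


def loss_means(K, T, best_arm=0):
    """Bernoulli parameters: 1/2 for every arm, 1/2 - sqrt(K/T) for the best one.

    Assumption: the index of the optimal arm is arbitrary; 0 is used here.
    """
    means = np.full(K, 0.5)
    means[best_arm] = 0.5 - np.sqrt(K / T)
    return means


def sample_round(means, q, rng):
    """Draw one round: a Bernoulli loss per arm and an activation flag per agent."""
    losses = (rng.random(means.shape[0]) < means).astype(float)
    active = rng.random(q.shape[0]) < q  # agent v is active with probability q[v]
    return losses, active


rng = np.random.default_rng(0)
N = erdos_renyi(A, 0.2, rng)  # communication graph over agents, p_N = 0.2
F = erdos_renyi(K, 0.8, rng)  # feedback graph over arms, p_F = 0.8
means = loss_means(K, T)
q = np.full(A, 0.05)          # identical activation probability for all agents
Q = q.sum()                   # A * q(v); close to 1 for q(v) = 0.05
losses, active = sample_round(means, q, rng)
```

With q(v) equal to 0.05, 0.5, or 1 for all 20 agents, the sum Q = A · q(v) equals 1, 10, or 20, which matches the values stated in the setup. Sampling N and F once and reusing them across runs mirrors the statement that the same graph realizations were kept fixed in all experiments.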