Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Harnessing the Power of Federated Learning in Federated Contextual Bandits
Authors: Chengshuai Shi, Ruida Zhou, Kun Yang, Cong Shen
TMLR 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We substantiate these claims through rigorous theoretical analyses and empirical evaluations. ... Experimental results using real-world data with several different FL choices corroborate the practicability and flexibility of FedIGW. ... In this section, we report the empirical performances of FedIGW on two distinct real-world multi-label classification datasets, Bibtex (Katakis et al., 2008) and Delicious (Tsoumakas et al., 2008), which are also used in other practical CB investigations such as Cortes (2018). |
| Researcher Affiliation | Academia | Chengshuai Shi (EMAIL), Department of Electrical and Computer Engineering, University of Virginia; Ruida Zhou (EMAIL), Department of Electrical and Computer Engineering, University of California, Los Angeles; Kun Yang (EMAIL), Department of Electrical and Computer Engineering, University of Virginia; Cong Shen (EMAIL), Department of Electrical and Computer Engineering, University of Virginia |
| Pseudocode | Yes | Algorithm 1 FedIGW (Agent m) ... Algorithm 2 The FL component commonly adopted in existing studies on federated linear bandits: one-shot aggregation of compressed local data ... Algorithm 3 The (simplified) FedAvg algorithm as an example of the canonical FL framework: multi-round aggregation of local model parameters |
| Open Source Code | Yes | Additional experimental details and results are discussed in Appendix G, while the codes for the experiments can be found at https://github.com/ShenGroup/FedIGW. |
| Open Datasets | Yes | In this section, we report the empirical performances of FedIGW on two distinct real-world multi-label classification datasets, Bibtex (Katakis et al., 2008) and Delicious (Tsoumakas et al., 2008), which are also used in other practical CB investigations such as Cortes (2018). |
| Dataset Splits | No | at each time step, a context is randomly sampled from the dataset while the true labels are concealed from the agents. |
| Hardware Specification | No | The paper mentions two-layer multi-layer perceptrons (MLPs) are used to approximate reward functions but provides no specific details about the hardware used to run the experiments, such as CPU or GPU models. |
| Software Dependencies | No | The paper states that codes for the experiments are available but does not provide specific version numbers for any software libraries, programming languages, or other dependencies in the text. |
| Experiment Setup | Yes | For practical conveniences, instead of selecting a theoretically sound but sophisticated choice of γ for FedIGW as in Theorem 4.1, we set it as a constant hyper-parameter and perform some preliminary manual selections with the final adopted values reported in Table 5. We believe this approach is more practically appealing as it does not need to scale γ consistently; a similar choice of using constant γs is also adopted in Agarwal et al. (2023). Also, the temperature parameter ζ used in softmax can be found in Table 5. ... During each FL process, the local batch size, the number of communications, and the local learning rate are specified in Table 5. Moreover, the epoch length is designed to be growing exponentially as in Corollaries 4.2, D.8 and E.2, i.e., τ_l = 2^l, while culminating at an upper limit of 4096 to maintain timely updates. |
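The quoted setup combines an inverse-gap-weighting (IGW) exploration rule with a constant γ and an epoch schedule τ_l = 2^l capped at 4096. A minimal sketch of those two mechanics is below; the function names and the exact IGW denominator follow the standard textbook form and are assumptions, not taken from the paper's released code.

```python
import numpy as np

def igw_probs(rewards, gamma):
    """Inverse Gap Weighting over K predicted rewards.

    Each non-greedy arm a gets probability 1 / (K + gamma * (r_best - r_a));
    the greedy arm absorbs the remaining mass. Larger gamma means less
    exploration. (Standard IGW form, assumed here for illustration.)
    """
    rewards = np.asarray(rewards, dtype=float)
    K = len(rewards)
    best = int(np.argmax(rewards))
    probs = 1.0 / (K + gamma * (rewards[best] - rewards))
    probs[best] = 0.0                 # zero out greedy arm first...
    probs[best] = 1.0 - probs.sum()   # ...then give it the leftover mass
    return probs

def epoch_lengths(num_epochs, cap=4096):
    """Epoch schedule tau_l = 2**l, capped at `cap` (4096 in the paper)."""
    return [min(2 ** l, cap) for l in range(1, num_epochs + 1)]
```

With `rewards = [1.0, 0.5, 0.0]` and `gamma = 10`, the greedy arm keeps roughly 0.8 of the probability mass while the two suboptimal arms share the rest in inverse proportion to their reward gaps; the schedule doubles each epoch until it saturates at the cap.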