Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Decentralized Stochastic Gradient Langevin Dynamics and Hamiltonian Monte Carlo
Authors: Mert Gürbüzbalaban, Xuefeng Gao, Yuanhan Hu, Lingjiong Zhu
JMLR 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We illustrate the efficiency of our algorithms on decentralized Bayesian linear regression and Bayesian logistic regression problems. Finally, we provide numerical experiments that illustrate our theory and showcase the practical performance of the DE-SGLD and DE-SGHMC algorithms: We show on Bayesian linear regression and Bayesian logistic regression tasks that our method allows each agent to sample from the posterior distribution efficiently without communicating local data. |
| Researcher Affiliation | Academia | Mert Gurbuzbalaban, Department of Management Science and Information Systems, Rutgers Business School, Piscataway, NJ 08854, United States of America; Xuefeng Gao, Department of Systems Engineering and Engineering Management, The Chinese University of Hong Kong, Shatin, N.T., Hong Kong, China; Yuanhan Hu, Department of Management Science and Information Systems, Rutgers Business School, Piscataway, NJ 08854, United States of America; Lingjiong Zhu, Department of Mathematics, Florida State University, Tallahassee, FL 32306, United States of America |
| Pseudocode | No | The paper defines the algorithms (DE-SGLD and DE-SGHMC) through mathematical equations (e.g., equations (2), (32), (33)) but does not include a distinct, structured pseudocode block or algorithm listing. |
| Open Source Code | No | The paper does not provide concrete access to source code for the methodology described. It only mentions the license for the paper itself: "License: CC-BY 4.0, see https://creativecommons.org/licenses/by/4.0/. Attribution requirements are provided at http://jmlr.org/papers/v22/21-0307.html." |
| Open Datasets | Yes | In this section, we present our experiments on the Bayesian linear regression problem, where our main goal is to validate Theorems 2 and 12 in a basic setting and show that each agent can sample from the posterior distribution up to an error tolerance with constant stepsize. In this set of experiments, we first generate data for each agent by simulating the model: (...) We consider the Bayesian logistic regression problem on the UCI ML Breast Cancer Wisconsin (Diagnostic) data set and the MAGIC Gamma Telescope data set. The Breast Cancer data set is available online at https://archive.ics.uci.edu/ml/datasets/Breast+Cancer+Wisconsin+(Diagnostic), and the Telescope data set at https://archive.ics.uci.edu/ml/datasets/magic+gamma+telescope. |
| Dataset Splits | Yes | We simulate 5,000 data points and partition them randomly among the N = 100 agents so that each agent will have the same number of data points. (...) For the Breast Cancer data set, we separate the data set into N = 6 parts of approximately equal size, and each agent can access only one part of the whole data set. (...) For the Telescope data set, we separate the data set into training data and test data, where the test data has 10% of the data points. |
| Hardware Specification | No | The paper does not provide any specific hardware details for running its experiments. |
| Software Dependencies | No | The paper does not provide specific ancillary software details with version numbers. |
| Experiment Setup | Yes | We tune the stepsize η to the dataset where we take η = 0.009. (...) The stepsize η and the friction coefficient γ are tuned to the dataset where we take η = 0.1 and γ = 7. (...) In Figure 4(b), we used stochastic gradients with batch size b = 25 while we varied the stepsize. (...) Here, the stepsizes are tuned to the dataset where we take η = 0.0003. We use the stochastic gradient with batch size b = 32 in the experiments. (...) where we take η = 0.02 and γ = 30 after tuning these parameters to the dataset. We use the batch size b = 32 in this set of experiments. (...) Here, the stepsizes are chosen as η = 0.0008. We use batch size b = 32 in the experiments. (...) Here, the stepsize η and the friction coefficient γ are well tuned to the data set so that we take η = 0.05, γ = 10. We use batch size b = 32 in the experiments. (...) Here, the stepsizes are chosen as η = 0.008. We use batch size b = 100 in the experiments. (...) Here, the stepsize η and the friction coefficient γ are tuned to the data set where we take η = 0.07, γ = 5. We use batch size b = 100 in the experiments. |
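Since the paper defines DE-SGLD only through equations rather than pseudocode, a minimal sketch may help readers see the shape of the iteration. This is an illustrative reconstruction, not the authors' code: the ring topology, mixing weights, local quadratic potentials `f_i(x) = 0.5 * (x - mu_i)**2`, and the stepsize value are all assumptions chosen for a toy run, and full local gradients are used in place of stochastic ones.

```python
import numpy as np

# Sketch of a DE-SGLD-style iteration (assumed form: each agent averages
# neighbors' iterates via a doubly stochastic matrix W, takes a gradient
# step on its local potential, and injects Gaussian noise).
rng = np.random.default_rng(0)
N = 10                   # number of agents (illustrative)
eta = 0.05               # constant stepsize (illustrative)
mu = rng.normal(size=N)  # local means; the global posterior mode is mu.mean()

# Doubly stochastic mixing matrix for a ring graph:
# each agent averages itself with its two neighbors.
W = np.zeros((N, N))
for i in range(N):
    W[i, i] = 0.5
    W[i, (i - 1) % N] = 0.25
    W[i, (i + 1) % N] = 0.25

x = np.zeros(N)  # one scalar iterate per agent
for _ in range(2000):
    grad = x - mu                      # gradient of each local quadratic
    noise = rng.normal(size=N)         # fresh Gaussian noise per agent
    x = W @ x - eta * grad + np.sqrt(2.0 * eta) * noise

print(x.mean())  # agents' average settles near the consensus region
```

With a doubly stochastic `W`, the network average of the iterates follows a single-chain Langevin-like recursion toward `mu.mean()`, which is the "each agent samples without sharing local data" behavior the paper's experiments illustrate.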