Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
Variance Reduction in Stochastic Gradient Langevin Dynamics
Authors: Kumar Avinava Dubey, Sashank J. Reddi, Sinead A. Williamson, Barnabas Poczos, Alexander J. Smola, Eric P. Xing
NeurIPS 2016 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | This is complemented by impressive empirical results obtained on a variety of real world datasets, and on four different machine learning tasks (regression, classification, independent component analysis and mixture modeling). |
| Researcher Affiliation | Academia | Avinava Dubey, Sashank J. Reddi, Barnabás Póczos, Alexander J. Smola, Eric P. Xing Department of Machine Learning Carnegie-Mellon University Pittsburgh, PA 15213 EMAIL Sinead A. Williamson IROM/Statistics and Data Science University of Texas at Austin Austin, TX 78712 EMAIL |
| Pseudocode | Yes | Algorithm 1: SAGA-LD |
| Open Source Code | No | The paper does not provide concrete access to source code for the methodology described. |
| Open Datasets | Yes | We ran experiments on 11 standard UCI regression datasets, summarized in Table 1. The datasets can be downloaded from https://archive.ics.uci.edu/ml/index.html. We used a standard ICA dataset for our experiment [...] (footnote 3: The dataset can be downloaded from https://www.cis.hut.fi/projects/ica/eegmeg/MEG_data.html.) |
| Dataset Splits | Yes | In each case, we set the prior precision λ = 1, and we partitioned our dataset into training (70%), validation (10%), and test (20%) sets. The validation set is used to select the step size parameters, and we report the mean square error (MSE) evaluated on the test set, using 5-fold cross-validation. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., CPU/GPU models, memory) used for running its experiments. |
| Software Dependencies | No | The paper does not provide specific ancillary software details with version numbers (e.g., library or solver names with version numbers). |
| Experiment Setup | Yes | In all our experiments, we use a decreasing step size for SGLD as suggested by [15]. In particular, we use ϵ_t = a(b + t)^(−γ), where the parameters a, b and γ are chosen for each dataset to give the best performance of the algorithm on that particular dataset. For SAGA-LD, due to the benefit of variance reduction, we use a simple two phase constant step size selection strategy. The minibatch size, n, in both SGLD and SAGA-LD is held at a constant value of 10 throughout our experiments. All algorithms are initialized to the same point and the same sequence of minibatches is pre-generated and used in both algorithms. |
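
The setup quoted in the Dataset Splits and Experiment Setup rows can be sketched roughly as follows. This is a minimal illustration, not the authors' code (the paper releases none). It assumes the SGLD schedule reads ϵ_t = a(b + t)^(−γ) with γ > 0, that the 70/10/20 partition is a random shuffle, and that minibatches are drawn uniformly with replacement. The function names and every numeric value other than the batch size of 10 and the split fractions are hypothetical.

```python
import numpy as np


def sgld_step_size(t, a, b, gamma):
    """Decreasing step size eps_t = a * (b + t) ** (-gamma).

    Assumption: the exponent is negative, so the step shrinks as t grows.
    a, b and gamma are tuned per dataset in the paper; no values are given here.
    """
    return a * (b + t) ** (-gamma)


def partition_indices(num_points, rng, train_frac=0.7, val_frac=0.1):
    """Shuffle indices and split them into train / validation / test.

    The test set is whatever remains (20% with the default fractions).
    """
    perm = rng.permutation(num_points)
    n_train = int(round(train_frac * num_points))
    n_val = int(round(val_frac * num_points))
    train = perm[:n_train]
    val = perm[n_train:n_train + n_val]
    test = perm[n_train + n_val:]
    return train, val, test


def pregenerate_minibatches(num_train, num_steps, rng, batch_size=10):
    """Draw the whole minibatch index sequence up front.

    Reusing this array for two samplers gives both the same data order.
    Assumption: indices are sampled uniformly with replacement.
    """
    return rng.integers(0, num_train, size=(num_steps, batch_size))


if __name__ == "__main__":
    rng = np.random.default_rng(0)  # hypothetical seed
    train, val, test = partition_indices(1000, rng)
    batches = pregenerate_minibatches(len(train), num_steps=5, rng=rng)
    print(len(train), len(val), len(test), batches.shape)
    print([sgld_step_size(t, a=0.1, b=1.0, gamma=0.55) for t in range(3)])
```

Generating the minibatch sequence once and passing the same array to both samplers is one simple way to make sure any difference between them comes from the algorithm rather than from the data order.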