Variance Reduction in Stochastic Gradient Langevin Dynamics
Authors: Kumar Avinava Dubey, Sashank J. Reddi, Sinead A. Williamson, Barnabas Poczos, Alexander J. Smola, Eric P. Xing
NeurIPS 2016 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | This is complemented by impressive empirical results obtained on a variety of real world datasets, and on four different machine learning tasks (regression, classification, independent component analysis and mixture modeling). |
| Researcher Affiliation | Academia | Avinava Dubey, Sashank J. Reddi, Barnabás Póczos, Alexander J. Smola, Eric P. Xing (Department of Machine Learning, Carnegie-Mellon University, Pittsburgh, PA 15213; {akdubey, sjakkamr, bapoczos, alex, epxing}@cs.cmu.edu); Sinead A. Williamson (IROM/Statistics and Data Science, University of Texas at Austin, Austin, TX 78712; sinead.williamson@mccombs.utexas.edu) |
| Pseudocode | Yes | Algorithm 1: SAGA-LD |
| Open Source Code | No | The paper does not provide concrete access to source code for the methodology described. |
| Open Datasets | Yes | We ran experiments on 11 standard UCI regression datasets, summarized in Table 1. The datasets can be downloaded from https://archive.ics.uci.edu/ml/index.html. We used a standard ICA dataset for our experiment... The dataset can be downloaded from https://www.cis.hut.fi/projects/ica/eegmeg/MEG_data.html. |
| Dataset Splits | Yes | In each case, we set the prior precision λ = 1, and we partitioned our dataset into training (70%), validation (10%), and test (20%) sets. The validation set is used to select the step size parameters, and we report the mean square error (MSE) evaluated on the test set, using 5-fold cross-validation. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., CPU/GPU models, memory) used for running its experiments. |
| Software Dependencies | No | The paper does not provide specific ancillary software details with version numbers (e.g., library or solver names with version numbers). |
| Experiment Setup | Yes | In all our experiments, we use a decreasing step size for SGLD as suggested by [15]. In particular, we use ϵ_t = a(b + t)^(−γ), where the parameters a, b and γ are chosen for each dataset to give the best performance of the algorithm on that particular dataset. For SAGA-LD, due to the benefit of variance reduction, we use a simple two phase constant step size selection strategy. The minibatch size, n, in both SGLD and SAGA-LD is held at a constant value of 10 throughout our experiments. All algorithms are initialized to the same point and the same sequence of minibatches is pre-generated and used in both algorithms. |
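
The Pseudocode row points to Algorithm 1 (SAGA-LD). As a reading aid, here is a minimal NumPy sketch of that update: maintain a table of stored per-example gradients, form a variance-reduced estimate of the log-posterior gradient from a minibatch, take a Langevin step, and refresh the stored gradients for the visited examples. The names `grad_log_prior`, `grad_log_lik`, and `step_size` are hypothetical placeholders, not from the paper, and the sketch omits the sparse-update and implementation details the paper discusses.

```python
import numpy as np

def saga_ld(grad_log_prior, grad_log_lik, X, theta0, step_size,
            n_iters, batch_size=10, rng=None):
    """Sketch of the SAGA-LD update, assuming grad_log_lik(theta, x_i)
    returns the gradient of log p(x_i | theta) for a single example."""
    rng = np.random.default_rng() if rng is None else rng
    N = X.shape[0]
    theta = np.asarray(theta0, dtype=float).copy()

    # Stored per-example gradients g_i and their running sum.
    g = np.stack([grad_log_lik(theta, X[i]) for i in range(N)])
    g_sum = g.sum(axis=0)

    samples = []
    for t in range(n_iters):
        eps = step_size(t)
        idx = rng.choice(N, size=batch_size, replace=False)

        # Fresh minibatch gradients at the current iterate.
        g_new = np.stack([grad_log_lik(theta, X[i]) for i in idx])

        # Variance-reduced estimate of the full log-posterior gradient.
        grad_est = (grad_log_prior(theta)
                    + (N / batch_size) * (g_new - g[idx]).sum(axis=0)
                    + g_sum)

        # Langevin step: half-step along the gradient plus N(0, eps) noise.
        noise = rng.normal(scale=np.sqrt(eps), size=theta.shape)
        theta = theta + 0.5 * eps * grad_est + noise

        # Refresh the stored gradients for the visited indices.
        g_sum += (g_new - g[idx]).sum(axis=0)
        g[idx] = g_new

        samples.append(theta.copy())
    return np.array(samples)
```

For the regression experiments quoted above, `grad_log_lik` would correspond to a Gaussian likelihood and `grad_log_prior` to the λ = 1 prior mentioned in the Dataset Splits row; that pairing is an assumption for illustration, not a statement of the paper's exact model.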
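
The Dataset Splits row quotes a 70% training / 10% validation / 20% test partition evaluated with 5-fold cross-validation. One plausible reading, sketched below under that assumption, is that each of the 5 folds supplies the 20% test set, with 10% of the full data held out from the remainder for step-size selection.

```python
import numpy as np
from sklearn.model_selection import KFold, train_test_split

def cv_splits(X, y, seed=0):
    """Hypothetical 5-fold protocol: 20% test fold, 10% validation, 70% training."""
    kf = KFold(n_splits=5, shuffle=True, random_state=seed)
    for rest_idx, test_idx in kf.split(X):
        # 1/8 of the remaining 80% equals 10% of the full dataset.
        tr_idx, val_idx = train_test_split(rest_idx, test_size=0.125,
                                           random_state=seed)
        yield ((X[tr_idx], y[tr_idx]),
               (X[val_idx], y[val_idx]),
               (X[test_idx], y[test_idx]))
```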
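
The Experiment Setup row describes two step-size rules: the decreasing SGLD schedule ϵ_t = a(b + t)^(−γ) and a two-phase constant schedule for SAGA-LD. A minimal sketch of both follows; the switch point and the two constant values for SAGA-LD are illustrative assumptions, since the quoted text does not specify them.

```python
def sgld_step_size(t, a, b, gamma):
    """Decreasing SGLD schedule eps_t = a * (b + t)^(-gamma); a, b, gamma tuned per dataset."""
    return a * (b + t) ** (-gamma)

def saga_ld_step_size(t, eps_early, eps_late, switch_iter):
    """Hypothetical two-phase constant schedule: one constant early, a smaller one later."""
    return eps_early if t < switch_iter else eps_late
```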