The promises and pitfalls of Stochastic Gradient Langevin Dynamics

Authors: Nicolas Brosse, Alain Durmus, Eric Moulines

NeurIPS 2018

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our findings are supported by limited numerical experiments.
Researcher Affiliation | Academia | Nicolas Brosse, Éric Moulines: Centre de Mathématiques Appliquées, UMR 7641, Ecole Polytechnique, Palaiseau, France (nicolas.brosse@polytechnique.edu, eric.moulines@polytechnique.edu). Alain Durmus: CMLA, Ecole Normale Supérieure, 61 Av. du Président Wilson, 94235 Cachan Cedex, France (alain.durmus@cmla.ens-cachan.fr).
Pseudocode | No | The paper provides mathematical formulations of algorithms (e.g., Equations (2)–(5)) but does not include structured pseudocode or algorithm blocks.
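Since the paper states the algorithms only as equations, the basic SGLD recursion (a step along a stochastic estimate of the log-posterior gradient plus Gaussian noise of variance 2γ) can be sketched as below. The toy standard-Gaussian target, the function names, and the step size are illustrative assumptions, not code from the paper.

```python
import numpy as np

def sgld_step(theta, grad_log_post, gamma, rng):
    """One SGLD update: gradient step plus Gaussian noise of variance 2*gamma."""
    noise = rng.normal(size=theta.shape)
    return theta + gamma * grad_log_post(theta) + np.sqrt(2.0 * gamma) * noise

def run_sgld(theta0, grad_log_post, gamma, n_iter, rng):
    """Run SGLD for n_iter iterations and return the full trajectory."""
    samples = np.empty((n_iter, theta0.shape[0]))
    theta = theta0
    for k in range(n_iter):
        theta = sgld_step(theta, grad_log_post, gamma, rng)
        samples[k] = theta
    return samples

# Toy illustration: standard Gaussian target, so grad log pi(theta) = -theta.
rng = np.random.default_rng(0)
samples = run_sgld(np.zeros(2), lambda t: -t, gamma=0.01, n_iter=1000, rng=rng)
```

In the paper's setting the gradient argument would be the stochastic (minibatch) estimate of the log-posterior gradient rather than the exact one used in this toy run.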
Open Source Code | No | The paper does not provide concrete access to source code for the methodology described in this paper.
Open Datasets | Yes | We then illustrate our results on the covertype dataset¹ with a Bayesian logistic regression model. The prior is a standard multivariate Gaussian distribution. Given the size of the dataset and the dimension of the problem, LMC requires high computational resources and is not included in the simulations. We truncate the training dataset at N ∈ {10^3, 10^4, 10^5}. For all algorithms, the step size γ is set equal to 1/N and the trajectories are started at θ̂, an estimator of θ⋆, computed using SGD combined with the BFGS algorithm. ¹https://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/binary/covtype.libsvm.binary.scale.bz2
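For the Bayesian logistic regression model described in this row, an unbiased minibatch gradient of the log-posterior (standard Gaussian prior plus the rescaled minibatch sum of per-example likelihood gradients) could be written as follows. The synthetic data, labels in {0, 1}, and the function name are assumptions for illustration; the paper's experiments use the covertype dataset at the footnoted URL.

```python
import numpy as np

def grad_log_post_minibatch(theta, X, y, idx):
    """Unbiased stochastic gradient of the log-posterior for Bayesian logistic
    regression with a standard Gaussian prior: the prior gradient -theta plus
    (N / batch size) times the minibatch sum of likelihood gradients."""
    N = X.shape[0]
    Xb, yb = X[idx], y[idx]
    p = 1.0 / (1.0 + np.exp(-Xb @ theta))  # sigmoid of the linear predictor
    grad_lik = (N / len(idx)) * (Xb.T @ (yb - p))
    grad_prior = -theta
    return grad_prior + grad_lik

# Illustrative synthetic data in place of covertype.
rng = np.random.default_rng(1)
X = rng.normal(size=(500, 5))
y = (X @ np.ones(5) + rng.normal(size=500) > 0).astype(float)
g = grad_log_post_minibatch(np.zeros(5), X, y, rng.choice(500, size=32, replace=False))
```

This stochastic gradient is what would be plugged into the SGLD-type updates in place of the full-data gradient.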
Dataset Splits | No | The paper mentions truncating the training dataset and evaluating on the test dataset for covertype, but does not provide specific train/validation/test splits (e.g., percentages or sample counts for a validation set) or cross-validation details.
Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory, or detailed machine specifications) used for running its experiments.
Software Dependencies | No | The paper mentions software such as SciPy [19] and Scikit-learn [28] but does not specify version numbers for these or other key software components used in the experiments.
Experiment Setup | Yes | For the LMC, SGLDFP, SGLD and SGD algorithms, the step size γ is set equal to (1 + δ/4)^(-1), where δ is the largest eigenvalue of X^T X. We start the algorithms at θ0 = θ̂ and run n = 1/γ iterations, where the first 10% of samples are discarded as a burn-in period.