Stochastic Gradient Hamiltonian Monte Carlo
Authors: Tianqi Chen, Emily Fox, Carlos Guestrin
ICML 2014
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | A number of simulated experiments validate our theoretical results and demonstrate the differences between (i) exact HMC, (ii) the naïve implementation of stochastic gradient HMC (simply replacing the gradient with a stochastic gradient), and (iii) our proposed method incorporating friction. We also compare to the first-order Langevin dynamics of SGLD. Finally, we apply our proposed methods to a classification task using Bayesian neural networks and to online Bayesian matrix factorization of a standard movie dataset. Our experimental results demonstrate the effectiveness of the proposed algorithm. (A sketch of the friction-corrected update contrasted here is given after the table.) |
| Researcher Affiliation | Academia | Tianqi Chen (TQCHEN@CS.WASHINGTON.EDU), Emily B. Fox (EBFOX@STAT.WASHINGTON.EDU), Carlos Guestrin (GUESTRIN@CS.WASHINGTON.EDU), MODE Lab, University of Washington, Seattle, WA. |
| Pseudocode | Yes | Algorithm 1: Hamiltonian Monte Carlo; Algorithm 2: Stochastic Gradient HMC |
| Open Source Code | No | The paper does not provide an unambiguous statement about releasing source code for the methodology described in this paper, nor does it provide a specific repository link. |
| Open Datasets | Yes | "We also test our method on a handwritten digits classification task using the MNIST dataset." and "We conduct an experiment in online Bayesian PMF on the Movielens dataset ml-1M." (footnote 5: http://grouplens.org/datasets/movielens/) |
| Dataset Splits | Yes | We randomly split a validation set containing 10,000 instances from the training data in order to select training parameters, and use the remaining 50,000 instances for training. |
| Hardware Specification | No | The paper does not provide specific hardware details (exact GPU/CPU models, processor types with speeds, memory amounts, or detailed computer specifications) used for running its experiments. |
| Software Dependencies | No | The paper does not provide specific ancillary software details (e.g., library or solver names with version numbers like Python 3.8, CPLEX 12.4) needed to replicate the experiment. |
| Experiment Setup | Yes | For the sampling-based methods, we take a fully Bayesian approach and place a weakly informative gamma prior on each layer's weight regularizer λ. The sampling procedure is carried out by running SGHMC and SGLD using minibatches of 500 training instances, then resampling hyperparameters after an entire pass over the training set. We run the samplers for 800 iterations (each over the entire training dataset) and discard the initial 50 samples as burn-in. (A schedule sketch follows the table.) |
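
The comparison in the Research Type row hinges on the friction term that distinguishes SGHMC from a naïve stochastic-gradient HMC. Below is a minimal sketch of that update in the parameterization of Eq. (15) of the paper (θ ← θ + v; v ← v − η∇Ũ(θ) − αv + N(0, 2(α − β̂)η)). The function names and default values are illustrative assumptions, not the authors' code:

```python
import numpy as np

def sghmc_sample(grad_U_hat, theta0, n_steps, eta=1e-3, alpha=0.01, beta_hat=0.0):
    """Minimal SGHMC sketch (update form of Eq. (15) in Chen et al., 2014).

    grad_U_hat(theta) returns a minibatch estimate of the gradient of the
    negative log posterior. In the paper's notation, eta = eps^2 * M^{-1} is
    the learning rate, alpha = eps * M^{-1} * C is the momentum decay
    (friction), and beta_hat estimates the stochastic-gradient noise
    (beta_hat = 0 is the simple practical choice the paper discusses).
    All names here are ours, not the authors'.
    """
    theta = np.array(theta0, dtype=float)
    v = np.zeros_like(theta)  # momentum (velocity) variable
    samples = []
    for _ in range(n_steps):
        # The friction term (-alpha * v) and the injected noise counteract
        # the extra variance of the stochastic gradient; dropping both gives
        # the "naive" stochastic gradient HMC the paper shows misbehaving.
        noise = np.random.randn(*theta.shape) * np.sqrt(2.0 * (alpha - beta_hat) * eta)
        v = v - eta * grad_U_hat(theta) - alpha * v + noise
        theta = theta + v
        samples.append(theta.copy())
    return samples

# Toy usage: sample a 1-D standard normal from noisy gradients of U(x) = x^2/2.
noisy_grad = lambda th: th + 0.1 * np.random.randn(*th.shape)
draws = sghmc_sample(noisy_grad, theta0=np.zeros(1), n_steps=5000)
```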
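
The Dataset Splits and Experiment Setup rows together describe a concrete schedule: hold out 10,000 of the 60,000 MNIST training instances for validation, run the sampler with minibatches of 500, resample the gamma-prior regularizers λ after each full pass, and keep 750 of 800 passes after a 50-pass burn-in. The loop below is a hypothetical outline of that schedule under those stated settings; `sghmc_step` and `resample_lambda` are placeholder callables, not the authors' implementation:

```python
import numpy as np

def run_bnn_sampler(train_x, train_y, init_params, sghmc_step, resample_lambda,
                    n_epochs=800, burn_in=50, minibatch=500):
    """Hypothetical outline of the schedule in the Experiment Setup row."""
    # Dataset Splits row: 10,000 validation instances are split from the
    # training data beforehand; train_x/train_y hold the remaining 50,000.
    params, lam = init_params()
    posterior_samples = []
    n = train_x.shape[0]
    for epoch in range(n_epochs):                      # 800 passes over the data
        order = np.random.permutation(n)
        for start in range(0, n, minibatch):           # minibatches of 500 instances
            idx = order[start:start + minibatch]
            params = sghmc_step(params, lam, train_x[idx], train_y[idx])
        lam = resample_lambda(params)                  # resample gamma-prior lambdas per pass
        if epoch >= burn_in:                           # discard the first 50 as burn-in
            posterior_samples.append(params)
    return posterior_samples
```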