Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
The No-U-Turn Sampler: Adaptively Setting Path Lengths in Hamiltonian Monte Carlo
Authors: Matthew D. Hoffman, Andrew Gelman
JMLR 2014
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this section we examine the effectiveness of the dual averaging algorithm outlined in Section 3.2, examine what values of the target δ in the dual averaging algorithm yield efficient samplers, and compare the efficiency of NUTS and HMC. We ran HMC (as implemented in Algorithm 5) and NUTS (as implemented in Algorithm 6) on four target distributions for 2000 iterations, allowing the step size ϵ to adapt via the dual averaging updates described in Section 3.2 for the first 1000 iterations. |
| Researcher Affiliation | Collaboration | Matthew D. Hoffman, Adobe Research, 601 Townsend St., San Francisco, CA 94110, USA; Andrew Gelman, Departments of Statistics and Political Science, Columbia University, New York, NY 10027, USA |
| Pseudocode | Yes | Pseudocode implementing an efficient version of NUTS is provided in Algorithm 3. A detailed derivation follows below, along with a simplified version of the algorithm that motivates and builds intuition about Algorithm 3 (but uses much more memory and makes smaller jumps). |
| Open Source Code | Yes | Our algorithm has been implemented in C++ as part of the new open-source Bayesian inference package, Stan (Stan Development Team, 2013). Matlab code implementing the algorithms, along with Stan code for models used in our simulation study, are also available at http://www.cs.princeton.edu/~mdhoffma/. |
| Open Datasets | Yes | In these experiments the target distribution is the posterior of a Bayesian logistic regression model fit to the German credit data set available from the UCI repository (Frank and Asuncion, 2010). |
| Dataset Splits | No | The paper describes the datasets used (e.g., 'German credit data set', 'S&P 500 index') and mentions the number of data points (e.g., '1000 customers', '3000 days of returns') but does not provide specific details on how these datasets were split into training, validation, or test sets for model development or evaluation. |
| Hardware Specification | No | The paper does not provide specific hardware details such as GPU/CPU models or processor types used for running the experiments. |
| Software Dependencies | Yes | Our algorithm has been implemented in C++ as part of the new open-source Bayesian inference package, Stan (Stan Development Team, 2013). Matlab code implementing the algorithms, along with Stan code for models used in our simulation study, are also available at http://www.cs.princeton.edu/~mdhoffma/. |
| Experiment Setup | Yes | In all experiments the dual averaging parameters were set to γ = 0.05, t0 = 10, and κ = 0.75. We evaluated HMC with 10 logarithmically spaced target simulation lengths λ per target distribution... We tried 15 evenly spaced values of the dual averaging target δ between 0.25 and 0.95 for NUTS and 8 evenly spaced values of the dual averaging target δ between 0.25 and 0.95 for HMC. For each sampler-simulation length-δ-target distribution combination we ran 10 iterations with different random seeds. |
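The experiment setup above fixes the dual averaging parameters (γ = 0.05, t0 = 10, κ = 0.75) and sweeps the acceptance target δ. As context for those parameters, the sketch below implements the paper's dual averaging step-size update (Section 3.2): after each iteration m, the running average of (δ − α_m) pulls log ϵ toward values where the acceptance statistic α matches the target δ, while a shrinkage point μ = log(10 ϵ0) and the decay exponent κ stabilize the averaged iterate. The function name and the offline list of acceptance statistics are illustrative conveniences, not the paper's interface; in practice α_m is computed inside each HMC/NUTS iteration.

```python
import numpy as np

def dual_averaging_step_sizes(alphas, eps0=1.0, delta=0.65,
                              gamma=0.05, t0=10, kappa=0.75):
    """Sketch of the dual averaging step-size adaptation of
    Hoffman & Gelman, Section 3.2.

    alphas : observed Metropolis acceptance statistics, one per
             adaptation iteration (here supplied as a plain list
             for illustration).
    Returns (last step size, averaged step size); the averaged
    value is what the sampler freezes after warmup.
    """
    mu = np.log(10 * eps0)   # shrinkage point: log of 10 * initial eps
    log_eps_bar = 0.0        # running averaged log step size
    h_bar = 0.0              # running average of (delta - alpha)
    for m, alpha in enumerate(alphas, start=1):
        eta = 1.0 / (m + t0)
        h_bar = (1 - eta) * h_bar + eta * (delta - alpha)
        # primal iterate: shrink toward mu, scaled by sqrt(m)/gamma
        log_eps = mu - (np.sqrt(m) / gamma) * h_bar
        # polynomially decaying average of the log step size
        w = m ** (-kappa)
        log_eps_bar = w * log_eps + (1 - w) * log_eps_bar
    return np.exp(log_eps), np.exp(log_eps_bar)

# If acceptance always hits the target delta, h_bar stays 0 and the
# step size sits at the shrinkage point exp(mu) = 10 * eps0.
eps, eps_bar = dual_averaging_step_sizes([0.65] * 100, delta=0.65)

# Persistently low acceptance (alpha < delta) drives the step size down.
eps_low, _ = dual_averaging_step_sizes([0.0] * 100, delta=0.65)
```

Freezing ϵ to the averaged iterate after 1000 warmup iterations, as the paper does, avoids the noise of the final primal iterate while still benefiting from the full adaptation trajectory.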