Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
The No-U-Turn Sampler: Adaptively Setting Path Lengths in Hamiltonian Monte Carlo
Authors: Matthew D. Hoffman, Andrew Gelman
JMLR 2014
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this section we examine the effectiveness of the dual averaging algorithm outlined in Section 3.2, examine what values of the target δ in the dual averaging algorithm yield efficient samplers, and compare the efficiency of NUTS and HMC. We ran HMC (as implemented in Algorithm 5) and NUTS (as implemented in Algorithm 6) on four target distributions for 2000 iterations, allowing the step size ϵ to adapt via the dual averaging updates described in Section 3.2 for the first 1000 iterations. |
| Researcher Affiliation | Collaboration | Matthew D. Hoffman, Adobe Research, 601 Townsend St., San Francisco, CA 94110, USA; Andrew Gelman, Departments of Statistics and Political Science, Columbia University, New York, NY 10027, USA |
| Pseudocode | Yes | Pseudocode implementing an efficient version of NUTS is provided in Algorithm 3. A detailed derivation follows below, along with a simplified version of the algorithm that motivates and builds intuition about Algorithm 3 (but uses much more memory and makes smaller jumps). |
| Open Source Code | Yes | Our algorithm has been implemented in C++ as part of the new open-source Bayesian inference package, Stan (Stan Development Team, 2013). Matlab code implementing the algorithms, along with Stan code for models used in our simulation study, are also available at http://www.cs.princeton.edu/~mdhoffma/. |
| Open Datasets | Yes | In these experiments the target distribution is the posterior of a Bayesian logistic regression model fit to the German credit data set available from the UCI repository (Frank and Asuncion, 2010). |
| Dataset Splits | No | The paper describes the datasets used (e.g., 'German credit data set', 'S&P 500 index') and mentions the number of data points (e.g., '1000 customers', '3000 days of returns') but does not provide specific details on how these datasets were split into training, validation, or test sets for model development or evaluation. |
| Hardware Specification | No | The paper does not provide specific hardware details such as GPU/CPU models or processor types used for running the experiments. |
| Software Dependencies | Yes | Our algorithm has been implemented in C++ as part of the new open-source Bayesian inference package, Stan (Stan Development Team, 2013). Matlab code implementing the algorithms, along with Stan code for models used in our simulation study, are also available at http://www.cs.princeton.edu/~mdhoffma/. |
| Experiment Setup | Yes | In all experiments the dual averaging parameters were set to γ = 0.05, t0 = 10, and κ = 0.75. We evaluated HMC with 10 logarithmically spaced target simulation lengths λ per target distribution... We tried 15 evenly spaced values of the dual averaging target δ between 0.25 and 0.95 for NUTS and 8 evenly spaced values of the dual averaging target δ between 0.25 and 0.95 for HMC. For each sampler-simulation length-δ-target distribution combination we ran 10 iterations with different random seeds. |
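The experiment setup above fixes the dual averaging parameters (γ = 0.05, t0 = 10, κ = 0.75) and sweeps the acceptance target δ. As context for those parameters, the sketch below implements the paper's dual averaging step-size update (Section 3.2): after each iteration m, the running average of (δ − α_m) pulls log ϵ toward values where the acceptance statistic α matches the target δ, while a shrinkage point μ = log(10 ϵ0) and the decay exponent κ stabilize the averaged iterate. The function name and the offline list of acceptance statistics are illustrative conveniences, not the paper's interface; in practice α_m is computed inside each HMC/NUTS iteration.

```python
import numpy as np

def dual_averaging_step_sizes(alphas, eps0=1.0, delta=0.65,
                              gamma=0.05, t0=10, kappa=0.75):
    """Sketch of the dual averaging step-size adaptation of
    Hoffman & Gelman, Section 3.2.

    alphas : observed Metropolis acceptance statistics, one per
             adaptation iteration (here supplied as a plain list
             for illustration).
    Returns (last step size, averaged step size); the averaged
    value is what the sampler freezes after warmup.
    """
    mu = np.log(10 * eps0)   # shrinkage point: log of 10 * initial eps
    log_eps_bar = 0.0        # running averaged log step size
    h_bar = 0.0              # running average of (delta - alpha)
    for m, alpha in enumerate(alphas, start=1):
        eta = 1.0 / (m + t0)
        h_bar = (1 - eta) * h_bar + eta * (delta - alpha)
        # primal iterate: shrink toward mu, scaled by sqrt(m)/gamma
        log_eps = mu - (np.sqrt(m) / gamma) * h_bar
        # polynomially decaying average of the log step size
        w = m ** (-kappa)
        log_eps_bar = w * log_eps + (1 - w) * log_eps_bar
    return np.exp(log_eps), np.exp(log_eps_bar)

# If acceptance always hits the target delta, h_bar stays 0 and the
# step size sits at the shrinkage point exp(mu) = 10 * eps0.
eps, eps_bar = dual_averaging_step_sizes([0.65] * 100, delta=0.65)

# Persistently low acceptance (alpha < delta) drives the step size down.
eps_low, _ = dual_averaging_step_sizes([0.0] * 100, delta=0.65)
```

Freezing ϵ to the averaged iterate after 1000 warmup iterations, as the paper does, avoids the noise of the final primal iterate while still benefiting from the full adaptation trajectory.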