Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Scaling up Data Augmentation MCMC via Calibration
Authors: Leo L. Duan, James E. Johndrow, David B. Dunson
JMLR 2018
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Dramatic gains in computational efficiency are shown in applications. Keywords: Bayesian Probit, Biased subsampling, Big n, Data augmentation, Log-linear model, Logistic regression, Maximal correlation, Polya-Gamma |
| Researcher Affiliation | Academia | Leo L. Duan (EMAIL), Department of Statistics, University of Florida, Gainesville, FL; James E. Johndrow (EMAIL), Department of Statistics, Stanford University, Stanford, CA; David B. Dunson (EMAIL), Department of Statistical Science, Duke University, Durham, NC |
| Pseudocode | No | The paper describes specific algorithms and update rules using mathematical notation, but it does not include any clearly labeled pseudocode blocks or algorithm boxes. |
| Open Source Code | No | The paper mentions using 'Tensor Flow' for automatic differentiation and optimization, the 'CODA package in R' for calculating effective sample size, and 'STAN 2.17' for Hamiltonian Monte Carlo, but it does not provide source code for the methodology developed in the paper itself. |
| Open Datasets | Yes | The dataset is a large sparse network from the Human Connectome Project (Marcus et al., 2011). |
| Dataset Splits | Yes | We use another co-browsing count table for the same high traffic and client sites, collected during a different time period. ... The cross-validation root-mean-squared error, (Σ_i (ŷ_i − y_i)² / n)^{1/2}, between the predicted and actual counts y_i is computed. |
| Hardware Specification | No | The paper provides no specific details about the hardware used for running the experiments, only general statements about 'computing time'. |
| Software Dependencies | Yes | We run DA for 30,000 steps and CDA for 2,000 steps, so that they have approximately the same effective sample size (calculated with the CODA package in R). ... We ran the ordinary DA algorithm with λ = 1,000, CDA with λ = 10⁹ and Hamiltonian Monte Carlo with No-U-Turn sampler under the default tuning setting (as implemented in STAN 2.17). |
| Experiment Setup | Yes | For illustration, we consider a simulation study for probit regression with an intercept and two predictors x_{i,1}, x_{i,2} ∼ No(1, 1), with θ = (−5, 1, 1)ᵀ, generating Σ_i y_i ≈ 20 among n = 10,000. For illustration, we use a two-parameter intercept–slope model with x_{i,1} ∼ iid No(0, 1) and θ = (−8, 1)ᵀ. With n = 10⁵, we obtain rare outcome data with Σ_i y_i ≈ 50. We run DA for 30,000 steps and CDA for 2,000 steps, so that they have approximately the same effective sample size (calculated with the CODA package in R). Both algorithms are initialized at the MAP estimates. We ran the ordinary DA algorithm with λ = 1,000, CDA with λ = 10⁹ and Hamiltonian Monte Carlo with No-U-Turn sampler under the default tuning setting (as implemented in STAN 2.17). All algorithms are initialized at the MAP. We ran DA for 200,000 steps, CDA for 2,000 steps and HMC for 20,000 steps so that they have approximately the same effective sample size. For CDA, we used the first 1,000 steps for adapting r and b. |
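The setup rows above match DA, CDA, and HMC run lengths by effective sample size (computed in the paper with the CODA package in R). As a rough illustration of what that quantity measures, here is a minimal Python sketch of an ESS estimator based on truncating the autocorrelation sum at the first non-positive lag; this is a common heuristic, not the exact CODA implementation, and the AR(1) chain below is an illustrative stand-in for a slowly mixing sampler.

```python
import numpy as np

def effective_sample_size(x, max_lag=None):
    """Estimate the effective sample size of a 1-D MCMC chain.

    Truncates the autocorrelation sum at the first non-positive lag
    (an initial positive sequence heuristic).
    """
    x = np.asarray(x, dtype=float)
    n = len(x)
    if max_lag is None:
        max_lag = n // 2
    x = x - x.mean()
    var = np.dot(x, x) / n
    if var == 0:
        return float(n)
    denom = 1.0
    for lag in range(1, max_lag):
        rho = np.dot(x[:-lag], x[lag:]) / (n * var)  # autocorrelation at this lag
        if rho <= 0:
            break
        denom += 2.0 * rho
    return n / denom

# A strongly autocorrelated AR(1) chain mixes slowly, so its ESS is far
# below the nominal number of draws; an iid sequence keeps ESS near n.
rng = np.random.default_rng(0)
n = 5000
iid = rng.normal(size=n)
ar = np.empty(n)
ar[0] = 0.0
for t in range(1, n):
    ar[t] = 0.95 * ar[t - 1] + rng.normal()

print(effective_sample_size(iid))  # near the nominal n
print(effective_sample_size(ar))   # much smaller than n
```

This gap between nominal steps and ESS is why the experiments run DA for 30,000 (or 200,000) steps but CDA for only 2,000: the better-calibrated chain needs far fewer draws to reach a comparable effective sample size.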