Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).

Journal of Machine Learning Research (JMLR) - 2019

[Figure: Documentation Rate of Empirical Papers by Reproducibility Variable]

[Figure: Distribution of Empirical Papers by Number of Documented Variables]

Reproducibility Score: based on Gundersen et al. (2025). See Methods for details.
Documentation Score: the average score over the seven reproducibility variables for empirical research papers. See Methods for details.
% Empirical: percentage of papers that are empirical research rather than theoretical research.
% Industry: percentage of empirical research papers with at least one author from industry.

Venue	Year	Papers	Reproducibility Score	Documentation Score	% Empirical	% Industry	Website
JMLR	2019	184	0.4	3.55	83.15%	24.84%
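
As a rough illustration of how a Documentation Score could be derived from the per-paper table, the sketch below counts the ✅ marks in each row and averages those counts. It rests on two assumptions that the page does not confirm: that the score is the mean number of documented variables per empirical paper, and that the three sample rows (copied from the table) are all classified as empirical. The function names are hypothetical.

```python
from statistics import mean

# The seven reproducibility variables, in the column order used by the table.
VARIABLES = [
    "Pseudocode", "Open Source Code", "Open Datasets", "Dataset Splits",
    "Hardware Specification", "Software Dependencies", "Experiment Setup",
]

def documented_count(marks):
    """Count the variables marked as documented (✅) in one table row."""
    if len(marks) != len(VARIABLES):
        raise ValueError("expected one mark per reproducibility variable")
    return sum(1 for mark in marks if mark == "✅")

def documentation_score(rows):
    """Mean documented count over rows (assumed: all rows are empirical papers)."""
    return mean(documented_count(marks) for marks in rows.values())

# Three rows copied from the table; treating them as empirical is an assumption.
sample = {
    "Learning Optimized Risk Scores": ["✅"] * 7,
    "Collective Matrix Completion": ["✅", "✅", "❌", "✅", "✅", "✅", "✅"],
    "DPPy: DPP Sampling with Python": ["❌", "✅", "❌", "❌", "❌", "❌", "❌"],
}

for title, marks in sample.items():
    print(title, documented_count(marks))
print("Documentation score (sample only):", documentation_score(sample))
```

The per-row counts match the last column of the table for these three papers. The venue-level value of 3.55 cannot be reproduced from this sketch alone, because the table does not say which papers were classified as empirical.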

Paper	Pseudocode	Open Source Code	Open Datasets	Dataset Splits	Hardware Specification	Software Dependencies	Experiment Setup	Documented Variables
A Bootstrap Method for Error Estimation in Randomized Matrix Multiplication	✅	❌	✅	❌	❌	❌	✅	3
A Kernel Multiple Change-point Algorithm via Model Selection	N/A	N/A	N/A	N/A	N/A	N/A	N/A	0
A New Approach to Laplacian Solvers and Flow Problems	✅	❌	❌	❌	❌	❌	❌	1
A Particle-Based Variational Approach to Bayesian Non-negative Matrix Factorization	✅	✅	✅	✅	❌	❌	✅	5
A Representer Theorem for Deep Kernel Learning	❌	❌	❌	✅	❌	❌	✅	2
A Representer Theorem for Deep Neural Networks	❌	❌	❌	❌	❌	❌	❌	0
A Well-Tempered Landscape for Non-convex Robust Subspace Recovery	✅	✅	❌	❌	❌	❌	✅	3
ADMMBO: Bayesian Optimization with Unknown Constraints using ADMM	✅	✅	✅	❌	✅	❌	✅	5
Accelerated Alternating Projections for Robust Principal Component Analysis	✅	✅	✅	❌	✅	✅	✅	6
Active Learning for Cost-Sensitive Classification	✅	✅	✅	❌	❌	❌	✅	4
Adaptation Based on Generalized Discrepancy	❌	✅	✅	✅	❌	❌	✅	4
Adaptive Geometric Multiscale Approximations for Intrinsically Low-dimensional Data	✅	❌	✅	✅	❌	❌	✅	4
AffectiveTweets: a Weka Package for Analyzing Affect in Tweets	❌	✅	✅	✅	❌	❌	❌	3
All Models are Wrong, but Many are Useful: Learning a Variable's Importance by Studying an Entire Class of Prediction Models Simultaneously	❌	✅	✅	✅	❌	❌	✅	4
An Approach to One-Bit Compressed Sensing Based on Probably Approximately Correct Learning Theory	❌	❌	❌	❌	❌	❌	❌	0
An Efficient Two Step Algorithm for High Dimensional Change Point Regression Models Without Grid Search	✅	❌	✅	❌	✅	❌	✅	4
An asymptotic analysis of distributed nonparametric methods	❌	❌	❌	❌	❌	❌	✅	1
Analysis of Langevin Monte Carlo via Convex Optimization	✅	❌	✅	❌	❌	❌	✅	3
Analysis of spectral clustering algorithms for community detection: the general bipartite setting	✅	✅	❌	❌	❌	❌	✅	3
Approximate Profile Maximum Likelihood	✅	✅	❌	❌	❌	❌	✅	3
Approximation Algorithms for Stochastic Clustering	✅	❌	❌	❌	❌	❌	❌	1
Approximation Hardness for A Class of Sparse Optimization Problems	❌	❌	❌	❌	❌	❌	❌	0
Approximations of the Restless Bandit Problem	✅	❌	❌	❌	❌	❌	❌	1
Automated Scalable Bayesian Inference via Hilbert Coresets	✅	❌	✅	✅	❌	❌	✅	4
Bayesian Combination of Probabilistic Classifiers using Multivariate Normal Mixtures	❌	❌	✅	✅	❌	❌	✅	3
Bayesian Optimization for Policy Search via Online-Offline Experimentation	✅	❌	❌	✅	❌	❌	✅	3
Bayesian Space-Time Partitioning by Sampling and Pruning Spanning Trees	❌	❌	✅	❌	❌	✅	✅	3
Best Arm Identification for Contaminated Bandits	✅	❌	❌	❌	❌	❌	❌	1
Binarsity: a penalization for one-hot encoded features in linear supervised learning	✅	❌	✅	✅	❌	❌	✅	4
Boosted Kernel Ridge Regression: Optimal Learning Rates and Early Stopping	❌	❌	❌	✅	❌	❌	✅	2
Causal Learning via Manifold Regularization	❌	✅	✅	✅	❌	❌	✅	4
Change Surfaces for Expressive Multidimensional Changepoints and Counterfactual Prediction	✅	❌	✅	❌	❌	❌	✅	3
Characterizing the Sample Complexity of Pure Private Learners	❌	❌	❌	❌	❌	❌	❌	0
Collective Matrix Completion	✅	✅	❌	✅	✅	✅	✅	6
Complete Search for Feature Selection in Decision Trees	✅	✅	✅	✅	✅	❌	✅	6
Convergence Guarantees for a Class of Non-convex and Non-smooth Optimization Problems	✅	❌	✅	❌	❌	❌	✅	3
Convergence Rate of a Simulated Annealing Algorithm with Noisy Observations	✅	❌	❌	❌	❌	❌	✅	2
Convergence of Gaussian Belief Propagation Under General Pairwise Factorization: Connecting Gaussian MRF with Pairwise Linear Gaussian Model	❌	❌	✅	❌	❌	❌	✅	2
DBSCAN: Optimal Rates For Density-Based Cluster Estimation	✅	❌	❌	❌	❌	❌	❌	1
DPPy: DPP Sampling with Python	❌	✅	❌	❌	❌	❌	❌	1
DSCOVR: Randomized Primal-Dual Block Coordinate Algorithms for Asynchronous Distributed Optimization	✅	❌	✅	✅	✅	❌	✅	5
DataWig: Missing Value Imputation for Tables	✅	✅	✅	❌	❌	❌	❌	3
Decentralized Dictionary Learning Over Time-Varying Digraphs	✅	❌	✅	❌	✅	✅	✅	5
Decontamination of Mutual Contamination Models	✅	❌	✅	❌	❌	❌	❌	2
Decoupling Sparsity and Smoothness in the Dirichlet Variational Autoencoder Topic Model	❌	❌	✅	✅	❌	❌	✅	3
Deep Exploration via Randomized Value Functions	✅	❌	❌	❌	❌	❌	✅	2
Deep Optimal Stopping	❌	❌	❌	❌	✅	✅	✅	3
Deep Reinforcement Learning for Swarm Systems	❌	✅	❌	❌	✅	❌	✅	3
Delay and Cooperation in Nonstochastic Bandits	✅	❌	❌	❌	❌	❌	❌	1
Dependent relevance determination for smooth and structured sparse regression	✅	✅	✅	✅	❌	❌	✅	5
Determinantal Point Processes for Coresets	✅	✅	✅	❌	✅	❌	✅	5
Determining the Number of Latent Factors in Statistical Multi-Relational Learning	✅	❌	✅	✅	❌	✅	✅	5
Differentiable Game Mechanics	✅	✅	❌	❌	❌	❌	✅	3
Differentiable reservoir computing	❌	❌	❌	❌	❌	❌	❌	0
Distributed Inference for Linear Support Vector Machine	✅	❌	❌	❌	❌	❌	✅	2
Dynamic Pricing in High-dimensions	✅	❌	❌	❌	❌	❌	❌	1
Efficient augmentation and relaxation learning for individualized treatment rules using observational data	❌	✅	❌	✅	❌	❌	✅	3
Embarrassingly Parallel Inference for Gaussian Processes	✅	✅	✅	✅	❌	❌	✅	5
Exact Clustering of Weighted Graphs via Semidefinite Programming	✅	❌	❌	❌	❌	❌	✅	2
Fairness Constraints: A Flexible Approach for Fair Classification	✅	✅	✅	✅	❌	❌	✅	5
Fast Automatic Smoothing for Generalized Additive Models	❌	✅	✅	❌	✅	✅	✅	5
Forward-Backward Selection with Early Dropping	✅	❌	✅	✅	✅	❌	✅	5
Gaussian Processes with Linear Operator Inequality Constraints	✅	✅	❌	❌	✅	❌	✅	4
Generalized Maximum Entropy Estimation	✅	❌	❌	❌	✅	❌	✅	3
Generalized Score Matching for Non-Negative Data	❌	❌	✅	❌	❌	❌	✅	2
Generic Inference in Latent Gaussian Process Models	❌	✅	✅	✅	✅	❌	✅	5
GraSPy: Graph Statistics in Python	❌	✅	✅	❌	❌	✅	✅	4
Graph Reduction with Spectral and Cut Guarantees	✅	✅	✅	❌	✅	❌	✅	5
Graphical Lasso and Thresholding: Equivalence and Closed-form Solutions	✅	❌	✅	❌	✅	✅	✅	5
Group Invariance, Stability to Deformations, and Complexity of Deep Convolutional Representations	❌	✅	✅	❌	❌	❌	✅	3
Hamiltonian Monte Carlo with Energy Conserving Subsampling	✅	❌	✅	✅	❌	❌	✅	4
High-Dimensional Poisson Structural Equation Model Learning via $\ell_1$-Regularized Regression	✅	❌	✅	✅	❌	❌	✅	4
High-dimensional Varying Index Coefficient Models via Stein's Identity	❌	✅	✅	✅	❌	❌	✅	4
Iterated Learning in Dynamic Social Networks	❌	❌	❌	❌	❌	❌	❌	0
Ivanov-Regularised Least-Squares Estimators over Large RKHSs and Their Interpolation Spaces	❌	❌	❌	❌	❌	❌	❌	0
Joint PLDA for Simultaneous Modeling of Two Factors	✅	❌	✅	✅	❌	❌	✅	4
Kernel Approximation Methods for Speech Recognition	✅	❌	✅	✅	❌	❌	✅	4
Kernels for Sequentially Ordered Data	✅	❌	✅	✅	❌	❌	✅	4
Layer-Wise Learning Strategy for Nonparametric Tensor Product Smoothing Spline Regression and Graphical Models	✅	❌	✅	✅	❌	❌	✅	4
Lazifying Conditional Gradient Algorithms	✅	❌	✅	❌	✅	✅	✅	5
Learnability of Solutions to Conjunctive Queries	❌	❌	❌	❌	❌	❌	❌	0
Learning Attribute Patterns in High-Dimensional Structured Latent Attribute Models	✅	❌	✅	❌	✅	❌	✅	4
Learning Optimized Risk Scores	✅	✅	✅	✅	✅	✅	✅	7
Learning Overcomplete, Low Coherence Dictionaries with Linear Inference	❌	✅	✅	❌	✅	❌	✅	4
Learning Representations of Persistence Barcodes	❌	✅	✅	✅	❌	❌	✅	4
Learning Unfaithful $K$-separable Gaussian Graphical Models	✅	❌	❌	❌	❌	❌	❌	1
Learning by Unsupervised Nonlinear Diffusion	✅	❌	❌	❌	❌	❌	✅	2
Learning to Match via Inverse Optimal Transport	✅	❌	✅	✅	❌	❌	✅	4
Local Regularization of Noisy Point Clouds: Improved Global Geometric Estimates and Data Analysis	❌	❌	✅	✅	❌	❌	✅	3
Log-concave sampling: Metropolis-Hastings algorithms are fast	✅	❌	❌	❌	❌	❌	✅	2
Logical Explanations for Deep Relational Machines Using Relevance Information	✅	❌	✅	✅	✅	❌	✅	5
Low Permutation-rank Matrices: Structural Properties and Noisy Completion	❌	❌	❌	❌	❌	❌	❌	0
Matched Bipartite Block Model with Covariates	✅	✅	✅	❌	❌	❌	✅	4
Maximum Likelihood for Gaussian Process Classification and Generalized Linear Mixed Models under Case-Control Sampling	❌	❌	✅	✅	❌	❌	✅	3
Measuring the Effects of Data Parallelism on Neural Network Training	❌	✅	✅	✅	❌	❌	✅	4
Minimal Sample Subspace Learning: Theory and Algorithms	✅	❌	✅	❌	❌	❌	✅	3
Model Selection in Bayesian Neural Networks via Horseshoe Priors	✅	❌	✅	✅	✅	❌	✅	5
Model Selection via the VC Dimension	✅	✅	✅	✅	❌	❌	✅	5
Model-free Nonconvex Matrix Completion: Local Minima Analysis and Applications in Memory-efficient Kernel PCA	❌	❌	✅	❌	✅	❌	✅	3
Monotone Learning with Rectified Wire Networks	✅	✅	✅	✅	✅	❌	✅	6
More Efficient Estimation for Logistic Regression with Optimal Subsamples	✅	❌	✅	✅	✅	✅	✅	6
Morpho-MNIST: Quantitative Assessment and Diagnostics for Representation Learning	❌	✅	✅	✅	❌	❌	✅	4
Multi-class Heterogeneous Domain Adaptation	✅	❌	✅	✅	❌	❌	✅	4
Multi-scale Online Learning: Theory and Applications to Online Auctions and Pricing	✅	❌	❌	❌	❌	❌	❌	1
Multiclass Boosting: Margins, Codewords, Losses, and Algorithms	✅	✅	✅	✅	❌	❌	✅	5
Multiplicative local linear hazard estimation and best one-sided cross-validation	❌	✅	✅	✅	❌	✅	✅	5
Near Optimal Frequent Directions for Sketching Dense and Sparse Matrices	✅	❌	❌	❌	❌	❌	❌	1
Nearly-tight VC-dimension and Pseudodimension Bounds for Piecewise Linear Neural Networks	❌	❌	❌	❌	❌	❌	❌	0
NetSDM: Semantic Data Mining with Network Analysis	✅	❌	✅	❌	✅	❌	✅	4
Neural Architecture Search: A Survey	❌	❌	✅	❌	❌	❌	❌	1
Neural Empirical Bayes	❌	❌	✅	❌	❌	❌	✅	2
New Convergence Aspects of Stochastic Gradient Algorithms	✅	❌	✅	❌	❌	❌	✅	3
No-Regret Bayesian Optimization with Unknown Hyperparameters	✅	❌	✅	❌	❌	❌	✅	3
Non-Convex Matrix Completion and Related Problems via Strong Duality	❌	❌	✅	✅	❌	❌	✅	3
Non-Convex Projected Gradient Descent for Generalized Low-Rank Tensor Regression	✅	❌	❌	❌	❌	❌	✅	2
Nonparametric Bayesian Aggregation for Massive Data	❌	❌	❌	❌	❌	❌	✅	1
Nonparametric Estimation of Probability Density Functions of Random Persistence Diagrams	❌	❌	❌	❌	❌	❌	✅	1
Nonuniformity of P-values Can Occur Early in Diverging Dimensions	❌	❌	❌	❌	❌	❌	✅	1
ORCA: A Matlab/Octave Toolbox for Ordinal Regression	❌	✅	✅	✅	❌	❌	❌	3
On Asymptotic and Finite-Time Optimality of Bayesian Predictors	❌	❌	❌	❌	❌	❌	❌	0
On Consistent Vertex Nomination Schemes	❌	❌	❌	❌	❌	❌	❌	0
On the Convergence of Gaussian Belief Propagation with Nodes of Arbitrary Size	✅	❌	✅	❌	❌	❌	✅	3
On the optimality of the Hedge algorithm in the stochastic regime	❌	❌	❌	❌	❌	❌	✅	1
Optimal Convergence Rates for Convex Distributed Optimization in Networks	✅	❌	❌	❌	❌	❌	❌	1
Optimal Policies for Observing Time Series and Related Restless Bandit Problems	❌	❌	❌	❌	❌	❌	✅	1
Optimal Transport: Fast Probabilistic Approximation with Exact Solvers	✅	❌	✅	❌	✅	❌	✅	4
Optimization with Non-Differentiable Constraints with Applications to Fairness, Recall, Churn, and Other Goals	✅	✅	✅	✅	❌	❌	✅	5
Parsimonious Online Learning with Kernels via Sparse Projections in Function Space	✅	❌	✅	✅	❌	❌	✅	4
Picasso: A Sparse Learning Library for High Dimensional Data Analysis in R and Python	❌	✅	❌	❌	✅	✅	✅	4
Prediction Risk for the Horseshoe Regression	❌	❌	✅	✅	❌	❌	✅	3
Provably Accurate Double-Sparse Coding	✅	✅	❌	❌	❌	❌	✅	3
Proximal Distance Algorithms: Theory and Practice	✅	✅	✅	❌	❌	❌	✅	4
PyOD: A Python Toolbox for Scalable Outlier Detection	❌	✅	❌	✅	❌	❌	✅	3
Pyro: Deep Universal Probabilistic Programming	✅	✅	✅	❌	✅	❌	✅	5
Quantification Under Prior Probability Shift: the Ratio Estimator and its Extensions	✅	✅	✅	✅	❌	❌	✅	5
Quantifying Uncertainty in Online Regression Forests	✅	✅	✅	❌	❌	❌	✅	4
Random Feature-based Online Multi-kernel Learning in Environments with Unknown Dynamics	✅	❌	✅	❌	❌	❌	✅	3
Redundancy Techniques for Straggler Mitigation in Distributed Optimization and Learning	✅	❌	✅	✅	✅	❌	✅	5
Regularization via Mass Transportation	❌	✅	✅	✅	✅	✅	✅	6
Relative Error Bound Analysis for Nuclear Norm Regularized Matrix Completion	❌	❌	❌	❌	❌	❌	❌	0
Robust Estimation of Derivatives Using Locally Weighted Least Absolute Deviation Regression	❌	❌	✅	❌	❌	✅	✅	3
Robust Frequent Directions with Application in Online Learning	✅	❌	✅	✅	✅	❌	✅	5
Robustifying Independent Component Analysis by Adjusting for Group-Wise Stationary Noise	✅	✅	✅	✅	❌	❌	✅	5
SMART: An Open Source Data Labeling Platform for Supervised Learning	❌	✅	❌	❌	❌	❌	❌	1
Scalable Approximations for Generalized Linear Problems	✅	❌	✅	✅	❌	❌	✅	4
Scalable Interpretable Multi-Response Regression via SEED	✅	❌	❌	✅	✅	✅	✅	5
Scalable Kernel K-Means Clustering with Nystrom Approximation: Relative-Error Bounds	✅	✅	✅	❌	✅	❌	✅	5
Scaling Up Sparse Support Vector Machines by Simultaneous Feature and Sample Reduction	✅	❌	✅	❌	✅	❌	✅	4
Semi-Analytic Resampling in Lasso	❌	✅	✅	✅	✅	❌	✅	5
Shared Subspace Models for Multi-Group Covariance Estimation	✅	✅	✅	❌	✅	❌	✅	5
Sharp Restricted Isometry Bounds for the Inexistence of Spurious Local Minima in Nonconvex Matrix Recovery	✅	❌	❌	❌	❌	✅	✅	3
SimpleDet: A Simple and Versatile Distributed Framework for Object Detection and Instance Recognition	❌	✅	✅	❌	✅	✅	❌	4
Simultaneous Phase Retrieval and Blind Deconvolution via Convex Programming	❌	✅	❌	❌	❌	❌	✅	2
Simultaneous Private Learning of Multiple Concepts	✅	❌	❌	❌	❌	❌	❌	1
Smooth neighborhood recommender systems	✅	❌	✅	✅	✅	❌	✅	5
Solving the OSCAR and SLOPE Models Using a Semismooth Newton-Based Augmented Lagrangian Method	✅	❌	✅	❌	✅	❌	✅	4
Sparse Kernel Regression with Coefficient-based $\ell_q-$regularization	❌	❌	❌	❌	❌	❌	❌	0
Spectrum Estimation from a Few Entries	✅	✅	❌	❌	❌	❌	✅	3
Spurious Valleys in One-hidden-layer Neural Network Optimization Landscapes	❌	❌	❌	❌	❌	❌	❌	0
Stochastic Canonical Correlation Analysis	✅	❌	❌	❌	❌	❌	❌	1
Stochastic Modified Equations and Dynamics of Stochastic Gradient Algorithms I: Mathematical Foundations	❌	❌	❌	❌	❌	❌	✅	1
Stochastic Variance-Reduced Cubic Regularization Methods	✅	❌	✅	❌	❌	❌	✅	3
Streaming Principal Component Analysis From Incomplete Data	✅	❌	❌	❌	❌	❌	✅	2
TensorLy: Tensor Learning in Python	❌	✅	❌	❌	✅	❌	✅	3
The Common-directions Method for Regularized Empirical Risk Minimization	✅	✅	✅	❌	✅	❌	✅	5
The Reduced PC-Algorithm: Improved Causal Structure Learning in Large Random Networks	✅	❌	✅	❌	❌	❌	✅	3
The Relationship Between Agnostic Selective Classification, Active Learning and the Disagreement Coefficient	✅	❌	❌	❌	❌	❌	❌	1
The Sup-norm Perturbation of HOSVD and Low Rank Tensor Denoising	✅	❌	❌	❌	❌	❌	✅	2
Thompson Sampling Guided Stochastic Searching on the Line for Deceptive Environments with Applications to Root-Finding Problems	❌	❌	❌	❌	❌	❌	✅	1
Tight Lower Bounds on the VC-dimension of Geometric Set Systems	❌	❌	❌	❌	❌	❌	❌	0
Time-to-Event Prediction with Neural Networks and Cox Regression	❌	✅	✅	✅	❌	❌	✅	4
Train and Test Tightness of LP Relaxations in Structured Prediction	❌	❌	✅	✅	❌	❌	❌	2
Transport Analysis of Infinitely Deep Neural Network	❌	❌	❌	❌	❌	❌	❌	0
Tunability: Importance of Hyperparameters of Machine Learning Algorithms	❌	✅	✅	✅	❌	❌	✅	4
Two-Layer Feature Reduction for Sparse-Group Lasso via Decomposition of Convex Sets	✅	✅	✅	✅	❌	❌	✅	5
Unsupervised Basis Function Adaptation for Reinforcement Learning	✅	❌	❌	❌	✅	❌	✅	3
Unsupervised Evaluation and Weighted Aggregation of Ranked Classification Predictions	✅	❌	✅	✅	✅	❌	✅	5
Using Simulation to Improve Sample-Efficiency of Bayesian Optimization for Bipedal Robots	❌	❌	❌	❌	❌	❌	✅	1
Utilizing Second Order Information in Minibatch Stochastic Variance Reduced Proximal Iterations	✅	❌	✅	❌	❌	❌	✅	3
Variance-based Regularization with Convex Objectives	✅	✅	✅	✅	❌	❌	❌	4
Why do deep convolutional networks generalize so poorly to small image transformations?	❌	✅	✅	✅	❌	❌	❌	3
iNNvestigate Neural Networks!	✅	✅	✅	❌	❌	❌	❌	3
scikit-multilearn: A Python library for Multi-Label Classification	❌	✅	❌	❌	❌	✅	✅	3
spark-crowd: A Spark Package for Learning from Crowdsourced Big Data	✅	✅	❌	❌	✅	❌	❌	3