Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).

Journal of Machine Learning Research (JMLR) - 2021

Documentation Rate of Empirical Papers by Reproducibility Variable

Distribution of Empirical Papers by Number of Documented Variables


Venue: JMLR
Year: 2021
Papers: 290
Reproducibility Score: 0.46 (based on Gundersen et al. (2025); see Methods for details)
Documentation Score: 3.74 (the average score over the seven reproducibility variables for empirical research papers; see Methods for details)
% Empirical: 85.52% (percentage of papers that are empirical research rather than theoretical research)
% Industry: 21.77% (percentage of empirical research papers with at least one author from industry)
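The two percentages in the summary row above can be cross-checked with a little arithmetic. The sketch below assumes that each percentage was rounded to two decimals from a whole-number paper count; the recovered counts (248 empirical papers, 54 with an industry author) are inferred from the published figures and are not stated on the page itself.

```python
# Values copied from the JMLR 2021 summary row.
papers = 290
pct_empirical = 85.52   # % of all papers that are empirical
pct_industry = 21.77    # % of empirical papers with an industry author

# Assumption: both percentages were rounded from integer counts.
empirical = round(papers * pct_empirical / 100)
industry = round(empirical * pct_industry / 100)


def pct(part, whole):
    """Percentage of `whole` represented by `part`, rounded to two decimals."""
    return round(100 * part / whole, 2)


# Recompute the percentages from the inferred counts to confirm they match.
print(empirical, pct(empirical, papers))
print(industry, pct(industry, empirical))
```

If the inferred counts are right, the recomputed percentages reproduce the published values exactly, which suggests % Industry is measured against the empirical papers only, as its definition says.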
Pseudocode
Open Source Code
Open Datasets
Dataset Splits
Hardware Specification
Software Dependencies
Experiment Setup
A Bayes-Optimal View on Adversarial Examples 3
A Bayesian Contiguous Partitioning Method for Learning Clustered Latent Variables 4
A Contextual Bandit Bake-off 4
A Distributed Method for Fitting Laplacian Regularized Stratified Models 6
A Fast Globally Linearly Convergent Algorithm for the Computation of Wasserstein Barycenters 5
A General Framework for Adversarial Label Learning 4
A General Framework for Empirical Bayes Estimation in Discrete Linear Exponential Family 4
A Generalised Linear Model Framework for β-Variational Autoencoders based on Exponential Dispersion Families 4
A Greedy Algorithm for Quantizing Neural Networks 5
A Lyapunov Analysis of Accelerated Methods in Optimization 1
A Probabilistic Interpretation of Self-Paced Learning with Applications to Reinforcement Learning 5
A Review of Robot Learning for Manipulation: Challenges, Representations, and Algorithms 0
A Sharp Blockwise Tensor Perturbation Bound for Orthogonal Iteration 2
A Theory of the Risk for Optimization with Relaxation and its Application to Support Vector Machines 2
A Two-Level Decomposition Framework Exploiting First and Second Order Information for SVM Training Problems 5
A Unified Analysis of First-Order Methods for Smooth Games via Integral Quadratic Constraints 3
A Unified Convergence Analysis for Shuffling-Type Gradient Methods 4
A Unified Framework for Random Forest Prediction Error Estimation 6
A Unified Framework for Spectral Clustering in Sparse Graphs 5
A Unified Sample Selection Framework for Output Noise Filtering: An Error-Bound Perspective 4
A flexible model-free prediction-based framework for feature ranking 5
A general linear-time inference method for Gaussian Processes on one dimension 6
Accelerating Ill-Conditioned Low-Rank Matrix Estimation via Scaled Gradient Descent 4
Achieving Fairness in the Stochastic Multi-Armed Bandit Problem 2
Adaptive estimation of nonparametric functionals 0
Adversarial Monte Carlo Meta-Learning of Optimal Prediction Procedures 7
Aggregated Hold-Out 2
Alibi Explain: Algorithms for Explaining Machine Learning Models 2
An Empirical Study of Bayesian Optimization: Acquisition Versus Partition 5
An Importance Weighted Feature Selection Stability Measure 4
An Inertial Newton Algorithm for Deep Learning 6
An Online Sequential Test for Qualitative Treatment Effects 4
An algorithmic view of L2 regularization and some path-following algorithms 1
Analysis of high-dimensional Continuous Time Markov Chains using the Local Bouncy Particle Sampler 6
Analyzing the discrepancy principle for kernelized spectral filter learning algorithms 1
Approximate Newton Methods 3
Are We Forgetting about Compositional Optimisers in Bayesian Optimisation? 4
As You Like It: Localization via Paired Comparisons 1
Asymptotic Normality, Concentration, and Coverage of Generalized Posteriors 0
Asynchronous Online Testing of Multiple Hypotheses 4
Attention is Turing-Complete 0
Banach Space Representer Theorems for Neural Networks and Ridge Splines 0
Bandit Convex Optimization in Non-stationary Environments 3
Bandit Learning in Decentralized Matching Markets 2
Batch greedy maximization of non-submodular functions: Guarantees and applications to experimental design 2
Bayesian Distance Clustering 4
Bayesian Text Classification and Summarization via A Class-Specified Topic Model 4
Bayesian time-aligned factor analysis of paired multivariate time series 2
Benchmarking Unsupervised Object Representations for Video Sequences 5
Beyond English-Centric Multilingual Machine Translation 5
Bifurcation Spiking Neural Network 3
Black-Box Reductions for Zeroth-Order Gradient Algorithms to Achieve Lower Query Complexity 3
CAT: Compression-Aware Training for bandwidth reduction 3
COKE: Communication-Censored Decentralized Kernel Learning 4
ChainerRL: A Deep Reinforcement Learning Library 5
Classification vs regression in overparameterized regimes: Does the loss function matter? 1
Collusion Detection and Ground Truth Inference in Crowdsourcing for Labeling Tasks 4
Communication-Efficient Distributed Covariance Sketch, with Application to Distributed PCA 2
Conditional independences and causal relations implied by sets of equations 1
Consensus-Based Optimization on the Sphere: Convergence to Global Minimizers and Machine Learning 4
Consistency of Gaussian Process Regression in Metric Spaces 0
Consistent Semi-Supervised Graph Regularization for High Dimensional Data 4
Consistent estimation of small masses in feature sampling 0
Context-dependent Networks in Multivariate Time Series: Models, Methods, and Risk Bounds in High Dimensions 4
Continuous Time Analysis of Momentum Methods 2
Contrastive Estimation Reveals Topic Posterior Information to Linear Models 4
Convergence Guarantees for Gaussian Process Means With Misspecified Likelihoods and Smoothness 0
Convex Clustering: Model, Theoretical Guarantee and Efficient Algorithm 4
Convex Geometry and Duality of Over-parameterized Neural Networks 3
Convolutional Neural Networks Are Not Invariant to Translation, but They Can Learn to Be 2
Cooperative SGD: A Unified Framework for the Design and Analysis of Local-Update SGD Algorithms 4
Counterfactual Mean Embeddings 5
DIG: A Turnkey Library for Diving into Graph Deep Learning Research 2
DeEPCA: Decentralized Exact PCA with Linear Convergence Rate 3
Decentralized Stochastic Gradient Langevin Dynamics and Hamiltonian Monte Carlo 3
Determining the Number of Communities in Degree-corrected Stochastic Block Models 3
Differentially Private Regression and Classification with Sparse Gaussian Processes 4
Domain Generalization by Marginal Transfer Learning 5
Domain adaptation under structural causal models 4
Double Generative Adversarial Networks for Conditional Independence Testing 6
Doubly infinite residual neural networks: a diffusion process approach 3
Dynamic Tensor Recommender Systems 5
Edge Sampling Using Local Network Information 3
Empirical Bayes Matrix Factorization 6
Entangled Kernels - Beyond Separability 5
Estimating Uncertainty Intervals from Collaborating Networks 7
Estimating the Lasso's Effective Noise 3
Estimation and Inference for High Dimensional Generalized Linear Models: A Splitting and Smoothing Approach 6
Estimation and Optimization of Composite Outcomes 5
Exact Asymptotics for Linear Quadratic Adaptive Control 3
Expanding Boundaries of Gap Safe Screening 4
Explaining Explanations: Axiomatic Feature Interactions for Deep Networks 3
Explaining by Removing: A Unified Framework for Model Explanation 3
FATE: An Industrial Grade Platform for Collaborative Learning With Data Protection 3
FLAME: A Fast Large-scale Almost Matching Exactly Approach to Causal Inference 7
Factorization Machines with Regularization for Sparse Feature Interactions 6
Failures of Model-dependent Generalization Bounds for Least-norm Interpolation 0
Fast Learning for Renewal Optimization in Online Task Scheduling 1
Finite Time LTI System Identification 2
Finite-sample Analysis of Interpolating Linear Classifiers in the Overparameterized Regime 0
First-order Convergence Theory for Weakly-Convex-Weakly-Concave Min-max Problems 3
Flexible Signal Denoising via Flexible Empirical Bayes Shrinkage 5
From Fourier to Koopman: Spectral Methods for Long-term Time Series Prediction 5
From Low Probability to High Confidence in Stochastic Convex Optimization 1
Further results on latent discourse models and word embeddings 2
GIBBON: General-purpose Information-Based Bayesian Optimisation 6
Gaussian Approximation for Bias Reduction in Q-Learning 5
GemBag: Group Estimation of Multiple Bayesian Graphical Models 5
Generalization Performance of Multi-pass Stochastic Gradient Descent with Convex Loss Functions 0
Generalization Properties of hyper-RKHS and its Applications 6
Geometric structure of graph Laplacian embeddings 1
Global and Quadratic Convergence of Newton Hard-Thresholding Pursuit 6
Gradient Methods Never Overfit On Separable Data 3
Graph Matching with Partially-Correct Seeds 7
Guided Visual Exploration of Relations in Data Sets 6
Hamilton-Jacobi Deep Q-Learning for Deterministic Continuous-Time Systems with Lipschitz Continuous Controls 5
Hardness of Identity Testing for Restricted Boltzmann Machines and Potts models 0
High-Order Langevin Diffusion Yields an Accelerated MCMC Algorithm 1
Histogram Transform Ensembles for Large-scale Regression 4
Hoeffding's Inequality for General Markov Chains and Its Applications to Statistical Learning 0
Homogeneity Structure Learning in Large-scale Panel Data with Heavy-tailed Errors 5
How Well Generative Adversarial Networks Learn Distributions 0
How to Gain on Power: Novel Conditional Independence Tests Based on Short Expansion of Conditional Mutual Information 5
Hybrid Predictive Models: When an Interpretable Model Collaborates with a Black-box Model 5
Hyperparameter Optimization via Sequential Uniform Designs 5
Implicit Langevin Algorithms for Sampling From Log-concave Densities 3
Implicit Self-Regularization in Deep Neural Networks: Evidence from Random Matrix Theory and Implications for Learning 5
Improved Shrinkage Prediction under a Spiked Covariance Structure 4
Improving Reproducibility in Machine Learning Research (A Report from the NeurIPS 2019 Reproducibility Program) 0
Incorporating Unlabeled Data into Distributionally Robust Learning 4
Individual Fairness in Hindsight 1
Inference In High-dimensional Single-Index Models Under Symmetric Designs 4
Inference for Multiple Heterogeneous Networks with a Common Invariant Subspace 4
Inference for the Case Probability in High-dimensional Logistic Regression 4
Information criteria for non-normalized models 3
Integrated Principal Components Analysis 5
Integrative Generalized Convex Clustering Optimization and Feature Selection for Mixed Multi-View Data 5
Integrative High Dimensional Multiple Testing with Heterogeneity under Data Sharing Constraints 3
Interpretable Deep Generative Recommendation Models 4
Is SGD a Bayesian sampler? Well, almost 5
Kernel Operations on the GPU, with Autodiff, without Memory Overflows 3
Kernel Smoothing, Mean Shift, and Their Learning Theory with Directional Data 4
Knowing what You Know: valid and validated confidence sets in multiclass and multilabel prediction 4
L-SVRG and L-Katyusha with Arbitrary Sampling 4
LDLE: Low Distortion Local Eigenmaps 5
Langevin Dynamics for Adaptive Inverse Reinforcement Learning of Stochastic Gradient Algorithms 4
Langevin Monte Carlo: random coordinate descent and variance reduction 2
LassoNet: A Neural Network with Feature Sparsity 6
Learning Bayesian Networks from Ordinal Data 5
Learning Laplacian Matrix from Graph Signals with Sparse Spectral Representation 6
Learning Sparse Classifiers: Continuous and Mixed Integer Optimization Perspectives 7
Learning Strategies in Decentralized Matching Markets under Uncertain Preferences 4
Learning Whenever Learning is Possible: Universal Learning under General Stochastic Processes 0
Learning a High-dimensional Linear Structural Equation Model via l1-Regularized Regression 3
Learning and Planning for Time-Varying MDPs Using Maximum Likelihood Estimation 1
Learning interaction kernels in heterogeneous systems of agents from multiple trajectories 5
Learning partial correlation graphs and graphical models by covariance queries 1
Learning with semi-definite programming: statistical bounds based on fixed point analysis and excess risk curvature 2
Limit theorems for out-of-sample extensions of the adjacency and Laplacian spectral embeddings 3
Linear Bandits on Uniformly Convex Sets 1
LocalGAN: Modeling Local Distributions for Adversarial Response Generation 6
Locally Differentially-Private Randomized Response for Discrete Distribution Learning 1
Locally Private k-Means Clustering 1
Matrix Product States for Inference in Discrete Probabilistic Models 3
MetaGrad: Adaptation using Multiple Learning Rates in Online Learning 4
Method of Contraction-Expansion (MOCE) for Simultaneous Inference in Linear Models 3
Mixing Time of Metropolis-Hastings for Bayesian Community Detection 3
Mixture Martingales Revisited with Applications to Sequential Tests and Confidence Intervals 2
Mode-wise Tensor Decompositions: Multi-dimensional Generalizations of CUR Decompositions 5
Model Linkage Selection for Cooperative Learning 4
Multi-class Gaussian Process Classification with Noisy Inputs 5
Multi-view Learning as a Nonparametric Nonlinear Inter-Battery Factor Analysis 4
Multilevel Monte Carlo Variational Inference 4
MushroomRL: Simplifying Reinforcement Learning Research 2
NEU: A Meta-Algorithm for Universal UAP-Invariant Feature Representation 3
NUQSGD: Provably Communication-efficient Data-parallel SGD via Nonuniform Quantization 5
Neighborhood Structure Assisted Non-negative Matrix Factorization and Its Application in Unsupervised Point-wise Anomaly Detection 3
Non-attracting Regions of Local Minima in Deep and Wide Neural Networks 1
Non-linear, Sparse Dimensionality Reduction via Path Lasso Penalized Autoencoders 5
Non-parametric Quantile Regression via the K-NN Fused Lasso 5
Nonparametric Continuous Sensor Registration 6
Nonparametric Modeling of Higher-Order Interactions via Hypergraphons 4
Normalizing Flows for Probabilistic Modeling and Inference 2
Oblivious Data for Fairness with Kernels 5
On ADMM in Deep Learning: Convergence and Saturation-Avoidance 7
On Multi-Armed Bandit Designs for Dose-Finding Trials 3
On Solving Probabilistic Linear Diophantine Equations 5
On Universal Approximation and Error Bounds for Fourier Neural Operators 1
On efficient multilevel Clustering via Wasserstein distances 5
On lp-hyperparameter Learning via Bilevel Nonsmooth Optimization 6
On the Estimation of Network Complexity: Dimension of Graphons 2
On the Hardness of Robust Classification 0
On the Optimality of Kernel-Embedding Based Goodness-of-Fit Tests 3
On the Riemannian Search for Eigenvector Computation 3
On the Stability Properties and the Optimization Landscape of Training Problems with Squared Loss for Neural Networks and General Nonlinear Conic Approximation Schemes 0
On the Theory of Policy Gradient Methods: Optimality, Approximation, and Distribution Shift 1
One-Shot Federated Learning: Theoretical Limits and Algorithms to Achieve Them 3
Online stochastic gradient descent on non-convex losses from high-dimensional inference 2
OpenML-Python: an extensible Python API for OpenML 3
Optimal Bounds between f-Divergences and Integral Probability Metrics 0
Optimal Feedback Law Recovery by Gradient-Augmented Sparse Polynomial Regression 5
Optimal Minimax Variable Selection for Large-Scale Matrix Linear Regression Model 3
Optimal Rates of Distributed Regression with Imperfect Kernels 1
Optimal Structured Principal Subspace Estimation: Metric Entropy and Minimax Rates 1
Optimization with Momentum: Dynamical, Control-Theoretic, and Symplectic Perspectives 0
Optimized Score Transformation for Consistent Fair Classification 5
POT: Python Optimal Transport 2
Partial Policy Iteration for L1-Robust Markov Decision Processes 6
Particle-Gibbs Sampling for Bayesian Feature Allocation Models 6
Path Length Bounds for Gradient Descent and Flow 1
Pathwise Conditioning of Gaussian Processes 3
PeerReview4All: Fair and Accurate Reviewer Assignment in Peer Review 5
Phase Diagram for Two-layer ReLU Neural Networks at Infinite-width Limit 3
Policy Teaching in Reinforcement Learning via Environment Poisoning Attacks 2
Prediction Under Latent Factor Regression: Adaptive PCR, Interpolating Predictors and Beyond 3
Prediction against a limited adversary 0
Predictive Learning on Hidden Tree-Structured Ising Models 3
Preference-based Online Learning with Dueling Bandits: A Survey 0
Probabilistic Iterative Methods for Linear Systems 3
Projection-free Decentralized Online Learning for Submodular Maximization over Time-Varying Networks 4
Pseudo-Marginal Hamiltonian Monte Carlo 4
PyKEEN 1.0: A Python Library for Training and Evaluating Knowledge Graph Embeddings 3
Pykg2vec: A Python Library for Knowledge Graph Embedding 3
Quasi-Monte Carlo Quasi-Newton in Variational Bayes 4
ROOTS: Object-Centric Representation and Rendering of 3D Scenes 4
RaSE: Random Subspace Ensemble Classification 5
Ranking and synchronization from pairwise measurements via SVD 3
Refined approachability algorithms and application to regret minimization with global costs 0
Regularized spectral methods for clustering signed networks 3
Regulating Greed Over Time in Multi-Armed Bandits 4
Replica Exchange for Non-Convex Optimization 2
Representer Theorems in Banach Spaces: Minimum Norm Interpolation, Regularized Learning and Semi-Discrete Inverse Problems 0
Reproducing kernel Hilbert C*-module and kernel mean embeddings 2
Residual Energy-Based Models for Text 5
Revisiting Model-Agnostic Private Learning: Faster Rates and Active Learning 4
Risk Bounds for Unsupervised Cross-Domain Mapping with IPMs 3
Risk-Averse Learning by Temporal Difference Methods with Markov Risk Measures 1
River: machine learning for streaming data in Python 3
Safe Policy Iteration: A Monotonically Improving Approximate Policy Iteration Approach 3
Shape-Enforcing Operators for Generic Point and Interval Estimators of Functions 4
Simple and Fast Algorithms for Interactive Machine Learning with Random Counter-examples 1
Simultaneous Change Point Inference and Structure Recovery for High Dimensional Gaussian Graphical Models 5
Single and Multiple Change-Point Detection with Differential Privacy 2
Soft Tensor Regression 3
Some Theoretical Insights into Wasserstein GANs 2
Sparse Convex Optimization via Adaptively Regularized Hard Thresholding 3
Sparse Popularity Adjusted Stochastic Block Model 2
Sparse Tensor Additive Regression 4
Sparse and Smooth Signal Estimation: Convexification of L0-Formulations 6
Sparsity in Deep Learning: Pruning and growth for efficient inference and training in neural networks 3
Stable-Baselines3: Reliable Reinforcement Learning Implementations 3
Statistical Guarantees for Local Spectral Clustering on Random Neighborhood Graphs 3
Statistical Query Lower Bounds for Tensor PCA 0
Statistical guarantees for local graph clustering 3
Statistically and Computationally Efficient Change Point Localization in Regression Settings 4
Stochastic Online Optimization using Kalman Recursion 4
Stochastic Proximal AUC Maximization 4
Stochastic Proximal Methods for Non-Smooth Non-Convex Constrained Sparse Optimization 5
Strong Consistency, Graph Laplacians, and the Stochastic Block Model 2
Structure Learning of Undirected Graphical Models for Count Data 5
Subspace Clustering through Sub-Clusters 4
TensorHive: Management of Exclusive GPU Access for Distributed Machine Learning Workloads 3
Testing Conditional Independence via Quantile Regression Based Partial Copulas 3
The Decoupled Extended Kalman Filter for Dynamic Exponential-Family Factorization Models 4
The Ridgelet Prior: A Covariance Function Approach to Prior Specification for Bayesian Neural Networks 3
The ensmallen library for flexible numerical optimization 6
Thompson Sampling Algorithms for Cascading Bandits 3
Tighter Risk Certificates for Neural Networks 5
Towards a Unified Analysis of Random Fourier Features 3
Tractable Approximate Gaussian Inference for Bayesian Neural Networks 3
Transferability of Spectral Graph Convolutional Neural Networks 2
Tsallis-INF: An Optimal Algorithm for Stochastic and Adversarial Bandits 2
Understanding How Dimension Reduction Tools Work: An Empirical Approach to Deciphering t-SNE, UMAP, TriMap, and PaCMAP for Data Visualization 6
Understanding Recurrent Neural Networks Using Nonequilibrium Response Theory 0
Unfolding-Model-Based Visualization: Theory, Method and Applications 5
Universal consistency and rates of convergence of multiclass prototype algorithms in metric spaces 0
Unlinked Monotone Regression 3
V-statistics and Variance Estimation 5
VariBAD: Variational Bayes-Adaptive Deep RL via Meta-Learning 4
Variance Reduced Median-of-Means Estimator for Byzantine-Robust Distributed Inference 2
Wasserstein barycenters can be computed in polynomial time in fixed dimension 3
Wasserstein distance estimates for the distributions of numerical approximations to ergodic stochastic differential equations 0
What Causes the Test Error? Going Beyond Bias-Variance via ANOVA 4
When Does Gradient Descent with Logistic Loss Find Interpolating Two-Layer Networks? 1
When random initializations help: a study of variational inference for community detection 1
dalex: Responsible Machine Learning with Interactive Explainability and Fairness in Python 1
giotto-tda: A Topological Data Analysis Toolkit for Machine Learning and Data Exploration 2
mlr3pipelines - Flexible Machine Learning Pipelines in R 2
mvlearn: Multiview Machine Learning in Python 2
sklvq: Scikit Learning Vector Quantization 1
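The listing above has 290 rows, one per paper, each ending in an integer between 0 and 7. A minimal sketch of how those rows could be tallied into a distribution follows. It assumes the trailing integer is the number of documented reproducibility variables for that paper, which the listing does not state explicitly; `parse_row` and `tally` are hypothetical helper names, not part of any published tooling.

```python
from collections import Counter


def parse_row(line):
    """Split 'Title ... N' into (title, N), where N is the trailing integer."""
    title, _, count = line.rstrip().rpartition(" ")
    return title, int(count)


def tally(lines):
    """Count how many papers fall at each trailing-integer value."""
    return Counter(parse_row(line)[1] for line in lines if line.strip())


# A few rows copied verbatim from the listing; the full list would be
# passed in the same way.
sample = [
    "A Contextual Bandit Bake-off 4",
    "Attention is Turing-Complete 0",
    "A Bayes-Optimal View on Adversarial Examples 3",
    "Graph Matching with Partially-Correct Seeds 7",
    "PyKEEN 1.0: A Python Library for Training and Evaluating Knowledge Graph Embeddings 3",
]

distribution = tally(sample)
for value in sorted(distribution):
    print(value, distribution[value])
```

Splitting on the last space keeps titles that contain digits or colons intact. Because the listing covers all 290 papers, a tally over the full list would include theoretical papers as well, so it would not necessarily match a distribution restricted to empirical papers.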