Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al.: K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026.

Journal of Machine Learning Research (JMLR) - 2025

Figure: Documentation Rate of Empirical Papers by Reproducibility Variable

Figure: Distribution of Empirical Papers by Number of Documented Variables

Venue: JMLR
Year: 2025
Papers: 269
Reproducibility Score: 0.46 (based on Gundersen et al. (2025); see Methods for details)
Documentation Score: 3.79 (the average score over the seven reproducibility variables for empirical research papers; see Methods for details)
% Empirical: 86.99% (percentage of papers that are empirical rather than theoretical research)
% Industry: 17.52% (percentage of empirical research papers with at least one author from industry)
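
As a rough consistency check on the summary figures above, the two percentages can be converted back into paper counts. This is a minimal sketch, assuming that % Empirical is taken over all 269 papers and % Industry over the empirical subset, as the column descriptions state. The function name `implied_count` is illustrative and not part of the site.

```python
def implied_count(total: int, percent: float) -> int:
    """Return the whole number of papers closest to percent% of total."""
    return round(total * percent / 100)


papers = 269
# Empirical papers among all papers (86.99% of 269).
empirical = implied_count(papers, 86.99)
# Papers with at least one industry author among empirical papers (17.52%).
industry = implied_count(empirical, 17.52)

print(empirical, industry)  # 234 41
# Recompute the percentages from the implied counts as a round-trip check.
print(f"{100 * empirical / papers:.2f}% {100 * industry / empirical:.2f}%")  # 86.99% 17.52%
```

Under these assumptions the figures are mutually consistent: about 234 empirical papers, of which about 41 have an industry author.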
Reproducibility variables: Pseudocode, Open Source Code, Open Datasets, Dataset Splits, Hardware Specification, Software Dependencies, Experiment Setup
"What is Different Between These Datasets?" A Framework for Explaining Data Distribution Shifts 4
(De)-regularized Maximum Mean Discrepancy Gradient Flow 4
A Comparative Evaluation of Quantification Methods 4
A Decentralized Proximal Gradient Tracking Algorithm for Composite Optimization on Riemannian Manifolds 4
A Hybrid Weighted Nearest Neighbour Classifier for Semi-Supervised Learning 3
A New Random Reshuffling Method for Nonsmooth Nonconvex Finite-sum Optimization 4
A Random Matrix Approach to Low-Multilinear-Rank Tensor Approximation 2
A Unified Analysis of Nonstochastic Delayed Feedback for Combinatorial Semi-Bandits, Linear Bandits, and MDPs 3
A Unified Framework to Enforce, Discover, and Promote Symmetry in Machine Learning 4
Accelerating optimization over the space of probability measures 2
Actor-Critic learning for mean-field control in continuous time 2
Adaptive Client Sampling in Federated Learning via Online Learning with Bandit Feedback 5
Adaptive Distributed Kernel Ridge Regression: A Feasible Distributed Learning Scheme for Data Silos 6
Adjusted Expected Improvement for Cumulative Regret Minimization in Noisy Bayesian Optimization 4
Affine Rank Minimization via Asymptotic Log-Det Iteratively Reweighted Least Squares 3
Algorithms for ridge estimation with convergence guarantees 3
An Adaptive Parameter-free and Projection-free Restarting Level Set Method for Constrained Convex Optimization Under the Error Bound Condition 4
An Asymptotically Optimal Coordinate Descent Algorithm for Learning Bayesian Networks from Gaussian Models 6
An Augmentation Overlap Theory of Contrastive Learning 4
An Axiomatic Definition of Hierarchical Clustering 0
Are Ensembles Getting Better All the Time? 4
Assumption-lean and data-adaptive post-prediction inference 5
Asymptotic Inference for Multi-Stage Stationary Treatment Policy with Variable Selection 5
Autoencoders in Function Space 5
Bagged Regularized k-Distances for Anomaly Detection 3
Bagged k-Distance for Mode-Based Clustering Using the Probability of Localized Level Sets 4
Bayes Meets Bernstein at the Meta Level: an Analysis of Fast Rates in Meta-Learning with PAC-Bayes 0
Bayesian Data Sketching for Varying Coefficient Regression Models 6
Bayesian Multi-Group Gaussian Process Models for Heterogeneous Group-Structured Data 5
Bayesian Scalar-on-Image Regression with a Spatially Varying Single-layer Neural Network Prior 4
Bayesian Sparse Gaussian Mixture Model for Clustering in High Dimensions 5
Best Linear Unbiased Estimate from Privatized Contingency Tables 5
Biological Sequence Kernels with Guaranteed Flexibility 4
BitNet: 1-bit Pre-training for Large Language Models 4
BoFire: Bayesian Optimization Framework Intended for Real Experiments 1
Boosting Causal Additive Models 5
Calibrated Inference: Statistical Inference that Accounts for Both Sampling Uncertainty and Distributional Uncertainty 5
Categorical Semantics of Compositional Reinforcement Learning 0
Causal Abstraction: A Theoretical Foundation for Mechanistic Interpretability 2
Causal Effect of Functional Treatment 5
Characterizing Dynamical Stability of Stochastic Gradient Descent in Overparameterized Learning 0
Classification in the high dimensional Anisotropic mixture framework: A new take on Robust Interpolation 1
ClimSim-Online: A Large Multi-Scale Dataset and Framework for Hybrid Physics-ML Climate Emulation 6
Collaborative likelihood-ratio estimation over graphs 5
Composite Goodness-of-fit Tests with Kernels 5
Conditional Wasserstein Distances with Applications in Bayesian OT Flow Matching 4
Contextual Bandits with Stage-wise Constraints 2
Continuously evolving rewards in an open-ended environment 2
Convergence Rates for Non-Log-Concave Sampling and Log-Partition Estimation 3
Convergence and Sample Complexity of Natural Policy Gradient Primal-Dual Methods for Constrained MDPs 3
Copula-based Sensitivity Analysis for Multi-Treatment Causal Inference with Unobserved Confounding 3
Curvature-based Clustering on Graphs 3
DAGs as Minimal I-maps for the Induced Models of Causal Bayesian Networks under Conditioning 5
DRM Revisited: A Complete Error Analysis 1
Data-Driven Performance Guarantees for Classical and Learned Optimizers 5
Decentralized Asynchronous Optimization with DADAO allows Decoupling and Acceleration 3
Decentralized Bilevel Optimization: A Perspective from Transient Iteration Complexity 4
Decentralized Sparse Linear Regression via Gradient-Tracking 3
Deep Generative Models: Complexity, Dimensionality, and Approximation 3
Deep Neural Networks are Adaptive to Function Regularity and Data Distribution in Approximation and Estimation 2
Deep Out-of-Distribution Uncertainty Quantification via Weight Entropy Maximization 5
Deep Variational Multivariate Information Bottleneck - A Framework for Variational Losses 3
Degree of Interference: A General Framework For Causal Inference Under Interference 4
Deletion Robust Non-Monotone Submodular Maximization over Matroids 1
Density Estimation Using the Perceptron 0
Derivative-Informed Neural Operator Acceleration of Geometric MCMC for Infinite-Dimensional Bayesian Inverse Problems 4
Determine the Number of States in Hidden Markov Models via Marginal Likelihood 4
Diffeomorphism-based feature learning using Poincaré inequalities on augmented input space 4
Differentially Private Bootstrap: New Privacy Analysis and Inference Strategies 4
Differentially Private Multivariate Medians 3
Directed Cyclic Graphs for Simultaneous Discovery of Time-Lagged and Instantaneous Causality from Longitudinal Data Using Instrumental Variables 5
DisC2o-HD: Distributed causal inference with covariates shift for analyzing real-world high-dimensional data 2
Distributed Stochastic Bilevel Optimization: Improved Complexity and Heterogeneity Analysis 3
Distribution Estimation under the Infinity Norm 2
Distribution Free Tests for Model Selection Based on Maximum Mean Discrepancy with Estimated Parameters 4
Dynamic Bayesian Learning for Spatiotemporal Mechanistic Models 6
Dynamic angular synchronization under smoothness constraints 3
EF21 with Bells & Whistles: Six Algorithmic Extensions of Modern Error Feedback 6
EMaP: Explainable AI with Manifold-based Perturbations 5
Early Alignment in Two-Layer Networks Training is a Two-Edged Sword 2
Efficient Knowledge Deletion from Trained Models Through Layer-wise Partial Machine Unlearning 6
Efficient Methods for Non-stationary Online Learning 3
Efficient Numerical Integration in Reproducing Kernel Hilbert Spaces via Leverage Scores Sampling 5
Efficient Online Prediction for High-Dimensional Time Series via Joint Tensor Tucker Decomposition 6
Efficient and Robust Semi-supervised Estimation of Average Treatment Effect with Partially Annotated Treatment and Response 3
Efficient and Robust Transfer Learning of Optimal Individualized Treatment Regimes with Right-Censored Survival Data 6
Efficiently Escaping Saddle Points in Bilevel Optimization 2
Enhanced Feature Learning via Regularisation: Integrating Neural Networks and Kernel Methods 5
Enhancing Graph Representation Learning with Localized Topological Features 6
Error bounds for particle gradient descent, and extensions of the log-Sobolev and Talagrand inequalities 1
Error estimation and adaptive tuning for unregularized robust M-estimator 2
Estimating Network-Mediated Causal Effects via Principal Components Network Regression 3
Estimation of Local Geometric Structure on Manifolds from Noisy Data 3
Evaluation of Active Feature Acquisition Methods for Time-varying Feature Settings 3
Exponential Family Graphical Models: Correlated Replicates and Unmeasured Confounders, with Applications to fMRI Data 4
Extending Temperature Scaling with Homogenizing Maps 3
Extremal graphical modeling with latent variables via convex optimization 4
Fair Text Classification via Transferable Representations 5
Fast Algorithm for Constrained Linear Inverse Problems 5
Fast Computation of Superquantile-Constrained Optimization Through Implicit Scenario Reduction 6
Feature Learning in Finite-Width Bayesian Deep Linear Networks with Multiple Outputs and Convolutional Layers 0
Fine-Grained Change Point Detection for Topic Modeling with Pitman-Yor Process 3
Fine-grained Analysis and Faster Algorithms for Iteratively Solving Linear Systems 1
Finite Expression Method for Solving High-Dimensional Partial Differential Equations 3
Four Axiomatic Characterizations of the Integrated Gradients Attribution Method 0
Frequentist Guarantees of Distributed (Non)-Bayesian Inference 0
From Sparse to Dense Functional Data in High Dimensions: Revisiting Phase Transitions from a Non-Asymptotic Perspective 1
Frontiers to the learning of nonparametric hidden Markov models 1
Fundamental Limits of Membership Inference Attacks on Machine Learning Models 3
General Loss Functions Lead to (Approximate) Interpolation in High Dimensions 2
Generalized multi-view model: Adaptive density estimation under low-rank constraints 4
Generation of Geodesics with Actor-Critic Reinforcement Learning to Predict Midpoints 4
Generative Adversarial Networks: Dynamics 0
Geometry and Stability of Supervised Learning Problems 0
Gold-medalist Performance in Solving Olympiad Geometry with AlphaGeometry2 6
Graph-accelerated Markov Chain Monte Carlo using Approximate Samples 4
GraphNeuralNetworks.jl: Deep Learning on Graphs with Julia 2
Hierarchical Decision Making Based on Structural Information Principles 5
Hierarchical and Stochastic Crystallization Learning: Geometrically Leveraged Nonparametric Regression with Delaunay Triangulation 4
High-Dimensional L2-Boosting: Rate of Convergence 5
High-Rank Irreducible Cartesian Tensor Decomposition and Bases of Equivariant Spaces 5
Hopfield-Fenchel-Young Networks: A Unified Framework for Associative Memory Retrieval 5
How good is your Laplace approximation of the Bayesian posterior? Finite-sample computable error bounds for a variety of useful divergences 3
Identifiability of Causal Graphs under Non-Additive Conditionally Parametric Causal Models 5
Implicit vs Unfolded Graph Neural Networks 5
Imprecise Multi-Armed Bandits: Representing Irreducible Uncertainty as a Zero-Sum Game 1
Improving Graph Neural Networks on Multi-node Tasks with the Labeling Trick 5
Inferring Change Points in High-Dimensional Regression via Approximate Message Passing 4
Infinite-dimensional Mahalanobis Distance with Applications to Kernelized Novelty Detection 5
Instability, Computational Efficiency and Statistical Accuracy 0
Integral Probability Metrics Meet Neural Networks: The Radon-Kolmogorov-Smirnov Test 3
Interpretable Global Minima of Deep ReLU Neural Networks on Sequentially Separable Data 0
Invariant Subspace Decomposition 5
Jackpot: Approximating Uncertainty Domains with Adversarial Manifolds 3
Laplace Meets Moreau: Smooth Approximation to Infimal Convolutions Using Laplace's Method 2
Last-iterate Convergence of Shuffling Momentum Gradient Method under the Kurdyka-Lojasiewicz Inequality 1
Latent Process Models for Functional Network Data 6
Learning Global Nash Equilibrium in Team Competitive Games with Generalized Fictitious Cross-Play 4
Learning causal graphs via nonlinear sufficient dimension reduction 3
Learning conditional distributions on continuous spaces 4
Learning from Similar Linear Representations: Adaptivity, Minimaxity, and Robustness 6
Learning with Linear Function Approximations in Mean-Field Control 1
Learning-to-Optimize with PAC-Bayesian Guarantees: Theoretical Considerations and Practical Implementation 5
Lexicographic Lipschitz Bandits: New Algorithms and a Lower Bound 2
Lightning UQ Box: Uncertainty Quantification for Neural Networks 3
Linear Hypothesis Testing in High-Dimensional Expected Shortfall Regression with Heavy-Tailed Errors 3
Linear Separation Capacity of Self-Supervised Representation Learning 3
Linear cost and exponentially convergent approximation of Gaussian Matérn processes on intervals 5
Local Linear Recovery Guarantee of Deep Neural Networks at Overparameterization 2
Locally Private Causal Inference for Randomized Experiments 3
Losing Momentum in Continuous-time Stochastic Optimisation 4
Manifold Fitting under Unbounded Noise 5
Maximum Causal Entropy IRL in Mean-Field Games and GNEP Framework for Forward RL 2
Mean Aggregator is More Robust than Robust Aggregators under Label Poisoning Attacks on Distributed Heterogeneous Data 5
Memory Gym: Towards Endless Tasks to Benchmark Memory Capabilities of Agents 5
Minimax Optimal Deep Neural Network Classifiers Under Smooth Decision Boundary 2
Minimax Optimal Two-Sample Testing under Local Differential Privacy 2
Mixtures of Gaussian Process Experts with SMC^2 4
Model-free Change-Point Detection Using AUC of a Classifier 6
Modelling Populations of Interaction Networks via Distance Metrics 4
Multiple Instance Verification 6
Near-Optimal Nonconvex-Strongly-Convex Bilevel Optimization with Fully First-Order Oracles 5
Nonconvex Stochastic Bregman Proximal Gradient Method with Application to Deep Learning 6
Nonparametric Regression on Random Geometric Graphs Sampled from Submanifolds 0
On Adaptive Stochastic Optimization for Streaming Data: A Newton's Method with O(dN) Operations 2
On Consistent Bayesian Inference from Synthetic Data 3
On Global and Local Convergence of Iterative Linear Quadratic Optimization Algorithms for Discrete Time Nonlinear Control 3
On Inference for the Support Vector Machine 1
On Model Identification and Out-of-Sample Prediction of PCR with Applications to Synthetic Controls 5
On Non-asymptotic Theory of Recurrent Neural Networks in Temporal Point Processes 0
On Probabilistic Embeddings in Optimal Dimension Reduction 1
On the Ability of Deep Networks to Learn Symmetries from Data: A Neural Kernel Theory 4
On the Approximation of Kernel functions 0
On the Convergence of Projected Policy Gradient for Any Constant Step Sizes 1
On the Natural Gradient of the Evidence Lower Bound 1
On the O(sqrt(d)/T^(1/4)) Convergence Rate of RMSProp and Its Momentum Extension Measured by l_1 Norm 5
On the Representation of Pairwise Causal Background Knowledge and Its Applications in Causal Inference 2
On the Robustness of Kernel Goodness-of-Fit Tests 5
On the Statistical Properties of Generative Adversarial Models for Low Intrinsic Data Dimension 0
On the Utility of Equal Batch Sizes for Inference in Stochastic Gradient Descent 4
Online Quantile Regression 4
Ontolearn - A Framework for Large-scale OWL Class Expression Learning in Python 1
Operator Learning for Hyperbolic PDEs 2
Optimal Complexity in Byzantine-Robust Distributed Stochastic Optimization with Data Heterogeneity 4
Optimal Experiment Design for Causal Effect Identification 4
Optimal Rates of Kernel Ridge Regression under Source Condition in Large Dimensions 0
Optimal Sample Selection Through Uncertainty Estimation and Its Application in Deep Learning 5
Optimal and Efficient Algorithms for Decentralized Online Convex Optimization 1
Optimal subsampling for high-dimensional partially linear models via machine learning methods 5
Optimization Over a Probability Simplex 4
Optimizing Data Collection for Machine Learning 5
Optimizing Return Distributions with Distributional Dynamic Programming 3
Orthogonal Bases for Equivariant Graph Learning with Provable k-WL Expressive Power 4
Outlier Robust and Sparse Estimation of Linear Regression Coefficients 2
PFLlib: A Beginner-Friendly and Comprehensive Personalized Federated Learning Library and Benchmark 3
PREMAP: A Unifying PREiMage APproximation Framework for Neural Networks 4
Physics Informed Kolmogorov-Arnold Neural Networks for Dynamical Analysis via Efficient-KAN and WAV-KAN 4
Physics-informed Kernel Learning 3
Piecewise deterministic sampling with splitting schemes 3
Posterior Concentrations of Fully-Connected Bayesian Neural Networks with General Priors on the Weights 0
Posterior and Variational Inference for Deep Neural Networks with Heavy-Tailed Weights 0
Precise High-Dimensional Asymptotics for Quantifying Heterogeneous Transfers 2
Principled Penalty-based Methods for Bilevel Reinforcement Learning and RLHF 3
Prominent Roles of Conditionally Invariant Components in Domain Adaptation: Theory and Algorithms 4
Quantifying the Effectiveness of Linear Preconditioning in Markov Chain Monte Carlo 2
Random Pruning Over-parameterized Neural Networks Can Improve Generalization: A Training Dynamics Analysis 4
Random ReLU Neural Networks as Non-Gaussian Processes 0
Randomization Can Reduce Both Bias and Variance: A Case Study in Random Forests 3
Randomly Projected Convex Clustering Model: Motivation, Realization, and Cluster Recovery Guarantees 3
Rank-one Convexification for Sparse Regression 5
Recursive Causal Discovery 6
Regularized Rényi Divergence Minimization through Bregman Proximal Gradient Algorithms 3
Regularizing Hard Examples Improves Adversarial Robustness 6
Reinforcement Learning for Infinite-Dimensional Systems 3
Relaxed Gaussian Process Interpolation: a Goal-Oriented Approach to Bayesian Optimization 4
Reliever: Relieving the Burden of Costly Model Fits for Changepoint Detection 1
Revisiting Gradient Normalization and Clipping for Nonconvex SGD under Heavy-Tailed Noise: Necessity, Sufficiency, and Acceleration 1
Riemannian Bilevel Optimization 5
Robust Point Matching with Distance Profiles 3
Sample Complexity of the Linear Quadratic Regulator: A Reinforcement Learning Lens 2
Sampling and Estimation on Manifolds using the Langevin Diffusion 3
Scalable and Adaptive Variational Bayes Methods for Hawkes Processes 4
Scaling Capability in Token Space: An Analysis of Large Vision Language Model 5
Scaling Data-Constrained Language Models 5
Scaling ResNets in the Large-depth Regime 3
Score-Aware Policy-Gradient and Performance Guarantees using Local Lyapunov Stability 3
Score-Based Diffusion Models in Function Space 5
Score-based Causal Representation Learning: Linear and General Transformations 4
Selective Inference with Distributed Data 5
Sharp Bounds for Sequential Federated Learning on Heterogeneous Data 5
Simplex Constrained Sparse Optimization via Tail Screening 3
Sliced-Wasserstein Distances and Flows on Cartan-Hadamard Manifolds 6
Sparse SVM with Hard-Margin Loss: a Newton-Augmented Lagrangian Method in Reduced Dimensions 6
Sparse Semiparametric Discriminant Analysis for High-dimensional Zero-inflated Data 4
Stabilizing Sharpness-Aware Minimization Through A Simple Renormalization Strategy 6
Stable learning using spiking neural networks equipped with affine encoders and decoders 4
Statistical Inference of Constrained Stochastic Optimization via Sketched Sequential Quadratic Programming 3
Statistical Inference of Random Graphs With a Surrogate Likelihood Function 3
Statistical field theory for Markov decision processes under uncertainty 1
Stochastic Interior-Point Methods for Smooth Conic Optimization with Applications 6
Stochastic Interpolants: A Unifying Framework for Flows and Diffusions 3
Supervised Learning with Evolving Tasks and Performance Guarantees 5
System Neural Diversity: Measuring Behavioral Heterogeneity in Multi-Agent Learning 3
Talent: A Tabular Analytics and Learning Toolbox 5
Test-Time Training on Video Streams 5
The Blessing of Heterogeneity in Federated Q-Learning: Linear Speedup and Beyond 2
The Effect of SGD Batch Size on Autoencoder Learning: Sparsity, Sharpness, and Feature Learning 1
The ODE Method for Stochastic Approximation and Reinforcement Learning with Markovian Noise 0
TorchCP: A Python Library for Conformal Prediction 5
Towards Optimal Branching of Linear and Semidefinite Relaxations for Neural Network Robustness Certification 5
Towards Understanding Gradient Flow Dynamics of Homogeneous Neural Networks Beyond the Origin 2
Towards Unified Native Spaces in Kernel Methods 0
Transformers from Diffusion: A Unified Framework for Neural Message Passing 5
Two-Timescale Gradient Descent Ascent Algorithms for Nonconvex Minimax Optimization 4
Unbalanced Kantorovich-Rubinstein distance, plan, and barycenter on finite spaces: A statistical perspective 2
Understanding Deep Representation Learning via Layerwise Feature Compression and Discrimination 5
Unified Discrete Diffusion for Categorical Data 6
Universal Online Convex Optimization Meets Second-order Bounds 1
Universality of Kernel Random Matrices and Kernel Regression in the Quadratic Regime 1
Uplift Model Evaluation with Ordinal Dominance Graphs 3
VFOSA: Variance-Reduced Fast Operator Splitting Algorithms for Generalized Equations 4
Variance-Aware Estimation of Kernel Mean Embedding 1
Variational Inference for Uncertainty Quantification: an Analysis of Trade-offs 3
WEFE: A Python Library for Measuring and Mitigating Bias in Word Embeddings 1
Wasserstein Convergence Guarantees for a General Class of Score-Based Generative Models 3
Wasserstein F-tests for Frechet regression on Bures-Wasserstein manifolds 4
depyf: Open the Opaque Box of PyTorch Compiler for Machine Learning Researchers 3
gsplat: An Open-Source Library for Gaussian Splatting 7
skglm: Improving scikit-learn for Regularized Generalized Linear Models 3
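
Each row in the list above ends in an integer between 0 and 7. The page does not label this column; the sketch below assumes it is the number of the seven reproducibility variables documented in the paper. Under that assumption, the rows can be tallied into a distribution as follows. The helper `parse_row` is a hypothetical name, and only three rows are copied in for illustration.

```python
from collections import Counter


def parse_row(row: str) -> tuple[str, int]:
    """Split 'Title ... N' into (title, N), where N is the trailing integer."""
    title, _, count = row.rstrip().rpartition(" ")
    return title, int(count)


# Three rows copied from the list; a full run would pass all 269 rows.
rows = [
    "An Axiomatic Definition of Hierarchical Clustering 0",
    "Recursive Causal Discovery 6",
    "gsplat: An Open-Source Library for Gaussian Splatting 7",
]

parsed = [parse_row(r) for r in rows]
# Number of papers per documented-variable count.
histogram = Counter(count for _, count in parsed)
# Mean documented-variable count over the rows supplied.
mean_documented = sum(count for _, count in parsed) / len(parsed)

print(sorted(histogram.items()))  # [(0, 1), (6, 1), (7, 1)]
print(round(mean_documented, 2))  # 4.33
```

The list does not mark which papers are empirical, so a mean computed over all rows need not match the Documentation Score, which is defined over empirical papers only.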