Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).

International Conference on Machine Learning (ICML) - 2018

[Figure: Documentation Rate of Empirical Papers by Reproducibility Variable]

[Figure: Distribution of Empirical Papers by Number of Documented Variables]

Venue: ICML
Year: 2018
Papers: 621
Reproducibility Score: 0.42 (based on Gundersen et al. (2025); see Methods for details)
Documentation Score: 3.13 (the average score over the seven reproducibility variables for empirical research papers; see Methods for details)
% Empirical: 94.52% (percentage of papers that are empirical rather than theoretical research)
% Industry: 40.2% (percentage of empirical research papers with at least one author from industry)
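
The Documentation Score and the distribution by number of documented variables can be illustrated with a minimal Python sketch. It rests on two assumptions that the page does not confirm: that the number trailing each paper title in the listing is that paper's count of documented variables (0 to 7), and that the Documentation Score is the mean of those counts over empirical papers. The helper names `parse_row` and `documentation_score` are hypothetical, and the five sample rows are copied from the listing only to show the arithmetic; they do not reproduce the venue-level value of 3.13.

```python
from collections import Counter


def parse_row(line: str) -> tuple[str, int]:
    """Split a listing row into (title, count).

    Assumption: the last space-separated token is the number of
    documented reproducibility variables for that paper.
    """
    title, _, count = line.rpartition(" ")
    return title, int(count)


def documentation_score(counts: list[int]) -> float:
    """Mean number of documented variables per paper (assumed definition)."""
    return sum(counts) / len(counts)


# Five rows copied from the listing, used purely as sample input.
rows = [
    "$D^2$: Decentralized Training over Decentralized Data 3",
    "A Boo(n) for Evaluating Architecture Performance 2",
    "A Classification-Based Study of Covariate Shift in GAN Distributions 2",
    "A Delay-tolerant Proximal-Gradient Algorithm for Distributed Learning 4",
    "Adafactor: Adaptive Learning Rates with Sublinear Memory Cost 6",
]

papers = [parse_row(r) for r in rows]
counts = [count for _, count in papers]

# Number of papers at each count, i.e. the data behind a distribution chart.
distribution = Counter(counts)
score = documentation_score(counts)

for n_vars in sorted(distribution):
    print(f"{n_vars} documented variables: {distribution[n_vars]} paper(s)")
print(f"Mean documented variables over the sample: {score:.2f}")
```

Splitting on the last space keeps titles that themselves contain digits or punctuation intact. Restricting the input to empirical papers would require a per-paper empirical/theoretical label, which the listing does not show.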
Pseudocode
Open Source Code
Open Datasets
Dataset Splits
Hardware Specification
Software Dependencies
Experiment Setup
$D^2$: Decentralized Training over Decentralized Data 3
A Boo(n) for Evaluating Architecture Performance 2
A Classification-Based Study of Covariate Shift in GAN Distributions 2
A Conditional Gradient Framework for Composite Convex Minimization with Applications to Semidefinite Programming 3
A Delay-tolerant Proximal-Gradient Algorithm for Distributed Learning 4
A Distributed Second-Order Algorithm You Can Trust 3
A Fast and Scalable Joint Estimator for Integrating Additional Knowledge in Learning Multiple Related Sparse Gaussian Graphical Models 5
A Hierarchical Latent Vector Model for Learning Long-Term Structure in Music 3
A Primal-Dual Analysis of Global Optimality in Nonconvex Low-Rank Matrix Recovery 2
A Probabilistic Theory of Supervised Similarity Learning for Pointwise ROC Curve Optimization 1
A Progressive Batching L-BFGS Method for Machine Learning 4
A Reductions Approach to Fair Classification 5
A Robust Approach to Sequential Information Theoretic Planning 1
A Semantic Loss Function for Deep Learning with Symbolic Knowledge 4
A Simple Stochastic Variance Reduced Algorithm with Fast Convergence Rates 3
A Spectral Approach to Gradient Estimation for Implicit Distributions 3
A Spline Theory of Deep Learning 2
A Theoretical Explanation for Perplexing Behaviors of Backpropagation-based Visualizations 3
A Two-Step Computation of the Exact GAN Wasserstein Distance 3
A Unified Framework for Structured Low-rank Matrix Learning 5
A probabilistic framework for multi-view feature learning with many-to-many associations via neural networks 3
ADMM and Accelerated ADMM as Continuous Dynamical Systems 1
Accelerated Spectral Ranking 4
Accelerating Greedy Coordinate Descent Methods 2
Accelerating Natural Gradient with Higher-Order Invariance 3
Accurate Inference for Adaptive Linear Models 3
Accurate Uncertainties for Deep Learning Using Calibrated Regression 3
Active Learning with Logged Data 4
Active Testing: An Efficient and Robust Framework for Estimating Accuracy 3
Adafactor: Adaptive Learning Rates with Sublinear Memory Cost 6
Adaptive Exploration-Exploitation Tradeoff for Opportunistic Bandits 3
Adaptive Sampled Softmax with Kernel Based Sampling 1
Adaptive Three Operator Splitting 3
Addressing Function Approximation Error in Actor-Critic Methods 4
Adversarial Attack on Graph Structured Data 3
Adversarial Distillation of Bayesian Neural Network Posteriors 5
Adversarial Learning with Local Coordinate Coding 4
Adversarial Regression with Multiple Learners 2
Adversarial Risk and the Dangers of Evaluating Against Weak Attacks 4
Adversarial Time-to-Event Modeling 3
Adversarially Regularized Autoencoders 5
Alternating Randomized Block Coordinate Descent 2
An Algorithmic Framework of Variable Metric Over-Relaxed Hybrid Proximal Extra-Gradient Method 3
An Alternative View: When Does SGD Escape Local Minima? 2
An Efficient Semismooth Newton based Algorithm for Convex Clustering 4
An Efficient, Generalized Bellman Update For Cooperative Inverse Reinforcement Learning 1
An Estimation and Analysis Framework for the Rasch Model 3
An Inference-Based Policy Gradient Method for Learning Options 3
An Iterative, Sketching-based Framework for Ridge Regression 3
An Optimal Control Approach to Deep Learning and Applications to Discrete-Weight Neural Networks 3
Analysis of Minimax Error Rate for Crowdsourcing and Its Application to Worker Clustering Model 3
Analyzing Uncertainty in Neural Machine Translation 3
Analyzing the Robustness of Nearest Neighbors to Adversarial Examples 5
Anonymous Walk Embeddings 6
Approximate Leave-One-Out for Fast Parameter Tuning in High Dimensions 2
Approximate message passing for amplitude based optimization 2
Approximation Algorithms for Cascading Prediction Models 4
Approximation Guarantees for Adaptive Sampling 3
Asynchronous Byzantine Machine Learning (the case of SGD) 3
Asynchronous Decentralized Parallel Stochastic Gradient Descent 4
Asynchronous Stochastic Quasi-Newton MCMC for Non-Convex Optimization 3
Attention-based Deep Multiple Instance Learning 3
Augment and Reduce: Stochastic Inference for Large Categorical Distributions 4
Augmented CycleGAN: Learning Many-to-Many Mappings from Unpaired Data 2
AutoPrognosis: Automated Clinical Prognostic Modeling via Bayesian Optimization with Structured Kernel Learning 3
Automatic Goal Generation for Reinforcement Learning Agents 3
Autoregressive Convolutional Neural Networks for Asynchronous Time Series 5
Autoregressive Quantile Networks for Generative Modeling 2
BOCK : Bayesian Optimization with Cylindrical Kernels 5
BOHB: Robust and Efficient Hyperparameter Optimization at Scale 6
Bandits with Delayed, Aggregated Anonymous Feedback 2
Batch Bayesian Optimization via Multi-objective Acquisition Ensemble for Automated Analog Circuit Design 4
Bayesian Coreset Construction via Greedy Iterative Geodesic Ascent 3
Bayesian Model Selection for Change Point Detection and Clustering 1
Bayesian Optimization of Combinatorial Structures 4
Bayesian Quadrature for Multiple Related Integrals 1
Bayesian Uncertainty Estimation for Batch Normalized Deep Networks 5
Been There, Done That: Meta-Learning with Episodic Recall 1
Best Arm Identification in Linear Bandits with Linear Dimension Dependency 3
Beyond 1/2-Approximation for Submodular Maximization on Massive Data Streams 3
Beyond Finite Layer Neural Networks: Bridging Deep Architectures and Numerical Differential Equations 3
Beyond the One-Step Greedy Approach in Reinforcement Learning 2
Bilevel Programming for Hyperparameter Optimization and Meta-Learning 6
Binary Classification with Karmic, Threshold-Quasi-Concave Metrics 1
Binary Partitions with Approximate Minimum Impurity 6
Black Box FDR 2
Black-Box Variational Inference for Stochastic Differential Equations 5
Black-box Adversarial Attacks with Limited Queries and Information 4
Blind Justice: Fairness with Encrypted Sensitive Attributes 4
Born Again Neural Networks 3
Bounding and Counting Linear Regions of Deep Neural Networks 4
Bounds on the Approximation Power of Feedforward Neural Networks 0
Bucket Renormalization for Approximate Inference 3
Budgeted Experiment Design for Causal Structure Learning 4
Byzantine-Robust Distributed Learning: Towards Optimal Statistical Rates 3
CRAFTML, an Efficient Clustering-based Random Forest for Extreme Multi-label Learning 4
CRVI: Convex Relaxation for Variational Inference 2
Can Deep Reinforcement Learning Solve Erdos-Selfridge-Spencer Games? 3
Candidates vs. Noises Estimation for Large Multi-Class Classification Problem 5
Canonical Tensor Decomposition for Knowledge Base Completion 5
Causal Bandits with Propagating Inference 2
Celer: a Fast Solver for the Lasso with Dual Extrapolation 4
Characterizing Implicit Bias in Terms of Optimization Geometry 1
Characterizing and Learning Equivalence Classes of Causal DAGs under Interventions 3
Chi-square Generative Adversarial Network 4
Classification from Pairwise Similarity and Unlabeled Data 5
Clipped Action Policy Gradient 3
Closed-form Marginal Likelihood in Gamma-Poisson Matrix Factorization 3
Clustering Semi-Random Mixtures of Gaussians 1
CoVeR: Learning Covariate-Specific Vector Representations with Tensor Decompositions 2
Coded Sparse Matrix Multiplication 3
Communication-Computation Efficient Gradient Coding 3
Comparing Dynamics: Deep Neural Networks versus Glassy Systems 3
Comparison-Based Random Forests 4
Competitive Caching with Machine Learned Advice 2
Competitive Multi-agent Inverse Reinforcement Learning with Sub-optimal Demonstrations 2
Compiling Combinatorial Prediction Games 5
Composable Planning with Attributes 2
Composite Functional Gradient Learning of Generative Adversarial Models 5
Composite Marginal Likelihood Methods for Random Utility Models 2
Compressing Neural Networks using the Variational Information Bottleneck 3
Computational Optimal Transport: Complexity by Accelerated Gradient Descent Is Better Than by Sinkhorn’s Algorithm 4
Conditional Neural Processes 2
Conditional Noise-Contrastive Estimation of Unnormalised Models 2
Configurable Markov Decision Processes 1
Constant-Time Predictive Distributions for Gaussian Processes 6
Constrained Interacting Submodular Groupings 4
Constraining the Dynamics of Deep Probabilistic Models 2
ContextNet: Deep learning for Star Galaxy Classification 2
Contextual Graph Markov Model: A Deep and Generative Approach to Graph Processing 4
Continual Reinforcement Learning with Complex Synapses 2
Continuous and Discrete-time Accelerated Stochastic Mirror Descent for Strongly Convex Functions 3
Continuous-Time Flows for Efficient Inference and Density Estimation 3
Convergence guarantees for a class of non-convex and non-smooth optimization problems 2
Convergent Tree Backup and Retrace with Function Approximation 3
Convolutional Imputation of Matrix Networks 3
Coordinated Exploration in Concurrent Reinforcement Learning 1
Covariate Adjusted Precision Matrix Estimation via Nonconvex Optimization 4
Crowdsourcing with Arbitrary Adversaries 3
Curriculum Learning by Transfer Learning: Theory and Experiments with Deep Networks 1
Cut-Pursuit Algorithm for Regularizing Nonsmooth Functionals with Graph Total Variation 4
CyCADA: Cycle-Consistent Adversarial Domain Adaptation 2
DCFNet: Deep Neural Network with Decomposed Convolutional Filters 3
DICOD: Distributed Convolutional Coordinate Descent for Convolutional Sparse Coding 5
DRACO: Byzantine-resilient Distributed Training via Redundant Gradients 5
DVAE++: Discrete Variational Autoencoders with Overlapping Transformations 3
Data Summarization at Scale: A Two-Stage Submodular Approach 5
Data-Dependent Stability of Stochastic Gradient Descent 1
Decentralized Submodular Maximization: Bridging Discrete and Continuous Settings 2
Decomposition of Uncertainty in Bayesian Deep Learning for Efficient and Risk-sensitive Learning 0
Decoupled Parallel Backpropagation with Convergence Guarantee 4
Decoupling Gradient-Like Learning Rules from Representations 1
Deep Asymmetric Multi-task Feature Learning 3
Deep Bayesian Nonparametric Tracking 3
Deep Density Destructors 5
Deep Linear Networks with Arbitrary Loss: All Local Minima Are Global 0
Deep Models of Interactions Across Sets 3
Deep One-Class Classification 3
Deep Predictive Coding Network for Object Recognition 4
Deep Reinforcement Learning in Continuous Action Spaces: a Case Study in the Game of Simulated Curling 4
Deep Variational Reinforcement Learning for POMDPs 3
Deep k-Means: Re-Training and Parameter Sharing with Harder Cluster Assignments for Compressing Deep Convolutions 3
Delayed Impact of Fair Machine Learning 2
Dependent Relational Gamma Process Models for Longitudinal Networks 3
Design of Experiments for Model Discrimination Hybridising Analytical and Data-Driven Approaches 3
Detecting and Correcting for Label Shift with Black Box Predictors 4
Detecting non-causal artifacts in multivariate linear regression models 3
DiCE: The Infinitely Differentiable Monte Carlo Estimator 3
Differentiable Abstract Interpretation for Provably Robust Neural Networks 4
Differentiable Compositional Kernel Learning for Gaussian Processes 4
Differentiable Dynamic Programming for Structured Prediction and Attention 5
Differentiable plasticity: training plastic neural networks with backpropagation 3
Differentially Private Database Release via Kernel Mean Embeddings 3
Differentially Private Identity and Equivalence Testing of Discrete Distributions 3
Differentially Private Matrix Completion Revisited 3
Dimensionality-Driven Learning with Noisy Labels 4
Discovering Interpretable Representations for Both Deep Generative and Discriminative Models 2
Discovering and Removing Exogenous State Variables and Rewards for Reinforcement Learning 2
Discrete-Continuous Mixtures in Probabilistic Programming: Generalized Semantics and Inference Algorithms 1
Disentangled Sequential Autoencoder 3
Disentangling by Factorising 2
Dissecting Adam: The Sign, Magnitude and Variance of Stochastic Gradients 4
Dissipativity Theory for Accelerating Stochastic Variance Reduction: A Unified Analysis of SVRG and Katyusha Using Semidefinite Programs 0
Distributed Asynchronous Optimization with Unbounded Delays: How Slow Can You Go? 3
Distributed Clustering via LSH Based Data Partitioning 3
Distributed Nonparametric Regression under Communication Constraints 1
Do Outliers Ruin Collaboration? 1
Does Distributionally Robust Supervised Learning Give Robust Classifiers? 3
Dropout Training, Data-dependent Regularization, and Generalization Bounds 2
Dynamic Evaluation of Neural Sequence Models 3
Dynamic Regret of Strongly Adaptive Methods 1
Dynamical Isometry and a Mean Field Theory of CNNs: How to Train 10,000-Layer Vanilla Convolutional Neural Networks 4
Dynamical Isometry and a Mean Field Theory of RNNs: Gating Enables Signal Propagation in Recurrent Neural Networks 3
Efficient Bias-Span-Constrained Exploration-Exploitation in Reinforcement Learning 3
Efficient First-Order Algorithms for Adaptive Signal Denoising 3
Efficient Gradient-Free Variational Inference using Policy Search 3
Efficient Model-Based Deep Reinforcement Learning with Variational State Tabulation 4
Efficient Neural Architecture Search via Parameters Sharing 4
Efficient Neural Audio Synthesis 3
Efficient and Consistent Adversarial Bipartite Matching 3
Efficient end-to-end learning for quantizable representations 6
End-to-End Learning for the Deep Multivariate Probit Model 5
End-to-end Active Object Tracking via Reinforcement Learning 2
Entropy-SGD optimizes the prior of a PAC-Bayes bound: Generalization properties of Entropy-SGD and data-dependent priors 3
Equivalence of Multicategory SVM and Simplex Cone SVM: Fast Computations and Statistical Theory 3
Error Compensated Quantized SGD and its Applications to Large-scale Distributed Optimization 5
Error Estimation for Randomized Least-Squares Algorithms via the Bootstrap 3
Escaping Saddles with Stochastic Gradients 3
Essentially No Barriers in Neural Network Energy Landscape 4
Estimation of Markov Chain via Rank-Constrained Likelihood 2
Explicit Inductive Bias for Transfer Learning with Convolutional Networks 3
Exploiting the Potential of Standard Convolutional Autoencoders for Image Restoration by Evolutionary Search 5
Exploring Hidden Dimensions in Accelerating Convolutional Neural Networks 5
Extracting Automata from Recurrent Neural Networks Using Queries and Counterexamples 4
Extreme Learning to Rank via Low Rank Assumption 5
Fair and Diverse DPP-Based Data Summarization 3
Fairness Without Demographics in Repeated Loss Minimization 2
Fast Approximate Spectral Clustering for Dynamic Networks 2
Fast Bellman Updates for Robust MDPs 5
Fast Decoding in Sequence Models Using Discrete Latent Variables 5
Fast Gradient-Based Methods with Exponential Rate: A Hybrid Control Framework 2
Fast Information-theoretic Bayesian Optimisation 6
Fast Maximization of Non-Submodular, Monotonic Functions on the Integer Lattice 4
Fast Parametric Learning with Activation Memorization 5
Fast Stochastic AUC Maximization with $O(1/n)$-Convergence Rate 4
Fast Variance Reduction Method with Stochastic Batch Size 3
Fast and Sample Efficient Inductive Matrix Completion via Multi-Phase Procrustes Flow 5
Fast and Scalable Bayesian Deep Learning by Weight-Perturbation in Adam 4
Faster Derivative-Free Stochastic Algorithm for Shared Memory Machines 3
Feasible Arm Identification 3
Feedback-Based Tree Search for Reinforcement Learning 2
Finding Influential Training Samples for Gradient Boosted Decision Trees 3
Firing Bandits: Optimizing Crowdfunding 2
First Order Generative Adversarial Networks 3
Fitting New Speakers Based on a Short Untranscribed Sample 3
Fixing a Broken ELBO 2
Focused Hierarchical RNNs for Conditional Sequence Processing 3
Fourier Policy Gradients 1
Frank-Wolfe with Subsampling Oracle 3
Fully Decentralized Multi-Agent Reinforcement Learning with Networked Agents 2
Functional Gradient Boosting based on Residual Network Perception 4
GAIN: Missing Data Imputation using Generative Adversarial Nets 4
GEP-PG: Decoupling Exploration and Exploitation in Deep Reinforcement Learning Algorithms 3
Gated Path Planning Networks 2
Generalization without Systematicity: On the Compositional Skills of Sequence-to-Sequence Recurrent Networks 3
Generalized Earley Parser: Bridging Symbolic Grammars and Sequence Data for Future Prediction 3
Generalized Robust Bayesian Committee Machine for Large-scale Gaussian Process Regression 4
Generative Temporal Models with Spatial Memory for Partially Observed Environments 2
Geodesic Convolutional Shape Optimization 3
Geometry Score: A Method For Comparing Generative Adversarial Networks 5
Global Convergence of Policy Gradient Methods for the Linear Quadratic Regulator 1
Goodness-of-Fit Testing for Discrete Distributions via Stein Discrepancy 3
GradNorm: Gradient Normalization for Adaptive Loss Balancing in Deep Multitask Networks 6
Gradient Coding from Cyclic MDS Codes and Expander Graphs 4
Gradient Descent Learns One-hidden-layer CNN: Don’t be Afraid of Spurious Local Minima 1
Gradient Descent for Sparse Rank-One Matrix Completion for Crowd-Sourced Aggregation of Sparsely Interacting Workers 2
Gradient Primal-Dual Algorithm Converges to Second-Order Stationary Solution for Nonconvex Distributed Optimization Over Networks 1
Gradient descent with identity initialization efficiently learns positive definite linear transformations by deep residual networks 1
Gradient-Based Meta-Learning with Learned Layerwise Metric and Subspace 3
Gradually Updated Neural Networks for Large-Scale Image Recognition 5
Graph Networks as Learnable Physics Engines for Inference and Control 3
GraphRNN: Generating Realistic Graphs with Deep Auto-regressive Models 3
Graphical Nonconvex Optimization via an Adaptive Convex Relaxation 2
Greed is Still Good: Maximizing Monotone Submodular+Supermodular (BP) Functions 2
Hierarchical Clustering with Structural Constraints 2
Hierarchical Deep Generative Models for Multi-Rate Multivariate Time Series 4
Hierarchical Imitation and Reinforcement Learning 3
Hierarchical Long-term Video Prediction without Supervision 4
Hierarchical Multi-Label Classification Networks 3
Hierarchical Text Generation and Planning for Strategic Dialogue 2
High Performance Zero-Memory Overhead Direct Convolutions 3
High-Quality Prediction Intervals for Deep Learning: A Distribution-Free, Ensembled Approach 4
Hyperbolic Entailment Cones for Learning Hierarchical Embeddings 4
IMPALA: Scalable Distributed Deep-RL with Importance Weighted Actor-Learner Architectures 3
INSPECTRE: Privately Estimating the Unseen 3
Image Transformer 5
Implicit Quantile Networks for Distributional Reinforcement Learning 2
Implicit Regularization in Nonconvex Statistical Estimation: Gradient Descent Converges Linearly for Phase Retrieval and Matrix Completion 2
Importance Weighted Transfer of Samples in Reinforcement Learning 3
Improved Regret Bounds for Thompson Sampling in Linear Quadratic Control Problems 2
Improved Training of Generative Adversarial Networks Using Representative Features 2
Improved large-scale graph learning through ridge spectral sparsification 3
Improved nearest neighbor search using auxiliary information and priority functions 3
Improving Optimization for Models With Continuous Symmetry Breaking 4
Improving Regression Performance with Distributional Losses 2
Improving Sign Random Projections With Additional Information 3
Improving the Gaussian Mechanism for Differential Privacy: Analytical Calibration and Optimal Denoising 4
Improving the Privacy and Accuracy of ADMM-Based Distributed Algorithms 3
Inductive Two-Layer Modeling with Parametric Bregman Transfer 2
Inference Suboptimality in Variational Autoencoders 2
Information Theoretic Guarantees for Empirical Risk Minimization with Applications to Model Selection and Large-Scale Optimization 3
Inter and Intra Topic Structure Learning with Word Embeddings 4
Interpretability Beyond Feature Attribution: Quantitative Testing with Concept Activation Vectors (TCAV) 1
Invariance of Weight Distributions in Rectified MLPs 2
Investigating Human Priors for Playing Video Games 2
Is Generator Conditioning Causally Related to GAN Performance? 3
Iterative Amortized Inference 5
JointGAN: Multi-Domain Joint Distribution Learning with Generative Adversarial Nets 3
Junction Tree Variational Autoencoder for Molecular Graph Generation 3
K-Beam Minimax: Efficient Optimization for Deep Adversarial Learning 5
K-means clustering using random matrix sparsification 3
Katyusha X: Simple Momentum Method for Stochastic Sum-of-Nonconvex Optimization 1
Kernel Recursive ABC: Point Estimation with Intractable Likelihood 4
Kernelized Synaptic Weight Matrices 4
Knowledge Transfer with Jacobian Matching 1
Kronecker Recurrent Units 3
LaVAN: Localized and Visible Adversarial Noise 3
Large-Scale Cox Process Inference using Variational Fourier Features 3
Large-Scale Sparse Inverse Covariance Estimation via Thresholding and Max-Det Matrix Completion 6
Latent Space Policies for Hierarchical Reinforcement Learning 3
LeapsAndBounds: A Method for Approximately Optimal Algorithm Configuration 5
Learn from Your Neighbor: Learning Multi-modal Mappings from Sparse Annotations 3
Learning Adversarially Fair and Transferable Representations 4
Learning Binary Latent Variable Models: A Tensor Eigenpair Approach 3
Learning Compact Neural Networks with Regularization 1
Learning Continuous Hierarchies in the Lorentz Model of Hyperbolic Geometry 3
Learning Deep ResNet Blocks Sequentially using Boosting Theory 5
Learning Diffusion using Hyperparameters 2
Learning Dynamics of Linear Denoising Autoencoders 3
Learning Equations for Extrapolation and Control 3
Learning Hidden Markov Models from Pairwise Co-occurrences with Application to Topic Modeling 3
Learning Implicit Generative Models with the Method of Learned Moments 3
Learning Independent Causal Mechanisms 3
Learning K-way D-dimensional Discrete Codes for Compact Embedding Representations 4
Learning Localized Spatio-Temporal Models From Streaming Data 4
Learning Long Term Dependencies via Fourier Recurrent Units 3
Learning Longer-term Dependencies in RNNs with Auxiliary Losses 3
Learning Low-Dimensional Temporal Representations 4
Learning Maximum-A-Posteriori Perturbation Models for Structured Prediction in Polynomial Time 2
Learning Memory Access Patterns 2
Learning One Convolutional Layer with Overlapping Patches 2
Learning Policy Representations in Multiagent Systems 3
Learning Registered Point Processes from Idiosyncratic Observations 2
Learning Representations and Generative Models for 3D Point Clouds 4
Learning Semantic Representations for Unsupervised Domain Adaptation 4
Learning Steady-States of Iterative Algorithms over Graphs 5
Learning a Mixture of Two Multinomial Logits 0
Learning and Memorization 2
Learning by Playing Solving Sparse Reward Tasks from Scratch 1
Learning in Integer Latent Variable Models with Nested Automatic Differentiation 3
Learning in Reproducing Kernel Kreı̆n Spaces 3
Learning the Reward Function for a Misspecified Model 2
Learning to Act in Decentralized Partially Observable MDPs 4
Learning to Branch 5
Learning to Coordinate with Coordination Graphs in Repeated Single-Stage Multi-Agent Decision Problems 3
Learning to Explain: An Information-Theoretic Perspective on Model Interpretation 4
Learning to Explore via Meta-Policy Gradient 4
Learning to Optimize Combinatorial Functions 2
Learning to Reweight Examples for Robust Deep Learning 4
Learning to Speed Up Structured Output Prediction 4
Learning to search with MCTSnets 1
Learning unknown ODE models with Gaussian processes 4
Learning with Abandonment 2
Least-Squares Temporal Difference Learning for the Linear Quadratic Regulator 2
Let’s be Honest: An Optimal No-Regret Framework for Zero-Sum Games 1
Level-Set Methods for Finite-Sum Constrained Convex Optimization 3
Leveraging Well-Conditioned Bases: Streaming and Distributed Summaries in Minkowski $p$-Norms 4
Lightweight Stochastic Optimization for Minimizing Finite Sums with Infinite Data 3
Limits of Estimating Heterogeneous Treatment Effects: Guidelines for Practical Algorithm Design 3
Linear Spectral Estimators and an Application to Phase Retrieval 2
Lipschitz Continuity in Model-based Reinforcement Learning 3
Local Convergence Properties of SAGA/Prox-SVRG and Acceleration 3
Local Density Estimation in High Dimensions 3
Local Private Hypothesis Testing: Chi-Square Tests 2
Locally Private Hypothesis Testing 1
Loss Decomposition for Fast Learning in Large Output Spaces 3
Low-Rank Riemannian Optimization on Positive Semidefinite Stochastic Matrices with Applications to Graph Clustering 3
Lyapunov Functions for First-Order Methods: Tight Automated Convergence Guarantees 2
MAGAN: Aligning Biological Manifolds 2
MISSION: Ultra Large-Scale Feature Selection using Count-Sketches 5
MSplit LBI: Realizing Feature Selection and Dense Estimation Simultaneously in Few-shot and Zero-shot Learning 3
Machine Theory of Mind 0
Make the Minority Great Again: First-Order Regret Bound for Contextual Bandits 1
Markov Modulated Gaussian Cox Processes for Semi-Stationary Intensity Modeling of Events Data 3
Massively Parallel Algorithms and Hardness for Single-Linkage Clustering under $\ell_p$ Distances 4
Matrix Norms in Data Streams: Faster, Multi-Pass and Row-Order 0
Max-Mahalanobis Linear Discriminant Analysis Networks 4
Mean Field Multi-Agent Reinforcement Learning 3
Measuring abstract reasoning in neural networks 4
MentorNet: Learning Data-Driven Curriculum for Very Deep Neural Networks on Corrupted Labels 4
Message Passing Stein Variational Gradient Descent 3
Meta-Learning by Adjusting Priors Based on Extended PAC-Bayes Theory 4
Minibatch Gibbs Sampling on Large Graphical Models 2
Minimal I-MAP MCMC for Scalable Structure Discovery in Causal DAG Models 3
Minimax Concave Penalized Multi-Armed Bandit Model with High-Dimensional Covariates 3
Mitigating Bias in Adaptive Data Gathering via Differential Privacy 1
Mix & Match Agent Curricula for Reinforcement Learning 1
Mixed batches and symmetric discriminators for GAN training 2
Model-Level Dual Learning 4
Modeling Others using Oneself in Multi-Agent Reinforcement Learning 3
Modeling Sparse Deviations for Compressed Sensing using Generative Models 3
More Robust Doubly Robust Off-policy Evaluation 2
Multi-Fidelity Black-Box Optimization with Hierarchical Partitions 5
Multicalibration: Calibration for the (Computationally-Identifiable) Masses 1
Mutual Information Neural Estimation 3
Near Optimal Frequent Directions for Sketching Dense and Sparse Matrices 1
Nearly Optimal Robust Subspace Tracking 3
NetGAN: Generating Graphs via Random Walks 5
Network Global Testing by Counting Graphlets 2
Neural Autoregressive Flows 3
Neural Dynamic Programming for Musical Self Similarity 4
Neural Inverse Rendering for General Reflectance Photometric Stereo 3
Neural Networks Should Be Wide Enough to Learn Disconnected Decision Regions 2
Neural Program Synthesis from Diverse Demonstration Videos 2
Neural Relational Inference for Interacting Systems 4
Noise2Noise: Learning Image Restoration without Clean Data 4
Noisin: Unbiased Regularization for Recurrent Neural Networks 4
Noisy Natural Gradient as Variational Inference 3
Non-convex Conditional Gradient Sliding 3
Non-linear motor control by local learning in spiking neural networks 2
Nonconvex Optimization for Regression with Fairness Constraints 4
Nonoverlap-Promoting Variable Selection 4
Nonparametric Regression with Comparisons: Escaping the Curse of Dimensionality with Ordinal Information 4
Nonparametric variable importance using an augmented neural network with multi-task learning 3
Not All Samples Are Created Equal: Deep Learning with Importance Sampling 5
Not to Cry Wolf: Distantly Supervised Multitask Learning in Critical Care 2
Obfuscated Gradients Give a False Sense of Security: Circumventing Defenses to Adversarial Examples 3
On Acceleration with Noise-Corrupted Gradients 2
On Learning Sparsely Used Dictionaries from Incomplete Samples 3
On Matching Pursuit and Coordinate Descent 3
On Nesting Monte Carlo Estimators 0
On the Generalization of Equivariance and Convolution in Neural Networks to the Action of Compact Groups 0
On the Implicit Bias of Dropout 2
On the Limitations of First-Order Approximation in GAN Dynamics 1
On the Optimization of Deep Networks: Implicit Acceleration by Overparameterization 2
On the Power of Over-parametrization in Neural Networks with Quadratic Activation 0
On the Relationship between Data Efficiency and Error for Uncertainty Sampling 4
On the Spectrum of Random Features Maps of High Dimensional Data 2
On the Theory of Variance Reduction for Stochastic Gradient Monte Carlo 3
One-Shot Segmentation in Clutter 4
Online Convolutional Sparse Coding with Sample-Dependent Dictionary 5
Online Learning with Abstention 3
Online Linear Quadratic Control 2
Open Category Detection with PAC Guarantees 5
Optimal Distributed Learning with Multi-pass Stochastic Gradient Methods 2
Optimal Rates of Sketched-regularized Algorithms for Least-Squares Regression over Hilbert Spaces 1
Optimal Tuning for Divide-and-conquer Kernel Ridge Regression with Massive Data 3
Optimization Landscape and Expressivity of Deep CNNs 2
Optimization, fast and slow: optimally switching between local and Bayesian optimization 3
Optimizing the Latent Space of Generative Networks 2
Orthogonal Machine Learning: Power and Limitations 3
Orthogonal Recurrent Neural Networks with Scaled Cayley Transform 4
Orthogonality-Promoting Distance Metric Learning: Convex Relaxation and Theoretical Analysis 3
Out-of-sample extension of graph adjacency spectral embedding 1
Overcoming Catastrophic Forgetting with Hard Attention to the Task 5
PDE-Net: Learning PDEs from Data 2
PIPPS: Flexible Model-Based Policy Search Robust to the Curse of Chaos 3
Parallel Bayesian Network Structure Learning 3
Parallel WaveNet: Fast High-Fidelity Speech Synthesis 3
Parallel and Streaming Algorithms for K-Core Decomposition 3
Parameterized Algorithms for the Matrix Completion Problem 0
Partial Optimality and Fast Lower Bounds for Weighted Correlation Clustering 4
Path Consistency Learning in Tsallis Entropy Regularized MDPs 3
Path-Level Network Transformation for Efficient Architecture Search 4
Pathwise Derivatives Beyond the Reparameterization Trick 3
PixelSNAIL: An Improved Autoregressive Generative Model 3
Policy Optimization as Wasserstein Gradient Flows 4
Policy Optimization with Demonstrations 2
Policy and Value Transfer in Lifelong Reinforcement Learning 3
Practical Contextual Bandits with Regression Oracles 4
PredRNN++: Towards A Resolution of the Deep-in-Time Dilemma in Spatiotemporal Predictive Learning 4
Predict and Constrain: Modeling Cardinality in Deep Structured Prediction 3
Prediction Rule Reshaping 5
Preventing Fairness Gerrymandering: Auditing and Learning for Subgroup Fairness 2
Probabilistic Boolean Tensor Decomposition 5
Probabilistic Recurrent State-Space Models 3
Probably Approximately Metric-Fair Learning 0
Problem Dependent Reinforcement Learning Bounds Which Can Identify Bandit Structure in MDPs 1
Programmatically Interpretable Reinforcement Learning 3
Progress & Compress: A scalable framework for continual learning 1
Projection-Free Online Optimization with Stochastic Gradient: From Convexity to Submodularity 4
Proportional Allocation: Simple, Distributed, and Diverse Matching with High Entropy 1
Provable Defenses against Adversarial Examples via the Convex Outer Adversarial Polytope 5
Provable Variable Selection for Streaming Features 2
Pseudo-task Augmentation: From Deep Multitask Learning to Intratask Sharing—and Back 4
QMIX: Monotonic Value Function Factorisation for Deep Multi-Agent Reinforcement Learning 1
QuantTree: Histograms for Change Detection in Multivariate Data Streams 4
Quasi-Monte Carlo Variational Inference 3
Quickshift++: Provably Good Initializations for Sample-Based Mean Shift 4
RLlib: Abstractions for Distributed Reinforcement Learning 5
Racing Thompson: an Efficient Algorithm for Thompson Sampling with Non-conjugate Priors 2
RadialGAN: Leveraging multiple datasets to improve target-specific predictive models using Generative Adversarial Networks 3
Randomized Block Cubic Newton Method 2
Ranking Distributions based on Noisy Sorting 3
Rapid Adaptation with Conditionally Shifted Neurons 2
Rates of Convergence of Spectral Methods for Graphon Estimation 2
Rectify Heterogeneous Models with Semantic Mapping 1
Recurrent Predictive State Policy Networks 4
Regret Minimization for Partially Observable Deep Reinforcement Learning 3
Reinforcement Learning with Function-Valued Action Spaces for Partial Differential Equation Control 1
Reinforcing Adversarial Robustness using Model Confidence Induced by Adversarial Training 3
Representation Learning on Graphs with Jumping Knowledge Networks 3
Representation Tradeoffs for Hyperbolic Embeddings 3
Residual Unfairness in Fair Machine Learning from Prejudiced Data 1
Revealing Common Statistical Behaviors in Heterogeneous Populations 4
Reviving and Improving Recurrent Back-Propagation 5
Riemannian Stochastic Recursive Gradient Algorithm 6
Robust and Scalable Models of Microbiome Dynamics 2
SADAGRAD: Strongly Adaptive Stochastic Gradient Methods 3
SAFFRON: an Adaptive Algorithm for Online Control of the False Discovery Rate 2
SBEED: Convergent Reinforcement Learning with Nonlinear Function Approximation 3
SGD and Hogwild! Convergence Without the Bounded Gradients Assumption 3
SMAC: Simultaneous Mapping and Clustering Using Spectral Decompositions 2
SQL-Rank: A Listwise Approach to Collaborative Ranking 6
Safe Element Screening for Submodular Function Minimization 4
Scalable Bilinear Pi Learning Using State and Action Features 1
Scalable Deletion-Robust Submodular Maximization: Data Summarization with Privacy and Fairness Constraints 2
Scalable Gaussian Processes with Grid-Structured Eigenfunctions (GP-GRIEF) 6
Scalable approximate Bayesian inference for particle tracking data 4
Selecting Representative Examples for Program Synthesis 2
Self-Bounded Prediction Suffix Tree via Approximate String Matching 4
Self-Consistent Trajectory Autoencoder: Hierarchical Reinforcement Learning with Trajectory Embeddings 2
Self-Imitation Learning 4
Semi-Amortized Variational Autoencoders 4
Semi-Implicit Variational Inference 5
Semi-Supervised Learning on Data Streams via Temporal Label Propagation 4
Semi-Supervised Learning via Compact Latent Space Clustering 4
Semiparametric Contextual Bandits 3
Shampoo: Preconditioned Stochastic Tensor Optimization 4
Signal and Noise Statistics Oblivious Orthogonal Matching Pursuit 3
Smoothed Action Value Functions for Learning Gaussian Policies 3
Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor 4
Solving Partial Assignment Problems using Random Clique Complexes 4
Sound Abstraction and Decomposition of Probabilistic Programs 2
SparseMAP: Differentiable Sparse Structured Inference 5
Spatio-temporal Bayesian On-line Changepoint Detection with Model Selection 4
Spectrally Approximating Large Graphs with Smaller Graphs 3
Spline Filters For End-to-End Deep Learning 3
Spotlight: Optimizing Device Placement for Training Deep Neural Networks 4
Spurious Local Minima are Common in Two-Layer ReLU Neural Networks 4
Stability and Generalization of Learning Algorithms that Converge to Global Optima 0
Stabilizing Gradients for Deep Neural Networks via Efficient SVD Parameterization 5
Stagewise Safe Bayesian Optimization with Gaussian Processes 2
State Abstractions for Lifelong Reinforcement Learning 2
State Space Gaussian Processes with Non-Gaussian Likelihood 7
Stein Points 1
Stein Variational Gradient Descent Without Gradient 3
Stein Variational Message Passing for Continuous Graphical Models 3
Stochastic PCA with $\ell_2$ and $\ell_1$ Regularization 4
Stochastic Proximal Algorithms for AUC Maximization 4
Stochastic Training of Graph Convolutional Networks with Variance Reduction 6
Stochastic Variance-Reduced Cubic Regularized Newton Methods 3
Stochastic Variance-Reduced Hamilton Monte Carlo Methods 3
Stochastic Variance-Reduced Policy Gradient 4
Stochastic Video Generation with a Learned Prior 4
Stochastic Wasserstein Barycenters 2
StrassenNets: Deep Learning with a Multiplication Budget 5
Streaming Principal Component Analysis in Noisy Setting 3
Stronger Generalization Bounds for Deep Nets via a Compression Approach 3
Structured Control Nets for Deep Reinforcement Learning 3
Structured Evolution with Compact Architectures for Scalable Policy Optimization 2
Structured Output Learning with Abstention: Application to Accurate Opinion Prediction 2
Structured Variational Learning of Bayesian Neural Networks with Horseshoe Priors 2
Structured Variationally Auto-encoded Optimization 5
Style Tokens: Unsupervised Style Modeling, Control and Transfer in End-to-End Speech Synthesis 2
Submodular Hypergraphs: p-Laplacians, Cheeger Inequalities and Spectral Clustering 3
Subspace Embedding and Linear Regression with Orlicz Norm 3
Synthesizing Programs for Images using Reinforced Adversarial Learning 1
Synthesizing Robust Adversarial Examples 3
TACO: Learning Task Decomposition via Temporal Alignment for Control 0
TAPAS: Tricks to Accelerate (encrypted) Prediction As a Service 5
Tempered Adversarial Networks 2
Temporal Poisson Square Root Graphical Models 1
Testing Sparsity over Known and Unknown Bases 3
The Dynamics of Learning: A Random Matrix Approach 2
The Edge Density Barrier: Computational-Statistical Tradeoffs in Combinatorial Inference 0
The Generalization Error of Dictionary Learning with Moreau Envelopes 0
The Hidden Vulnerability of Distributed Learning in Byzantium 3
The Hierarchical Adaptive Forgetting Variational Filter 2
The Limits of Maxing, Ranking, and Preference Learning 2
The Mechanics of n-Player Differentiable Games 3
The Mirage of Action-Dependent Baselines in Reinforcement Learning 3
The Multilinear Structure of ReLU Networks 0
The Power of Interpolation: Understanding the Effectiveness of SGD in Modern Over-parametrized Learning 3
The Uncertainty Bellman Equation and Exploration 4
The Weighted Kendall and High-order Kernels for Permutations 3
The Well-Tempered Lasso 2
Theoretical Analysis of Image-to-Image Translation with Adversarial Learning 0
Theoretical Analysis of Sparse Subspace Clustering with Missing Entries 1
Thompson Sampling for Combinatorial Semi-Bandits 2
Tight Regret Bounds for Bayesian Optimization in One Dimension 1
Tighter Variational Bounds are Not Necessarily Better 2
Time Limits in Reinforcement Learning 3
To Understand Deep Learning We Need to Understand Kernel Learning 2
Topological mixture estimation 3
Towards Binary-Valued Gates for Robust LSTM Training 5
Towards Black-box Iterative Machine Teaching 2
Towards End-to-End Prosody Transfer for Expressive Speech Synthesis with Tacotron 2
Towards Fast Computation of Certified Robustness for ReLU Networks 4
Towards More Efficient Stochastic Decentralized Learning: Faster Convergence and Sparse Communication 3
Trainable Calibration Measures for Neural Networks from Kernel Mean Embeddings 5
Training Neural Machines with Trace-Based Supervision 2
Transfer Learning via Learning to Transfer 3
Transfer in Deep Reinforcement Learning Using Successor Features and Generalised Policy Improvement 2
Transformation Autoregressive Networks 4
Tree Edit Distance Learning via Adaptive Symbol Embeddings 5
Tropical Geometry of Deep Neural Networks 0
Unbiased Objective Estimation in Predictive Optimization 5
Understanding Generalization and Optimization Performance of Deep CNNs 0
Understanding and Simplifying One-Shot Architecture Search 4
Understanding the Loss Surface of Neural Networks for Binary Classification 0
Universal Planning Networks: Learning Generalizable Representations for Visuomotor Control 1
Using Inherent Structures to design Lean 2-layer RBMs 4
Using Reward Machines for High-Level Task Specification and Decomposition in Reinforcement Learning 4
Variable Selection via Penalized Neural Network: a Drop-Out-One Loss Approach 4
Variance Regularized Counterfactual Risk Minimization via Variational Divergence Minimization 6
Variational Bayesian dropout: pitfalls and fixes 0
Variational Inference and Model Selection with Generalized Evidence Bounds 2
Variational Network Inference: Strong and Stable with Concrete Support 2
Video Prediction with Appearance and Motion Conditions 5
Visualizing and Understanding Atari Agents 3
WHInter: A Working set algorithm for High-dimensional sparse second order Interaction models 4
WSNet: Compact and Efficient Networks Through Weight Sampling 4
Weakly Consistent Optimal Pricing Algorithms in Repeated Posted-Price Auctions with Strategic Buyer 1
Weakly Submodular Maximization Beyond Cardinality Constraints: Does Randomization Help Greedy? 3
Weightless: Lossy weight encoding for deep neural network compression 4
Which Training Methods for GANs do actually Converge? 1
Yes, but Did It Work?: Evaluating Variational Inference 3
oi-VAE: Output Interpretable VAEs for Nonlinear Group Factor Analysis 3
prDeep: Robust Phase Retrieval with a Flexible Deep Network 5
signSGD: Compressed Optimisation for Non-Convex Problems 5