Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).

International Conference on Machine Learning (ICML) - 2017

[Figure: Documentation Rate of Empirical Papers by Reproducibility Variable]

[Figure: Distribution of Empirical Papers by Number of Documented Variables]

Venue	Year	Papers	Reproducibility Score	Documentation Score	% Empirical	% Industry
ICML	2017	434	0.39	3.15	92.17%	41.25%

Reproducibility Score: based on Gundersen et al. (2025). See Methods for details.
Documentation Score: the average score over the seven reproducibility variables for empirical research papers. See Methods for details.
% Empirical: percentage of papers that are empirical research rather than theoretical research.
% Industry: percentage of empirical research papers with at least one author from industry.
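As a rough illustration of how the summary figures relate to the per-paper rows, the sketch below counts documented variables for a few papers and averages them. It assumes the Documentation Score is the mean number of ✅ marks per empirical paper, which this page does not state in full; the function names are hypothetical, and the three sample rows are copied from the paper table on this page. The last lines back out approximate paper counts from the reported percentages.

```python
# Hypothetical helpers: the page does not publish its own scoring code.
DOCUMENTED = "✅"

def documented_count(marks):
    """Count how many of the seven variables are marked as documented."""
    return sum(1 for mark in marks if mark == DOCUMENTED)

def documentation_score(rows):
    """Mean documented-variable count over the given papers.

    Assumption: every row passed in is an empirical paper.
    """
    counts = [documented_count(marks) for _, marks in rows]
    return sum(counts) / len(counts)

# Three rows copied from the paper table (title, seven marks in column order).
sample_rows = [
    ("A Birth-Death Process for Feature Allocation", "❌❌✅✅❌❌✅"),
    ("Adaptive Neural Networks for Efficient Inference", "✅❌✅✅✅✅✅"),
    ("Active Heteroscedastic Regression", "✅❌✅❌❌❌❌"),
]

sample_score = documentation_score(sample_rows)  # (3 + 6 + 2) / 3

# Back out paper counts from the summary row: 434 papers,
# 92.17% empirical, 41.25% of empirical papers with an industry author.
papers = 434
empirical = round(papers * 92.17 / 100)
industry = round(empirical * 41.25 / 100)

print(f"sample documentation score: {sample_score:.2f}")
print(f"empirical papers: {empirical}, with industry author: {industry}")
```

The table does not mark which papers are empirical, so the reported 3.15 cannot be recomputed from the rows alone; the sketch only shows the shape of the calculation.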

Paper	Pseudocode	Open Source Code	Open Datasets	Dataset Splits	Hardware Specification	Software Dependencies	Experiment Setup	Documented Variables
A Birth-Death Process for Feature Allocation	❌	❌	✅	✅	❌	❌	✅	3
A Closer Look at Memorization in Deep Networks	✅	❌	✅	❌	❌	❌	✅	3
A Distributional Perspective on Reinforcement Learning	✅	❌	✅	❌	❌	❌	✅	3
A Divergence Bound for Hybrids of MCMC and Variational Inference and an Application to Langevin Dynamics and SGVI	❌	❌	✅	❌	❌	❌	✅	2
A Laplacian Framework for Option Discovery in Reinforcement Learning	❌	✅	✅	❌	❌	❌	✅	3
A Richer Theory of Convex Constrained Optimization with Reduced Projections and Improved Rates	✅	❌	✅	❌	❌	❌	✅	3
A Semismooth Newton Method for Fast, Generic Convex Programming	✅	❌	❌	❌	❌	❌	✅	2
A Simple Multi-Class Boosting Framework with Theoretical Guarantees and Empirical Proficiency	❌	❌	✅	✅	❌	❌	✅	3
A Simulated Annealing Based Inexact Oracle for Wasserstein Loss Minimization	✅	❌	✅	❌	✅	❌	✅	4
A Unified Maximum Likelihood Approach for Estimating Symmetric Properties of Discrete Distributions	❌	❌	❌	❌	❌	❌	❌	0
A Unified Variance Reduction-Based Framework for Nonconvex Low-Rank Matrix Recovery	✅	❌	✅	❌	❌	❌	✅	3
A Unified View of Multi-Label Performance Measures	✅	❌	✅	❌	❌	❌	✅	3
Accelerating Eulerian Fluid Simulation With Convolutional Networks	✅	✅	✅	❌	✅	❌	✅	5
Active Heteroscedastic Regression	✅	❌	✅	❌	❌	❌	❌	2
Active Learning for Accurate Estimation of Linear Models	✅	❌	✅	❌	❌	❌	❌	2
Active Learning for Cost-Sensitive Classification	✅	❌	✅	❌	❌	❌	✅	3
Active Learning for Top-$K$ Rank Aggregation from Noisy Comparisons	✅	✅	❌	❌	❌	❌	✅	3
AdaNet: Adaptive Structural Learning of Artificial Neural Networks	✅	❌	✅	✅	❌	❌	✅	4
Adapting Kernel Representations Online Using Submodular Maximization	✅	❌	✅	❌	❌	❌	✅	3
Adaptive Consensus ADMM for Distributed Optimization	✅	❌	✅	❌	✅	❌	✅	4
Adaptive Feature Selection: Computationally Efficient Online Sparse Linear Regression under RIP	✅	❌	❌	❌	❌	❌	❌	1
Adaptive Multiple-Arm Identification	✅	❌	❌	❌	❌	❌	✅	2
Adaptive Neural Networks for Efficient Inference	✅	❌	✅	✅	✅	✅	✅	6
Adaptive Sampling Probabilities for Non-Smooth Optimization	✅	❌	✅	❌	❌	❌	✅	3
Adversarial Feature Matching for Text Generation	❌	❌	✅	✅	✅	❌	✅	4
Adversarial Variational Bayes: Unifying Variational Autoencoders and Generative Adversarial Networks	✅	❌	✅	❌	❌	❌	✅	3
Algebraic Variety Models for High-Rank Matrix Completion	✅	❌	✅	❌	❌	❌	✅	3
Algorithmic Stability and Hypothesis Complexity	❌	❌	❌	❌	❌	❌	❌	0
Algorithms for $\ell_p$ Low-Rank Approximation	✅	❌	✅	❌	❌	❌	❌	2
An Adaptive Test of Independence with Analytic Kernel Embeddings	❌	✅	✅	✅	❌	❌	✅	4
An Alternative Softmax Operator for Reinforcement Learning	✅	❌	✅	❌	❌	❌	✅	3
An Analytical Formula of Population Gradient for two-layered ReLU network and its Applications in Convergence and Critical Point Analysis	❌	✅	✅	❌	❌	❌	❌	2
An Efficient, Sparsity-Preserving, Online Algorithm for Low-Rank Approximation	✅	❌	✅	✅	❌	❌	✅	4
An Infinite Hidden Markov Model With Similarity-Biased Transitions	❌	✅	✅	✅	❌	❌	✅	4
Analogical Inference for Multi-relational Embeddings	❌	✅	✅	✅	❌	❌	✅	4
Analysis and Optimization of Graph Decompositions by Lifted Multicuts	❌	✅	✅	❌	❌	❌	✅	3
Analytical Guarantees on Numerical Precision of Deep Neural Networks	❌	❌	✅	❌	❌	❌	✅	2
Approximate Newton Methods and Their Local Convergence	✅	❌	❌	❌	❌	❌	✅	2
Approximate Steepest Coordinate Descent	✅	❌	✅	❌	❌	❌	✅	3
Asymmetric Tri-training for Unsupervised Domain Adaptation	✅	❌	✅	✅	❌	❌	✅	4
Asynchronous Distributed Variational Gaussian Process for Regression	✅	❌	✅	✅	✅	❌	✅	5
Asynchronous Stochastic Gradient Descent with Delay Compensation	✅	❌	✅	✅	✅	❌	✅	5
Attentive Recurrent Comparators	❌	❌	✅	✅	❌	❌	✅	3
Automated Curriculum Learning for Neural Networks	✅	❌	✅	❌	❌	❌	✅	3
Automatic Discovery of the Statistical Types of Variables in a Dataset	✅	✅	✅	❌	❌	❌	✅	4
Averaged-DQN: Variance Reduction and Stabilization for Deep Reinforcement Learning	✅	✅	✅	❌	❌	❌	✅	4
Axiomatic Attribution for Deep Networks	❌	✅	✅	❌	❌	❌	❌	2
Batched High-dimensional Bayesian Optimization via Structural Kernel Learning	❌	✅	❌	❌	❌	❌	✅	2
Bayesian Boolean Matrix Factorisation	✅	✅	✅	❌	❌	❌	✅	4
Bayesian Models of Data Streams with Hierarchical Power Priors	❌	✅	✅	✅	❌	❌	✅	4
Bayesian Optimization with Tree-structured Dependencies	❌	❌	✅	❌	✅	❌	✅	3
Bayesian inference on random simple graphs with power law degree distributions	❌	❌	✅	✅	❌	❌	✅	3
Being Robust (in High Dimensions) Can Be Practical	✅	❌	✅	❌	✅	❌	✅	4
Beyond Filters: Compact Feature Map for Portable Deep Model	✅	✅	✅	✅	✅	❌	✅	6
Bidirectional Learning for Time-series Models with Hidden Units	✅	❌	✅	❌	❌	❌	✅	3
Boosted Fitted Q-Iteration	✅	❌	✅	❌	❌	❌	✅	3
Bottleneck Conditional Density Estimation	❌	✅	✅	✅	❌	❌	✅	4
Breaking Locality Accelerates Block Gauss-Seidel	✅	❌	✅	❌	✅	✅	✅	5
Canopy Fast Sampling with Cover Trees	❌	❌	✅	❌	✅	❌	✅	3
Capacity Releasing Diffusion for Speed and Locality	✅	❌	✅	❌	❌	❌	❌	2
ChoiceRank: Identifying Preferences from Node Traffic in Networks	✅	✅	✅	❌	✅	❌	✅	5
Clustering High Dimensional Dynamic Data Streams	✅	❌	❌	❌	❌	❌	✅	2
Clustering by Sum of Norms: Stochastic Incremental Algorithm, Convergence and Cluster Recovery	✅	❌	❌	❌	❌	❌	❌	1
Co-clustering through Optimal Transport	✅	❌	✅	❌	❌	❌	✅	3
Cognitive Psychology for Deep Neural Networks: A Shape Bias Case Study	❌	❌	✅	❌	❌	❌	✅	2
Coherence Pursuit: Fast, Simple, and Robust Subspace Recovery	✅	❌	✅	❌	❌	❌	✅	3
Coherent Probabilistic Forecasts for Hierarchical Time Series	✅	❌	✅	✅	❌	❌	✅	4
Collect at Once, Use Effectively: Making Non-interactive Locally Private Learning Possible	✅	❌	❌	❌	❌	❌	❌	1
Combined Group and Exclusive Sparsity for Deep Neural Networks	✅	✅	✅	✅	❌	❌	❌	4
Combining Model-Based and Model-Free Updates for Trajectory-Centric Reinforcement Learning	✅	✅	✅	❌	✅	❌	✅	5
Communication-efficient Algorithms for Distributed Stochastic Principal Component Analysis	✅	❌	❌	❌	❌	❌	✅	2
Composing Tree Graphical Models with Persistent Homology Features for Clustering Mixed-Type Data	✅	❌	✅	❌	✅	❌	✅	4
Compressed Sensing using Generative Models	❌	✅	✅	❌	❌	❌	✅	3
Conditional Accelerated Lazy Stochastic Gradient Descent	✅	❌	❌	❌	❌	❌	✅	2
Conditional Image Synthesis with Auxiliary Classifier GANs	❌	❌	✅	❌	❌	❌	✅	2
Confident Multiple Choice Learning	✅	✅	✅	✅	❌	❌	✅	5
Connected Subgraph Detection with Mirror Descent on SDPs	✅	❌	✅	❌	✅	❌	✅	4
Consistency Analysis for Binary Classification Revisited	✅	❌	✅	✅	❌	❌	❌	3
Consistent On-Line Off-Policy Evaluation	✅	❌	✅	❌	❌	❌	✅	3
Consistent k-Clustering	✅	❌	✅	❌	❌	❌	❌	2
Constrained Policy Optimization	✅	✅	❌	❌	❌	❌	❌	2
Contextual Decision Processes with low Bellman rank are PAC-Learnable	✅	❌	❌	❌	❌	❌	❌	1
Continual Learning Through Synaptic Intelligence	❌	❌	✅	✅	❌	❌	✅	3
Convergence Analysis of Proximal Gradient with Momentum for Nonconvex Optimization	✅	❌	❌	❌	❌	❌	✅	2
Convex Phase Retrieval without Lifting via PhaseMax	❌	❌	❌	❌	❌	❌	❌	0
Convexified Convolutional Neural Networks	✅	✅	✅	✅	❌	❌	✅	5
Convolutional Sequence to Sequence Learning	❌	✅	✅	✅	✅	❌	✅	5
Coordinated Multi-Agent Imitation Learning	✅	❌	❌	❌	❌	❌	✅	2
Coresets for Vector Summarization with Applications to Network Graphs	✅	❌	✅	❌	❌	❌	❌	2
Cost-Optimal Learning of Causal Graphs	✅	❌	✅	❌	❌	❌	✅	3
Count-Based Exploration with Neural Density Models	❌	❌	✅	❌	❌	❌	✅	2
Counterfactual Data-Fusion for Online Reinforcement Learners	❌	✅	❌	❌	❌	❌	✅	2
Coupling Distributed and Symbolic Execution for Natural Language Queries	❌	❌	✅	✅	✅	❌	✅	4
Curiosity-driven Exploration by Self-supervised Prediction	❌	❌	✅	❌	❌	❌	❌	1
DARLA: Improving Zero-Shot Transfer in Reinforcement Learning	❌	❌	✅	❌	❌	❌	✅	2
Dance Dance Convolution	❌	✅	✅	✅	✅	❌	✅	5
Data-Efficient Policy Evaluation Through Behavior Policy Search	✅	❌	❌	❌	❌	❌	✅	2
Deciding How to Decide: Dynamic Routing in Artificial Neural Networks	❌	✅	✅	❌	❌	❌	✅	3
Decoupled Neural Interfaces using Synthetic Gradients	❌	❌	✅	✅	❌	❌	✅	3
Deep Bayesian Active Learning with Image Data	❌	✅	✅	✅	❌	❌	✅	4
Deep Decentralized Multi-task Multi-Agent Reinforcement Learning under Partial Observability	❌	❌	✅	❌	❌	❌	✅	2
Deep Generative Models for Relational Data with Side Information	❌	❌	✅	❌	✅	❌	✅	3
Deep IV: A Flexible Approach for Counterfactual Prediction	❌	✅	✅	❌	❌	❌	❌	2
Deep Latent Dirichlet Allocation with Topic-Layer-Adaptive Stochastic Gradient Riemannian MCMC	✅	❌	✅	✅	❌	❌	✅	4
Deep Spectral Clustering Learning	✅	❌	✅	❌	✅	❌	✅	4
Deep Tensor Convolution on Multicores	✅	❌	✅	❌	✅	✅	✅	5
Deep Transfer Learning with Joint Adaptation Networks	❌	✅	✅	✅	❌	❌	✅	4
Deep Value Networks Learn to Evaluate and Iteratively Refine Structured Outputs	✅	✅	✅	✅	❌	❌	✅	5
Deep Voice: Real-time Neural Text-to-Speech	❌	❌	✅	❌	✅	❌	✅	3
DeepBach: a Steerable Model for Bach Chorales Generation	✅	✅	✅	✅	❌	❌	✅	5
Deeply AggreVaTeD: Differentiable Imitation Learning for Sequential Prediction	✅	❌	✅	✅	❌	❌	✅	4
Deletion-Robust Submodular Maximization: Data Summarization with “the Right to be Forgotten”	✅	❌	✅	✅	✅	❌	✅	5
Delta Networks for Optimized Recurrent Network Computation	❌	❌	✅	✅	✅	❌	✅	4
Density Level Set Estimation on Manifolds with DBSCAN	✅	❌	❌	❌	❌	❌	❌	1
Depth-Width Tradeoffs in Approximating Natural Functions with Neural Networks	❌	❌	❌	✅	❌	❌	✅	2
Deriving Neural Architectures from Sequence and Graph Kernels	❌	✅	✅	✅	❌	❌	✅	4
Developing Bug-Free Machine Learning Systems With Formal Mathematics	✅	✅	✅	❌	❌	❌	❌	3
Device Placement Optimization with Reinforcement Learning	❌	❌	✅	✅	✅	❌	✅	4
Diameter-Based Active Learning	✅	❌	❌	❌	❌	❌	✅	2
Dictionary Learning Based on Sparse Distribution Tomography	✅	❌	✅	❌	❌	❌	✅	3
Differentiable Programs with Neural Libraries	✅	❌	✅	❌	❌	❌	✅	3
Differentially Private Chi-squared Test by Unit Circle Mechanism	✅	❌	❌	❌	❌	❌	✅	2
Differentially Private Clustering in High-Dimensional Euclidean Spaces	✅	❌	✅	❌	❌	❌	✅	3
Differentially Private Learning of Undirected Graphical Models Using Collective Graphical Models	✅	❌	❌	❌	❌	❌	❌	1
Differentially Private Ordinary Least Squares	✅	❌	✅	❌	❌	❌	✅	3
Differentially Private Submodular Maximization: Data Summarization in Disguise	✅	❌	✅	❌	❌	❌	✅	3
Discovering Discrete Latent Topics with Neural Variational Inference	✅	❌	✅	❌	❌	❌	✅	3
Dissipativity Theory for Nesterov’s Accelerated Method	❌	❌	❌	❌	❌	❌	❌	0
Distributed Batch Gaussian Process Optimization	✅	❌	✅	❌	✅	❌	✅	4
Distributed Mean Estimation with Limited Communication	❌	❌	✅	❌	❌	❌	✅	2
Distributed and Provably Good Seedings for k-Means in Constant Rounds	✅	❌	❌	❌	❌	❌	❌	1
Doubly Accelerated Methods for Faster CCA and Generalized Eigendecomposition	✅	❌	❌	❌	❌	❌	❌	1
Doubly Greedy Primal-Dual Coordinate Descent for Sparse Empirical Risk Minimization	✅	❌	✅	❌	❌	❌	✅	3
Dropout Inference in Bayesian Neural Networks with Alpha-divergences	✅	❌	✅	❌	❌	❌	✅	3
Dual Iterative Hard Thresholding: From Non-convex Sparse Minimization to Non-smooth Concave Maximization	✅	❌	✅	✅	❌	❌	✅	4
Dual Supervised Learning	✅	❌	✅	✅	✅	❌	✅	5
Dueling Bandits with Weak Regret	✅	❌	✅	❌	❌	❌	✅	3
Dynamic Word Embeddings	❌	❌	✅	❌	❌	❌	✅	2
Efficient Distributed Learning with Sparsity	✅	❌	❌	✅	❌	❌	❌	2
Efficient Nonmyopic Active Search	❌	❌	✅	❌	❌	❌	✅	2
Efficient Online Bandit Multiclass Learning with $\tilde{O}(\sqrt{T})$ Regret	✅	❌	✅	❌	❌	❌	✅	3
Efficient Orthogonal Parametrisation of Recurrent Neural Networks Using Householder Reflections	✅	✅	✅	✅	❌	❌	✅	5
Efficient Regret Minimization in Non-Convex Games	✅	❌	❌	❌	❌	❌	❌	1
Efficient softmax approximation for GPUs	❌	✅	✅	❌	✅	❌	✅	4
Emulating the Expert: Inverse Optimization through Online Learning	✅	❌	❌	❌	✅	✅	❌	3
End-to-End Differentiable Adversarial Imitation Learning	✅	❌	✅	❌	❌	❌	✅	3
End-to-End Learning for Structured Prediction Energy Networks	❌	❌	✅	✅	❌	❌	✅	3
Enumerating Distinct Decision Trees	✅	✅	✅	✅	✅	❌	✅	6
Equivariance Through Parameter-Sharing	❌	❌	❌	❌	❌	❌	❌	0
Estimating individual treatment effect: generalization bounds and algorithms	✅	✅	✅	✅	❌	❌	✅	5
Estimating the unseen from multiple populations	✅	❌	✅	✅	❌	❌	✅	4
Evaluating Bayesian Models with Posterior Dispersion Indices	✅	✅	✅	❌	❌	❌	✅	4
Evaluating the Variance of Likelihood-Ratio Gradient Estimators	✅	❌	✅	✅	✅	❌	✅	5
Exact Inference for Integer Latent-Variable Models	✅	❌	❌	❌	❌	❌	✅	2
Exact MAP Inference by Avoiding Fractional Vertices	✅	❌	❌	❌	❌	❌	✅	2
Exploiting Strong Convexity from Data with Primal-Dual First-Order Algorithms	✅	❌	✅	❌	❌	❌	✅	3
Failures of Gradient-Based Deep Learning	❌	✅	❌	❌	❌	❌	✅	2
Fairness in Reinforcement Learning	❌	❌	❌	❌	❌	❌	❌	0
Fake News Mitigation via Point Process Based Intervention	✅	❌	❌	❌	❌	❌	✅	2
Fast Bayesian Intensity Estimation for the Permanental Process	❌	❌	✅	❌	❌	❌	✅	2
Fast k-Nearest Neighbour Search via Prioritized DCI	✅	❌	✅	✅	❌	❌	✅	4
Faster Greedy MAP Inference for Determinantal Point Processes	✅	✅	✅	❌	✅	❌	✅	5
Faster Principal Component Regression and Stable Matrix Chebyshev Approximation	✅	❌	❌	❌	❌	❌	❌	1
FeUdal Networks for Hierarchical Reinforcement Learning	❌	❌	✅	❌	❌	❌	✅	2
Follow the Compressed Leader: Faster Online Learning of Eigenvectors and Faster MMWU	✅	❌	❌	❌	❌	❌	❌	1
Follow the Moving Leader in Deep Learning	✅	❌	✅	❌	❌	❌	✅	3
Forest-type Regression with General Losses and Robust Forest	✅	❌	✅	✅	❌	❌	✅	4
Forward and Reverse Gradient-Based Hyperparameter Optimization	✅	✅	✅	✅	✅	❌	✅	6
Fractional Langevin Monte Carlo: Exploring Levy Driven Stochastic Differential Equations for Markov Chain Monte Carlo	❌	❌	✅	❌	❌	❌	✅	2
Frame-based Data Factorizations	✅	❌	✅	✅	❌	❌	✅	4
From Patches to Images: A Nonparametric Generative Model	❌	✅	✅	❌	❌	❌	✅	3
GSOS: Gauss-Seidel Operator Splitting Algorithm for Multi-Term Nonsmooth Convex Composite Optimization	✅	❌	❌	❌	❌	❌	✅	2
Generalization and Equilibrium in Generative Adversarial Nets (GANs)	❌	✅	✅	❌	❌	❌	✅	3
Geometry of Neural Network Loss Surfaces via Random Matrix Theory	❌	❌	✅	❌	❌	❌	✅	2
Global optimization of Lipschitz functions	✅	❌	✅	✅	❌	✅	✅	5
Globally Induced Forest: A Prepruning Compression Scheme	✅	✅	✅	✅	❌	✅	✅	6
Globally Optimal Gradient Descent for a ConvNet with Gaussian Inputs	❌	❌	❌	❌	❌	❌	❌	0
Gradient Boosted Decision Trees for High Dimensional Sparse Output	✅	❌	✅	❌	✅	❌	✅	4
Gradient Coding: Avoiding Stragglers in Distributed Learning	✅	❌	✅	❌	✅	❌	❌	3
Gradient Projection Iterative Sketch for Large-Scale Constrained Least-Squares	✅	❌	✅	❌	✅	✅	✅	5
Gram-CTC: Automatic Unit Selection and Target Decomposition for Sequence Labelling	❌	❌	✅	✅	❌	❌	✅	3
Grammar Variational Autoencoder	✅	✅	✅	❌	❌	❌	❌	3
Graph-based Isometry Invariant Representation Learning	❌	❌	✅	✅	✅	❌	✅	4
Guarantees for Greedy Maximization of Non-submodular Functions with Applications	✅	✅	✅	❌	❌	✅	✅	5
Hierarchy Through Composition with Multitask LMDPs	✅	❌	✅	❌	❌	❌	✅	3
High Dimensional Bayesian Optimization with Elastic Gaussian Process	✅	❌	✅	❌	✅	❌	✅	4
High-Dimensional Structured Quantile Regression	❌	❌	❌	❌	❌	❌	✅	1
High-Dimensional Variance-Reduced Stochastic Gradient Expectation-Maximization Algorithm	✅	❌	❌	❌	❌	❌	✅	2
High-dimensional Non-Gaussian Single Index Models via Thresholded Score Function Estimation	❌	❌	❌	❌	❌	❌	✅	1
How Close Are the Eigenvectors of the Sample and Actual Covariance Matrices?	❌	❌	✅	❌	❌	❌	❌	1
How to Escape Saddle Points Efficiently	✅	❌	❌	❌	❌	❌	❌	1
Hyperplane Clustering via Dual Principal Component Pursuit	❌	❌	✅	❌	✅	❌	✅	3
Identification and Model Testing in Linear Structural Equation Models using Auxiliary Variables	✅	❌	❌	❌	❌	❌	❌	1
Identify the Nash Equilibrium in Static Games with Random Payoffs	✅	❌	❌	❌	❌	❌	❌	1
Identifying Best Interventions through Online Importance Sampling	✅	❌	✅	❌	❌	❌	✅	3
Image-to-Markup Generation with Coarse-to-Fine Attention	❌	✅	✅	✅	✅	❌	✅	5
Improved Variational Autoencoders for Text Modeling using Dilated Convolutions	❌	❌	✅	✅	❌	❌	✅	3
Improving Gibbs Sampler Scan Quality with DoGS	✅	❌	✅	❌	✅	❌	✅	4
Improving Stochastic Policy Gradients in Continuous Control with Deep Reinforcement Learning using the Beta Distribution	❌	❌	✅	❌	❌	❌	✅	2
Improving Viterbi is Hard: Better Runtimes Imply Faster Clique Algorithms	✅	❌	❌	❌	❌	❌	❌	1
Innovation Pursuit: A New Approach to the Subspace Clustering Problem	✅	❌	✅	❌	❌	❌	✅	3
Input Convex Neural Networks	❌	✅	✅	❌	❌	❌	✅	3
Input Switched Affine Networks: An RNN Architecture Designed for Interpretability	❌	❌	✅	✅	❌	❌	❌	2
Interactive Learning from Policy-Dependent Human Feedback	✅	❌	❌	❌	❌	❌	✅	2
Iterative Machine Teaching	✅	❌	✅	❌	❌	❌	❌	2
Joint Dimensionality Reduction and Metric Learning: A Geometric Take	❌	✅	✅	❌	❌	❌	✅	3
Just Sort It! A Simple and Effective Approach to Active Preference Learning	✅	✅	✅	❌	✅	❌	❌	4
Kernelized Support Tensor Machines	✅	❌	✅	✅	❌	✅	✅	5
Know-Evolve: Deep Temporal Reasoning for Dynamic Knowledge Graphs	✅	❌	✅	❌	❌	❌	❌	2
Language Modeling with Gated Convolutional Networks	❌	❌	✅	❌	✅	❌	✅	3
Large-Scale Evolution of Image Classifiers	❌	✅	✅	✅	❌	❌	✅	4
Latent Feature Lasso	✅	❌	✅	❌	❌	❌	❌	2
Latent Intention Dialogue Models	❌	✅	✅	✅	❌	❌	✅	4
Latent LSTM Allocation: Joint Clustering and Non-Linear Dynamic Modeling of Sequence Data	✅	❌	✅	❌	❌	❌	✅	3
Lazifying Conditional Gradient Algorithms	✅	❌	❌	❌	❌	✅	❌	2
Learned Optimizers that Scale and Generalize	❌	❌	✅	❌	✅	❌	✅	3
Learning Algorithms for Active Learning	✅	❌	✅	✅	❌	❌	✅	4
Learning Continuous Semantic Representations of Symbolic Expressions	✅	✅	✅	✅	❌	❌	✅	5
Learning Deep Architectures via Generalized Whitened Neural Networks	✅	❌	✅	✅	❌	❌	✅	4
Learning Deep Latent Gaussian Models with Markov Chain Monte Carlo	✅	❌	✅	❌	✅	❌	✅	4
Learning Determinantal Point Processes with Moments and Cycles	✅	❌	❌	❌	❌	❌	❌	1
Learning Discrete Representations via Information Maximizing Self-Augmented Training	❌	✅	✅	❌	❌	❌	✅	3
Learning Gradient Descent: Better Generalization and Longer Horizons	❌	✅	✅	✅	❌	❌	✅	4
Learning Hawkes Processes from Short Doubly-Censored Event Sequences	✅	❌	✅	❌	❌	❌	✅	3
Learning Hierarchical Features from Deep Generative Models	❌	✅	✅	❌	❌	❌	❌	2
Learning Important Features Through Propagating Activation Differences	❌	✅	✅	❌	❌	❌	✅	3
Learning Infinite Layer Networks Without the Kernel Trick	✅	❌	❌	❌	❌	❌	❌	1
Learning Latent Space Models with Angular Constraints	❌	❌	✅	✅	✅	❌	✅	4
Learning Sleep Stages from Radio Signals: A Conditional Adversarial Architecture	✅	❌	✅	✅	❌	❌	❌	3
Learning Stable Stochastic Nonlinear Dynamical Systems	❌	❌	✅	❌	✅	❌	✅	3
Learning Texture Manifolds with the Periodic Spatial GAN	❌	✅	✅	❌	✅	❌	✅	4
Learning from Clinical Judgments: Semi-Markov-Modulated Marked Hawkes Processes for Risk Prognosis	✅	❌	❌	❌	❌	❌	✅	2
Learning in POMDPs with Monte Carlo Tree Search	✅	❌	❌	❌	❌	❌	✅	2
Learning the Structure of Generative Models without Labeled Data	✅	✅	✅	❌	❌	❌	✅	4
Learning to Aggregate Ordinal Labels by Maximizing Separating Width	✅	✅	✅	❌	✅	❌	✅	5
Learning to Align the Source Code to the Compiled Object Code	❌	✅	✅	✅	❌	❌	✅	4
Learning to Detect Sepsis with a Multitask Gaussian Process RNN Classifier	✅	✅	❌	✅	✅	❌	✅	5
Learning to Discover Cross-Domain Relations with Generative Adversarial Networks	❌	❌	✅	❌	✅	❌	✅	3
Learning to Discover Sparse Graphical Models	✅	❌	✅	✅	❌	❌	✅	4
Learning to Generate Long-term Future via Hierarchical Prediction	✅	❌	✅	✅	✅	❌	✅	5
Learning to Learn without Gradient Descent by Gradient Descent	❌	❌	✅	❌	❌	❌	✅	2
Leveraging Node Attributes for Incomplete Relational Data	❌	✅	✅	❌	✅	❌	✅	4
Leveraging Union of Subspace Structure to Improve Constrained Clustering	✅	❌	✅	❌	❌	❌	✅	3
Local Bayesian Optimization of Motor Skills	✅	❌	✅	❌	❌	❌	✅	3
Local-to-Global Bayesian Network Structure Learning	✅	❌	✅	❌	✅	❌	❌	3
Logarithmic Time One-Against-Some	✅	✅	✅	❌	❌	❌	✅	4
Lost Relatives of the Gumbel Trick	✅	✅	✅	❌	❌	❌	✅	4
MEC: Memory-efficient Convolution for Deep Neural Network	✅	❌	✅	❌	✅	❌	✅	4
Magnetic Hamiltonian Monte Carlo	✅	❌	❌	❌	❌	❌	✅	2
Max-value Entropy Search for Efficient Bayesian Optimization	✅	✅	✅	✅	✅	❌	✅	6
Maximum Selection and Ranking under Noisy Comparisons	✅	❌	❌	❌	❌	❌	✅	2
McGan: Mean and Covariance Feature Matching GAN	✅	❌	✅	❌	❌	❌	✅	3
Measuring Sample Quality with Kernels	❌	✅	✅	❌	✅	❌	✅	4
Meritocratic Fairness for Cross-Population Selection	✅	❌	❌	❌	❌	❌	✅	2
Meta Networks	✅	✅	✅	✅	❌	❌	✅	5
Minimax Regret Bounds for Reinforcement Learning	✅	❌	❌	❌	❌	❌	❌	1
Minimizing Trust Leaks for Robust Sybil Detection	❌	❌	✅	❌	❌	❌	✅	2
Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks	✅	✅	✅	✅	❌	❌	✅	5
Model-Independent Online Learning for Influence Maximization	✅	❌	✅	✅	❌	❌	✅	4
Modular Multitask Reinforcement Learning with Policy Sketches	✅	✅	❌	❌	❌	❌	✅	3
Multi-Class Optimal Margin Distribution Machine	✅	❌	✅	✅	✅	✅	✅	6
Multi-fidelity Bayesian Optimisation with Continuous Approximations	✅	❌	✅	✅	❌	❌	✅	4
Multi-objective Bandits: Optimizing the Generalized Gini Index	✅	❌	❌	❌	❌	❌	✅	2
Multi-task Learning with Labeled and Unlabeled Tasks	✅	❌	✅	✅	❌	❌	✅	4
Multichannel End-to-end Speech Recognition	❌	❌	✅	✅	❌	❌	✅	3
Multilabel Classification with Group Testing and Codes	✅	❌	✅	❌	❌	❌	❌	2
Multilevel Clustering via Wasserstein Means	✅	✅	✅	❌	✅	❌	✅	5
Multiple Clustering Views from Multiple Uncertain Experts	❌	❌	✅	✅	❌	❌	✅	3
Multiplicative Normalizing Flows for Variational Bayesian Neural Networks	✅	❌	✅	❌	❌	❌	✅	3
Natasha: Faster Non-Convex Stochastic Optimization via Strongly Non-Convex Parameter	✅	❌	❌	❌	❌	❌	❌	1
Near-Optimal Design of Experiments via Regret Minimization	✅	❌	❌	❌	❌	❌	✅	2
Nearly Optimal Robust Matrix Completion	✅	❌	❌	❌	❌	❌	✅	2
Neural Audio Synthesis of Musical Notes with WaveNet Autoencoders	❌	❌	✅	❌	❌	❌	✅	2
Neural Episodic Control	✅	❌	✅	✅	❌	❌	✅	4
Neural Message Passing for Quantum Chemistry	❌	❌	✅	✅	❌	❌	✅	3
Neural Networks and Rational Functions	❌	❌	❌	❌	❌	❌	❌	0
Neural Optimizer Search with Reinforcement Learning	❌	❌	✅	✅	❌	❌	✅	3
Neural Taylor Approximations: Convergence and Exploration in Rectifier Networks	❌	❌	✅	❌	✅	❌	✅	3
No Spurious Local Minima in Nonconvex Low Rank Problems: A Unified Geometric Analysis	❌	❌	❌	❌	❌	❌	❌	0
Nonnegative Matrix Factorization for Time Series Recovery From a Few Temporal Aggregates	✅	❌	✅	✅	❌	❌	✅	4
Nonparanormal Information Estimation	❌	✅	❌	❌	❌	❌	✅	2
Nyström Method with Kernel K-means++ Samples as Landmarks	❌	❌	✅	❌	❌	❌	✅	2
On Approximation Guarantees for Greedy Low Rank Optimization	✅	❌	✅	❌	❌	❌	❌	2
On Calibration of Modern Neural Networks	❌	❌	✅	✅	❌	❌	❌	2
On Context-Dependent Clustering of Bandits	✅	❌	✅	✅	❌	❌	✅	4
On Kernelized Multi-armed Bandits	✅	❌	✅	❌	❌	❌	✅	3
On Mixed Memberships and Symmetric Nonnegative Matrix Factorizations	✅	❌	✅	❌	❌	❌	✅	3
On Relaxing Determinism in Arithmetic Circuits	❌	❌	❌	❌	❌	❌	❌	0
On The Projection Operator to A Three-view Cardinality Constrained Set	✅	❌	❌	❌	❌	❌	✅	2
On orthogonality and learning recurrent networks with long term dependencies	❌	❌	✅	✅	❌	❌	✅	3
On the Expressive Power of Deep Neural Networks	❌	❌	✅	❌	❌	❌	✅	2
On the Iteration Complexity of Support Recovery via Hard Thresholding Pursuit	❌	❌	❌	❌	❌	❌	✅	1
On the Sampling Problem for Kernel Quadrature	✅	❌	❌	❌	❌	❌	✅	2
Online Learning to Rank in Stochastic Click Models	✅	❌	✅	❌	❌	❌	✅	3
Online Learning with Local Permutations and Delayed Feedback	✅	❌	❌	❌	❌	❌	✅	2
Online Partial Least Square Optimization: Dropping Convexity for Better Efficiency and Scalability	❌	❌	✅	❌	❌	❌	✅	2
Online and Linear-Time Attention by Enforcing Monotonic Alignments	✅	✅	✅	✅	❌	❌	✅	5
OptNet: Differentiable Optimization as a Layer in Neural Networks	❌	✅	❌	❌	✅	❌	✅	3
Optimal Algorithms for Smooth and Strongly Convex Distributed Optimization in Networks	✅	❌	❌	❌	❌	❌	✅	2
Optimal Densification for Fast and Accurate Minwise Hashing	✅	✅	✅	❌	✅	❌	❌	4
Optimal and Adaptive Off-policy Evaluation in Contextual Bandits	❌	❌	✅	✅	❌	❌	✅	3
Oracle Complexity of Second-Order Methods for Finite-Sum Problems	❌	❌	❌	❌	❌	❌	❌	0
Ordinal Graphical Models: A Tale of Two Approaches	❌	❌	✅	✅	❌	❌	✅	3
Orthogonalized ALS: A Theoretically Principled Tensor Decomposition Algorithm for Practical Use	✅	✅	✅	❌	❌	❌	✅	4
Pain-Free Random Differential Privacy with Sensitivity Sampling	✅	❌	❌	❌	❌	❌	✅	2
Parallel Multiscale Autoregressive Density Estimation	❌	❌	✅	✅	✅	❌	✅	4
Parallel and Distributed Thompson Sampling for Large-scale Accelerated Exploration of Chemical Space	✅	❌	✅	❌	❌	❌	✅	3
Parseval Networks: Improving Robustness to Adversarial Examples	✅	❌	✅	✅	❌	❌	✅	4
Partitioned Tensor Factorizations for Learning Mixed Membership Models	✅	✅	✅	❌	✅	❌	✅	5
PixelCNN Models with Auxiliary Variables for Natural Image Modeling	❌	❌	✅	❌	✅	❌	✅	3
Post-Inference Prior Swapping	✅	❌	✅	❌	❌	❌	❌	2
Practical Gauss-Newton Optimisation for Deep Learning	❌	❌	✅	✅	✅	❌	✅	4
Prediction and Control with Temporal Segment Models	❌	❌	✅	❌	❌	❌	✅	2
Prediction under Uncertainty in Sparse Spectrum Gaussian Processes with Applications to Filtering and Control	✅	❌	❌	❌	❌	❌	✅	2
Preferential Bayesian Optimization	✅	❌	✅	❌	❌	❌	✅	3
Priv’IT: Private and Sample Efficient Identity Testing	✅	❌	❌	❌	✅	❌	✅	3
Probabilistic Path Hamiltonian Monte Carlo	✅	✅	✅	❌	❌	✅	✅	5
Probabilistic Submodular Maximization in Sub-Linear Time	✅	❌	✅	❌	✅	❌	✅	4
Programming with a Differentiable Forth Interpreter	✅	❌	✅	❌	❌	❌	❌	2
Projection-free Distributed Online Learning in Networks	✅	❌	✅	❌	❌	❌	✅	3
ProtoNN: Compressed and Accurate kNN for Resource-scarce Devices	✅	✅	✅	✅	✅	❌	✅	6
Provable Alternating Gradient Descent for Non-negative Matrix Factorization with Strong Correlations	✅	✅	✅	❌	❌	❌	✅	4
Provably Optimal Algorithms for Generalized Linear Contextual Bandits	✅	❌	❌	❌	❌	❌	❌	1
Prox-PDA: The Proximal Primal-Dual Algorithm for Fast Distributed Nonconvex Optimization and Learning Over Networks	✅	❌	❌	❌	✅	❌	✅	3
Random Feature Expansions for Deep Gaussian Processes	❌	❌	✅	❌	✅	❌	✅	3
Random Fourier Features for Kernel Ridge Regression: Approximation Bounds and Statistical Guarantees	❌	❌	❌	❌	❌	❌	❌	0
Re-revisiting Learning on Hypergraphs: Confidence Interval and Subgradient Method	✅	❌	✅	❌	❌	❌	✅	3
Real-Time Adaptive Image Compression	❌	❌	✅	❌	✅	❌	✅	3
Recovery Guarantees for One-hidden-layer Neural Networks	✅	❌	❌	❌	❌	❌	✅	2
Recurrent Highway Networks	❌	❌	✅	❌	❌	❌	❌	1
Recursive Partitioning for Personalization using Observational Data	✅	❌	✅	✅	❌	❌	✅	4
Reduced Space and Faster Convergence in Imperfect-Information Games via Pruning	❌	❌	✅	❌	❌	❌	✅	2
Regret Minimization in Behaviorally-Constrained Zero-Sum Games	✅	❌	✅	❌	❌	❌	✅	3
Regularising Non-linear Models Using Feature Side-information	❌	❌	✅	✅	❌	❌	✅	3
Reinforcement Learning with Deep Energy-Based Policies	✅	✅	❌	❌	❌	❌	✅	3
Relative Fisher Information and Natural Gradient for Learning Large Modular Models	❌	✅	✅	❌	❌	❌	✅	3
Resource-efficient Machine Learning in 2 KB RAM for the Internet of Things	✅	✅	✅	✅	✅	❌	✅	6
Risk Bounds for Transferring Representations With and Without Fine-Tuning	❌	❌	✅	❌	❌	❌	✅	2
Robust Adversarial Reinforcement Learning	✅	❌	✅	❌	❌	❌	✅	3
Robust Budget Allocation via Continuous Submodular Functions	✅	✅	✅	❌	❌	✅	✅	5
Robust Gaussian Graphical Model Estimation with Arbitrary Corruption	❌	❌	✅	✅	❌	❌	✅	3
Robust Guarantees of Stochastic Greedy Algorithms	✅	❌	✅	❌	❌	❌	✅	3
Robust Probabilistic Modeling with Bayesian Data Reweighting	❌	❌	✅	❌	❌	❌	✅	2
Robust Structured Estimation with Single-Index Models	❌	❌	❌	❌	❌	❌	✅	1
Robust Submodular Maximization: A Non-Uniform Partitioning Approach	✅	❌	✅	❌	❌	❌	✅	3
RobustFill: Neural Program Learning under Noisy I/O	❌	❌	✅	❌	✅	❌	✅	3
Rule-Enhanced Penalized Regression by Column Generation using Rectangular Maximum Agreement	✅	❌	✅	❌	✅	✅	✅	5
SARAH: A Novel Method for Machine Learning Problems Using Stochastic Recursive Gradient	✅	❌	✅	❌	❌	❌	✅	3
SPLICE: Fully Tractable Hierarchical Extension of ICA with Pooling	❌	❌	✅	✅	❌	❌	✅	3
Safety-Aware Algorithms for Adversarial Contextual Bandit	✅	❌	❌	❌	❌	❌	❌	1
Scalable Bayesian Rule Lists	✅	✅	✅	✅	❌	❌	✅	5
Scalable Generative Models for Multi-label Learning with Missing Labels	❌	❌	✅	❌	✅	❌	✅	3
Scalable Multi-Class Gaussian Process Classification using Expectation Propagation	❌	✅	✅	❌	❌	❌	✅	3
Scaling Up Sparse Support Vector Machines by Simultaneous Feature and Sample Reduction	✅	❌	✅	❌	✅	❌	✅	4
Schema Networks: Zero-shot Transfer with a Generative Causal Model of Intuitive Physics	✅	❌	❌	❌	❌	❌	✅	2
Second-Order Kernel Online Convex Optimization with Adaptive Sketching	✅	❌	❌	❌	❌	❌	❌	1
Selective Inference for Sparse High-Order Interaction Models	❌	❌	✅	❌	❌	❌	✅	2
Self-Paced Co-training	✅	❌	✅	✅	❌	❌	✅	4
Semi-Supervised Classification Based on Classification from Positive and Unlabeled Data	❌	❌	✅	✅	✅	❌	✅	4
Sequence Modeling via Segmentations	✅	❌	✅	❌	❌	❌	✅	3
Sequence Tutor: Conservative Fine-Tuning of Sequence Generation Models with KL-control	❌	✅	✅	❌	❌	❌	✅	3
Sequence to Better Sequence: Continuous Revision of Combinatorial Structures	✅	❌	✅	❌	❌	❌	❌	2
Sharp Minima Can Generalize For Deep Nets	❌	❌	❌	❌	❌	❌	❌	0
Simultaneous Learning of Trees and Representations for Extreme Classification and Density Estimation	✅	✅	✅	✅	❌	❌	✅	5
Sketched Ridge Regression: Optimization Perspective, Statistical Perspective, and Model Averaging	❌	❌	❌	❌	❌	❌	✅	1
Sliced Wasserstein Kernel for Persistence Diagrams	✅	❌	✅	✅	✅	❌	✅	5
Soft-DTW: a Differentiable Loss Function for Time-Series	✅	✅	✅	✅	❌	❌	✅	5
Source-Target Similarity Modelings for Multi-Source Transfer Gaussian Process Regression	❌	❌	✅	❌	❌	❌	✅	2
Sparse + Group-Sparse Dirty Models: Statistical Guarantees without Unreasonable Conditions and a Case for Non-Convexity	❌	❌	✅	✅	❌	❌	✅	3
Spectral Learning from a Single Trajectory under Finite-State Policies	✅	❌	❌	❌	❌	❌	❌	1
Spherical Structured Feature Maps for Kernel Approximation	✅	❌	✅	❌	❌	❌	✅	3
SplitNet: Learning to Semantically Split Deep Networks for Parameter Reduction and Model Parallelization	✅	✅	✅	✅	✅	❌	✅	6
Stabilising Experience Replay for Deep Multi-Agent Reinforcement Learning	❌	❌	❌	❌	❌	❌	✅	1
State-Frequency Memory Recurrent Neural Networks	❌	❌	✅	✅	❌	❌	✅	3
Statistical Inference for Incomplete Ranking Data: The Case of Rank-Dependent Coarsening	❌	❌	✅	❌	❌	❌	❌	1
StingyCD: Safely Avoiding Wasteful Updates in Coordinate Descent	✅	❌	✅	❌	❌	❌	✅	3
Stochastic Adaptive Quasi-Newton Methods for Minimizing Expected Values	✅	❌	❌	❌	✅	✅	✅	4
Stochastic Bouncy Particle Sampler	✅	✅	✅	❌	❌	❌	✅	4
Stochastic Convex Optimization: Faster Local Growth Implies Faster Global Convergence	✅	❌	✅	❌	❌	❌	✅	3
Stochastic DCA for the Large-sum of Non-convex Functions Problem and its Application to Group Variable Selection in Classification	✅	❌	✅	✅	✅	❌	✅	5
Stochastic Generative Hashing	✅	✅	✅	❌	✅	❌	✅	5
Stochastic Gradient MCMC Methods for Hidden Markov Models	✅	❌	✅	❌	❌	❌	✅	3
Stochastic Gradient Monomial Gamma Sampler	✅	❌	✅	✅	❌	❌	✅	4
Stochastic Modified Equations and Adaptive Stochastic Gradient Algorithms	✅	❌	✅	❌	❌	❌	✅	3
Stochastic Variance Reduction Methods for Policy Evaluation	✅	❌	✅	❌	❌	❌	✅	3
Strong NP-Hardness for Sparse Optimization with Concave Penalty Functions	❌	❌	❌	❌	❌	❌	❌	0
Strongly-Typed Agents are Guaranteed to Interact Safely	❌	❌	❌	❌	❌	❌	❌	0
Sub-sampled Cubic Regularization for Non-convex Optimization	✅	❌	✅	❌	❌	❌	❌	2
Tensor Balancing on Statistical Manifold	❌	✅	✅	❌	✅	✅	✅	5
Tensor Belief Propagation	✅	❌	✅	❌	✅	✅	✅	5
Tensor Decomposition via Simultaneous Power Iteration	✅	❌	❌	❌	❌	❌	❌	1
Tensor Decomposition with Smoothness	❌	❌	✅	❌	❌	❌	✅	2
Tensor-Train Recurrent Neural Networks for Video Classification	❌	✅	✅	✅	✅	❌	✅	5
The Loss Surface of Deep and Wide Neural Networks	❌	❌	❌	❌	❌	❌	❌	0
The Predictron: End-To-End Learning and Planning	❌	❌	❌	❌	❌	❌	✅	1
The Price of Differential Privacy for Online Learning	✅	❌	❌	❌	❌	❌	❌	1
The Sample Complexity of Online One-Class Collaborative Filtering	✅	❌	✅	❌	❌	❌	❌	2
The Shattered Gradients Problem: If resnets are the answer, then what is the question?	❌	❌	✅	❌	❌	❌	✅	2
The Statistical Recurrent Unit	❌	✅	✅	✅	❌	❌	✅	4
Theoretical Properties for Neural Networks with Weight Matrices of Low Displacement Rank	❌	❌	❌	❌	❌	❌	❌	0
Tight Bounds for Approximate Carathéodory and Beyond	✅	❌	❌	❌	❌	❌	❌	1
Toward Controlled Generation of Text	✅	❌	✅	✅	❌	❌	✅	4
Toward Efficient and Accurate Covariance Matrix Estimation on Compressed Data	✅	❌	✅	❌	✅	❌	✅	4
Towards K-means-friendly Spaces: Simultaneous Deep Learning and Clustering	✅	✅	✅	❌	❌	❌	✅	4
Tunable Efficient Unitary Neural Networks (EUNN) and their application to RNNs	✅	✅	✅	✅	❌	❌	✅	5
Uncertainty Assessment and False Discovery Rate Control in High-Dimensional Granger Causal Inference	❌	❌	✅	❌	❌	❌	❌	1
Uncorrelation and Evenness: a New Diversity-Promoting Regularizer	❌	❌	✅	✅	❌	❌	✅	3
Uncovering Causality from Multivariate Hawkes Integrated Cumulants	✅	✅	✅	❌	❌	❌	✅	4
Understanding Black-box Predictions via Influence Functions	❌	✅	✅	✅	❌	❌	✅	4
Understanding Synthetic Gradients and Decoupled Neural Interfaces	❌	❌	✅	❌	❌	❌	✅	2
Understanding the Representation and Computation of Multilayer Perceptrons: A Case Study in Speech Recognition	❌	❌	✅	✅	❌	❌	✅	3
Uniform Convergence Rates for Kernel Density Estimation	❌	❌	❌	❌	❌	❌	❌	0
Uniform Deviation Bounds for k-Means Clustering	❌	❌	❌	❌	❌	❌	❌	0
Unifying Task Specification in Reinforcement Learning	✅	❌	❌	❌	❌	❌	✅	2
Unimodal Probability Distributions for Deep Ordinal Classification	❌	✅	✅	✅	❌	❌	✅	4
Unsupervised Learning by Predicting Noise	✅	❌	✅	✅	❌	❌	✅	4
Variants of RMSProp and Adagrad with Logarithmic Regret Bounds	✅	❌	✅	❌	❌	❌	✅	3
Variational Boosting: Iteratively Refining Posterior Approximations	✅	✅	✅	❌	❌	❌	✅	4
Variational Dropout Sparsifies Deep Neural Networks	❌	✅	✅	❌	❌	❌	✅	3
Variational Inference for Sparse and Undirected Models	✅	❌	❌	❌	❌	❌	❌	1
Variational Policy for Guiding Point Processes	✅	❌	✅	❌	❌	❌	✅	3
Video Pixel Networks	❌	❌	✅	✅	❌	❌	✅	3
Warped Convolutions: Efficient Invariance to Spatial Transformations	✅	❌	✅	✅	❌	❌	✅	4
Wasserstein Generative Adversarial Networks	✅	❌	✅	❌	❌	❌	✅	3
When can Multi-Site Datasets be Pooled for Regression? Hypothesis Tests, $\ell_2$-consistency and Neuroscience Applications	❌	✅	✅	❌	❌	❌	✅	3
Why is Posterior Sampling Better than Optimism for Reinforcement Learning?	✅	❌	❌	❌	❌	❌	✅	2
World of Bits: An Open-Domain Platform for Web-Based Agents	❌	❌	❌	❌	❌	❌	✅	1
Zero-Inflated Exponential Family Embeddings	❌	❌	✅	✅	❌	❌	✅	3
Zero-Shot Task Generalization with Multi-Task Deep Reinforcement Learning	✅	❌	❌	❌	❌	❌	✅	2
ZipML: Training Linear Models with End-to-End Low Precision, and a Little Bit of Deep Learning	❌	❌	✅	✅	❌	❌	✅	3
Zonotope Hit-and-run for Efficient Sampling from Projection DPPs	✅	❌	✅	❌	❌	❌	✅	3
iSurvive: An Interpretable, Event-time Prediction Model for mHealth	✅	✅	❌	✅	❌	❌	❌	3
meProp: Sparsified Back Propagation for Accelerated Deep Learning with Reduced Overfitting	❌	❌	✅	✅	✅	❌	✅	4
“Convex Until Proven Guilty”: Dimension-Free Acceleration of Gradient Descent on Non-Convex Functions	✅	❌	✅	❌	❌	❌	✅	3