Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).

International Conference on Machine Learning (ICML) - 2016

Figure: Documentation Rate of Empirical Papers by Reproducibility Variable

Figure: Distribution of Empirical Papers by Number of Documented Variables

Venue	Year	Papers	Reproducibility Score	Documentation Score	% Empirical	% Industry	Website
ICML	2016	322	0.36	3.07	93.17%	33.0%	

Reproducibility Score: based on Gundersen et al. (2025). See Methods for details.
Documentation Score: the average score over the seven reproducibility variables for empirical research papers. See Methods for details.
% Empirical: percentage of papers that are empirical research rather than theoretical research.
% Industry: percentage of empirical research papers with at least one author from industry.

Paper	Pseudocode	Open Source Code	Open Datasets	Dataset Splits	Hardware Specification	Software Dependencies	Experiment Setup	Documented Variables
A Box-Constrained Approach for Hard Permutation Problems	✅	❌	✅	❌	❌	✅	✅	4
A Comparative Analysis and Study of Multiview CNN Models for Joint Object Categorization and Pose Estimation	❌	❌	✅	✅	❌	❌	❌	2
A Convex Atomic-Norm Approach to Multiple Sequence Alignment and Motif Discovery	✅	❌	❌	❌	❌	❌	✅	2
A Convolutional Attention Network for Extreme Summarization of Source Code	✅	✅	✅	✅	❌	❌	✅	5
A Deep Learning Approach to Unsupervised Ensemble Learning	❌	✅	✅	❌	❌	❌	✅	3
A Distributed Variational Inference Framework for Unifying Parallel Sparse Gaussian Process Regression Models	❌	❌	✅	❌	✅	❌	✅	3
A Kernel Test of Goodness of Fit	❌	✅	❌	❌	❌	❌	✅	2
A Kernelized Stein Discrepancy for Goodness-of-fit Tests	✅	❌	❌	❌	❌	❌	✅	2
A Kronecker-factored approximate Fisher matrix for convolution layers	❌	❌	✅	❌	✅	❌	✅	3
A Neural Autoregressive Approach to Collaborative Filtering	❌	✅	✅	✅	✅	❌	✅	5
A New PAC-Bayesian Perspective on Domain Adaptation	❌	❌	✅	✅	❌	❌	✅	3
A Random Matrix Approach to Echo-State Neural Networks	❌	❌	✅	❌	❌	❌	✅	2
A Self-Correcting Variable-Metric Algorithm for Stochastic Optimization	✅	✅	✅	❌	✅	✅	✅	6
A Simple and Provable Algorithm for Sparse Diagonal CCA	✅	❌	✅	❌	✅	❌	✅	4
A Simple and Strongly-Local Flow-Based Method for Cut Improvement	✅	❌	✅	❌	❌	❌	❌	2
A Subspace Learning Approach for High Dimensional Matrix Decomposition with Efficient Column/Row Sampling	✅	❌	✅	❌	❌	❌	❌	2
A Superlinearly-Convergent Proximal Newton-type Method for the Optimization of Finite Sums	✅	❌	✅	❌	❌	❌	✅	3
A Theory of Generative ConvNet	✅	✅	✅	❌	❌	❌	✅	4
A Variational Analysis of Stochastic Gradient Algorithms	❌	❌	✅	❌	❌	❌	✅	2
A ranking approach to global optimization	✅	❌	❌	❌	❌	❌	✅	2
ADIOS: Architectures Deep In Output Space	✅	✅	✅	✅	❌	❌	✅	5
Accurate Robust and Efficient Error Estimation for Decision Trees	❌	❌	✅	✅	❌	❌	❌	2
Actively Learning Hemimetrics with Applications to Eliciting User Preferences	✅	❌	✅	❌	❌	❌	✅	3
Adaptive Algorithms for Online Convex Optimization with Long-term Constraints	✅	❌	✅	❌	❌	❌	✅	3
Adaptive Sampling for SGD by Exploiting Side Information	✅	❌	✅	✅	❌	❌	✅	4
Additive Approximations in High Dimensional Nonparametric Regression via the SALSA	❌	✅	✅	✅	❌	❌	✅	4
Algorithms for Optimizing the Ratio of Submodular Functions	✅	❌	❌	❌	❌	❌	✅	2
An optimal algorithm for the Thresholding Bandit Problem	✅	❌	❌	❌	❌	❌	✅	2
Analysis of Deep Neural Networks with Extended Data Jacobian Matrix	❌	❌	✅	✅	❌	❌	✅	3
Analysis of Variational Bayesian Factorizations for Sparse and Low-Rank Estimation	❌	❌	❌	❌	❌	❌	✅	1
Anytime Exploration for Multi-armed Bandits using Confidence Information	✅	❌	✅	❌	❌	❌	✅	3
Anytime optimal algorithms in stochastic multi-armed bandits	✅	❌	❌	❌	❌	❌	✅	2
Ask Me Anything: Dynamic Memory Networks for Natural Language Processing	❌	❌	✅	✅	❌	❌	✅	3
Associative Long Short-Term Memory	❌	❌	✅	❌	❌	❌	✅	2
Asymmetric Multi-task Learning Based on Task Relatedness and Loss	✅	❌	✅	✅	❌	❌	❌	3
Asynchronous Methods for Deep Reinforcement Learning	✅	❌	✅	❌	✅	❌	✅	4
Augmenting Supervised Neural Networks with Unsupervised Objectives for Large-scale Image Classification	❌	✅	✅	✅	✅	❌	✅	5
Autoencoding beyond pixels using a learned similarity metric	✅	✅	✅	❌	❌	❌	✅	4
Automatic Construction of Nonparametric Relational Regression Models for Multiple Time Series	✅	❌	✅	❌	❌	❌	❌	2
Auxiliary Deep Generative Models	❌	✅	✅	✅	✅	❌	✅	5
BASC: Applying Bayesian Optimization to the Search for Global Minima on Potential Energy Surfaces	❌	✅	✅	❌	❌	✅	✅	4
BISTRO: An Efficient Relaxation-Based Method for Contextual Bandits	✅	❌	❌	❌	❌	❌	❌	1
Barron and Cover’s Theory in Supervised Learning and its Application to Lasso	❌	❌	❌	❌	❌	❌	✅	1
Bayesian Poisson Tucker Decomposition for Learning the Structure of International Relations	❌	❌	✅	❌	❌	❌	✅	2
Benchmarking Deep Reinforcement Learning for Continuous Control	❌	✅	✅	❌	❌	❌	✅	3
Beyond CCA: Moment Matching for Multi-View Models	❌	✅	✅	❌	❌	❌	✅	3
Beyond Parity Constraints: Fourier Analysis of Hash Functions for Inference	✅	❌	❌	❌	❌	❌	✅	2
Bidirectional Helmholtz Machines	✅	✅	✅	❌	✅	❌	✅	5
Binary embeddings with structured hashed projections	❌	❌	✅	❌	❌	❌	✅	2
Black-Box Alpha Divergence Minimization	❌	✅	✅	✅	❌	❌	✅	4
Black-box Optimization with a Politician	✅	❌	✅	❌	❌	❌	✅	3
Boolean Matrix Factorization and Noisy Completion via Message Passing	✅	✅	✅	❌	❌	❌	✅	4
Bounded Off-Policy Evaluation with Missing Data for Course Recommendation and Curriculum Design	✅	❌	❌	❌	❌	❌	✅	2
Clustering High Dimensional Categorical Data via Topographical Features	✅	❌	✅	❌	❌	❌	✅	3
Collapsed Variational Inference for Sum-Product Networks	✅	❌	✅	✅	✅	❌	✅	5
Community Recovery in Graphs with Locality	✅	❌	✅	❌	✅	❌	✅	4
Complex Embeddings for Simple Link Prediction	❌	❌	✅	✅	❌	❌	✅	3
Compressive Spectral Clustering	✅	✅	✅	❌	✅	✅	✅	6
Computationally Efficient Nyström Approximation using Fast Transforms	✅	❌	✅	❌	❌	❌	✅	3
Conditional Bernoulli Mixtures for Multi-label Classification	✅	✅	✅	✅	❌	❌	✅	5
Conditional Dependence via Shannon Capacity: Axioms, Estimators and Applications	❌	❌	✅	❌	❌	❌	✅	2
Conservative Bandits	✅	❌	❌	❌	❌	❌	✅	2
Contextual Combinatorial Cascading Bandits	✅	❌	✅	❌	❌	❌	✅	3
Continuous Deep Q-Learning with Model-based Acceleration	✅	❌	❌	❌	❌	❌	✅	2
Control of Memory, Active Perception, and Action in Minecraft	❌	❌	❌	❌	❌	❌	✅	1
Controlling the distance to a Kemeny consensus without computing it	✅	❌	✅	❌	❌	❌	❌	2
Convergence of Stochastic Gradient Descent for PCA	✅	❌	❌	❌	❌	❌	❌	1
Convolutional Rectifier Networks as Generalized Tensor Decompositions	❌	❌	❌	❌	❌	❌	❌	0
Copeland Dueling Bandit Problem: Regret Lower Bound, Optimal Algorithm, and Computationally Efficient Algorithm	✅	❌	✅	❌	❌	❌	✅	3
Correcting Forecasts with Multifactor Neural Attention	❌	❌	❌	✅	❌	❌	✅	2
Correlation Clustering and Biclustering with Locally Bounded Errors	✅	❌	❌	❌	❌	❌	❌	1
Cross-Graph Learning of Multi-Relational Associations	✅	❌	✅	✅	❌	❌	✅	4
CryptoNets: Applying Neural Networks to Encrypted Data with High Throughput and Accuracy	❌	❌	✅	❌	✅	❌	✅	3
Cumulative Prospect Theory Meets Reinforcement Learning: Prediction and Control	✅	✅	❌	❌	❌	❌	✅	3
DCM Bandits: Learning to Rank with Multiple Clicks	✅	❌	✅	❌	❌	❌	❌	2
DR-ABC: Approximate Bayesian Computation with Kernel-Based Distribution Regression	✅	✅	✅	✅	❌	❌	✅	5
Data-Efficient Off-Policy Policy Evaluation for Reinforcement Learning	✅	❌	✅	❌	❌	❌	❌	2
Data-driven Rank Breaking for Efficient Rank Aggregation	❌	❌	✅	❌	❌	❌	✅	2
Dealbreaker: A Nonlinear Latent Variable Model for Educational Data	❌	❌	✅	❌	✅	❌	✅	3
Deconstructing the Ladder Network Architecture	❌	❌	✅	❌	❌	❌	✅	2
Deep Gaussian Processes for Regression using Approximate Expectation Propagation	❌	✅	✅	❌	❌	❌	✅	3
Deep Speech 2: End-to-End Speech Recognition in English and Mandarin	❌	❌	✅	❌	✅	❌	✅	3
Deep Structured Energy Based Models for Anomaly Detection	❌	❌	✅	✅	❌	❌	❌	2
Dictionary Learning for Massive Matrix Factorization	✅	✅	✅	✅	✅	❌	✅	6
Differential Geometric Regularization for Supervised Learning of Classifiers	✅	❌	✅	✅	❌	❌	✅	4
Differentially Private Chi-Squared Hypothesis Testing: Goodness of Fit and Independence Testing	✅	❌	❌	❌	❌	❌	✅	2
Differentially Private Policy Evaluation	✅	❌	❌	❌	❌	❌	✅	2
Dirichlet Process Mixture Model for Correcting Technical Variation in Single-Cell Gene Expression Data	❌	❌	✅	❌	❌	❌	✅	2
Discrete Deep Feature Extraction: A Theory and New Architectures	❌	✅	✅	✅	❌	❌	✅	4
Discrete Distribution Estimation under Local Privacy	❌	❌	❌	❌	❌	❌	✅	1
Discriminative Embeddings of Latent Variable Models for Structured Data	✅	✅	✅	✅	✅	❌	❌	5
Distributed Clustering of Linear Bandits in Peer to Peer Networks	✅	❌	❌	❌	❌	❌	❌	1
Diversity-Promoting Bayesian Learning of Latent Variable Models	❌	❌	✅	✅	❌	❌	✅	3
Domain Adaptation with Conditional Transferable Components	❌	❌	✅	✅	❌	❌	✅	3
Doubly Decomposing Nonparametric Tensor Regression	✅	❌	✅	❌	❌	❌	✅	3
Doubly Robust Off-policy Value Evaluation for Reinforcement Learning	❌	❌	✅	✅	❌	❌	✅	3
Dropout as a Bayesian Approximation: Representing Model Uncertainty in Deep Learning	❌	✅	✅	✅	❌	❌	✅	4
Dropout distillation	✅	❌	✅	❌	❌	❌	✅	3
Dueling Network Architectures for Deep Reinforcement Learning	✅	❌	✅	❌	❌	❌	✅	3
Dynamic Capacity Networks	❌	❌	✅	✅	✅	❌	✅	4
Dynamic Memory Networks for Visual and Textual Question Answering	✅	❌	✅	✅	❌	❌	✅	4
Early and Reliable Event Detection Using Proximity Space Representation	✅	✅	✅	✅	✅	❌	✅	6
Efficient Algorithms for Adversarial Contextual Learning	✅	❌	❌	❌	❌	❌	❌	1
Efficient Algorithms for Large-scale Generalized Eigenvector Computation and Canonical Correlation Analysis	✅	❌	✅	❌	❌	❌	✅	3
Efficient Learning with a Family of Nonconvex Regularizers by Redistributing Nonconvexity	✅	❌	✅	✅	✅	❌	✅	5
Efficient Multi-Instance Learning for Activity Recognition from Time Series Data Using an Auto-Regressive Hidden Markov Model	❌	❌	✅	✅	❌	❌	✅	3
Efficient Private Empirical Risk Minimization for High-dimensional Learning	✅	❌	❌	❌	❌	❌	❌	1
Energetic Natural Gradient Descent	❌	❌	✅	❌	❌	❌	✅	2
Ensuring Rapid Mixing and Low Bias for Asynchronous Gibbs Sampling	✅	❌	✅	❌	✅	❌	✅	4
Epigraph projections for fast general convex programming	✅	✅	✅	❌	❌	❌	❌	3
Estimating Accuracy from Unlabeled Data: A Bayesian Approach	❌	✅	✅	✅	❌	❌	✅	4
Estimating Cosmological Parameters from the Dark Matter Distribution	❌	❌	❌	✅	✅	❌	✅	3
Estimating Maximum Expected Value through Gaussian Approximation	✅	❌	❌	❌	❌	❌	✅	2
Estimating Structured Vector Autoregressive Models	❌	❌	✅	✅	❌	❌	✅	3
Estimation from Indirect Supervision with Linear Moments	❌	❌	✅	❌	❌	❌	✅	2
Evasion and Hardening of Tree Ensemble Classifiers	✅	❌	✅	❌	✅	✅	✅	5
Even Faster Accelerated Coordinate Descent Using Non-Uniform Sampling	✅	❌	✅	❌	❌	❌	❌	2
Exact Exponent in Optimal Rates for Crowdsourcing	❌	❌	❌	❌	❌	❌	❌	0
Experimental Design on a Budget for Sparse Linear Models and Applications	✅	✅	✅	❌	✅	❌	✅	5
Exploiting Cyclic Symmetry in Convolutional Neural Networks	❌	✅	✅	✅	❌	❌	✅	4
Expressiveness of Rectifier Networks	❌	❌	❌	✅	❌	❌	✅	2
Extended and Unscented Kitchen Sinks	❌	❌	✅	✅	❌	❌	✅	3
Extreme F-measure Maximization using Sparse Probability Estimates	✅	✅	✅	✅	❌	❌	✅	5
Factored Temporal Sigmoid Belief Networks for Sequence Learning	❌	❌	✅	❌	❌	❌	✅	2
False Discovery Rate Control and Statistical Quality Assessment of Annotators in Crowdsourced Ranking	✅	✅	✅	❌	❌	❌	✅	4
Fast Algorithms for Segmented Regression	✅	❌	❌	❌	✅	✅	✅	4
Fast Constrained Submodular Maximization: Personalized Data Summarization	✅	❌	✅	❌	❌	❌	✅	3
Fast DPP Sampling for Nystrom with Application to Kernel Methods	✅	❌	✅	✅	❌	❌	✅	4
Fast Parameter Inference in Nonlinear Dynamical Systems using Iterative Gradient Matching	❌	❌	✅	✅	❌	❌	✅	3
Fast Rate Analysis of Some Stochastic Optimization Algorithms	❌	❌	❌	❌	❌	❌	✅	1
Fast Stochastic Algorithms for SVD and PCA: Convergence Properties and Convexity	✅	❌	❌	❌	❌	❌	❌	1
Fast k-Nearest Neighbour Search via Dynamic Continuous Indexing	✅	❌	✅	❌	❌	❌	✅	3
Fast k-means with accurate bounds	❌	✅	✅	❌	✅	❌	❌	3
Fast methods for estimating the Numerical rank of large matrices	✅	✅	✅	❌	✅	❌	❌	4
Faster Convex Optimization: Simulated Annealing with an Efficient Universal Barrier	✅	❌	❌	❌	❌	❌	❌	1
Faster Eigenvector Computation via Shift-and-Invert Preconditioning	❌	❌	❌	❌	❌	❌	❌	0
Fixed Point Quantization of Deep Convolutional Networks	❌	❌	✅	❌	❌	❌	✅	2
ForecastICU: A Prognostic Decision Support System for Timely Prediction of Intensive Care Unit Admission	❌	❌	❌	✅	❌	❌	✅	2
From Softmax to Sparsemax: A Sparse Model of Attention and Multi-Label Classification	✅	❌	✅	✅	❌	❌	✅	4
Gaussian process nonparametric tensor estimator and its minimax optimality	❌	❌	✅	❌	❌	❌	✅	2
Gaussian quadrature for matrix inverse forms with applications	✅	❌	✅	❌	❌	❌	✅	3
Generalization Properties and Implicit Regularization for Multiple Passes SGM	✅	✅	✅	✅	✅	❌	✅	6
Generalization and Exploration via Randomized Value Functions	✅	❌	❌	❌	❌	❌	✅	2
Generalized Direct Change Estimation in Ising Model Structure	✅	❌	❌	❌	❌	❌	✅	2
Generative Adversarial Text to Image Synthesis	✅	❌	✅	✅	❌	❌	✅	4
Geometric Mean Metric Learning	✅	❌	✅	✅	✅	✅	✅	6
Gossip Dual Averaging for Decentralized Optimization of Pairwise Functions	✅	❌	✅	❌	❌	❌	✅	3
Graying the black box: Understanding DQNs	❌	❌	✅	❌	❌	❌	✅	2
Greedy Column Subset Selection: New Bounds and Distributed Algorithms	✅	❌	✅	❌	❌	❌	✅	3
Gromov-Wasserstein Averaging of Kernel and Distance Matrices	✅	✅	✅	✅	❌	❌	✅	5
Group Equivariant Convolutional Networks	❌	❌	✅	✅	❌	❌	✅	3
Guided Cost Learning: Deep Inverse Optimal Control via Policy Optimization	✅	❌	❌	❌	❌	❌	✅	2
Hawkes Processes with Stochastic Excitations	✅	❌	❌	❌	❌	❌	❌	1
Heteroscedastic Sequences: Beyond Gaussianity	✅	❌	❌	❌	❌	❌	✅	2
Hierarchical Compound Poisson Factorization	✅	❌	✅	✅	❌	❌	✅	4
Hierarchical Decision Making In Electricity Grid Management	✅	✅	✅	❌	❌	❌	✅	4
Hierarchical Span-Based Conditional Random Fields for Labeling and Segmenting Events in Wearable Sensor Data Streams	✅	❌	✅	✅	❌	❌	✅	4
Hierarchical Variational Models	✅	✅	✅	❌	❌	❌	✅	4
Horizontally Scalable Submodular Maximization	✅	❌	✅	❌	❌	❌	✅	3
How to Fake Multiply by a Gaussian Matrix	✅	✅	✅	✅	❌	❌	✅	5
Hyperparameter optimization with approximate gradient	✅	✅	✅	✅	❌	❌	✅	5
Importance Sampling Tree for Large-scale Empirical Expectation	❌	❌	✅	✅	❌	❌	✅	3
Improved SVRG for Non-Strongly-Convex or Sum-of-Non-Convex Objectives	✅	❌	✅	❌	❌	❌	✅	3
Inference Networks for Sequential Monte Carlo in Graphical Models	❌	❌	✅	❌	❌	❌	❌	1
Interacting Particle Markov Chain Monte Carlo	✅	✅	❌	❌	❌	❌	✅	3
Interactive Bayesian Hierarchical Clustering	❌	❌	✅	❌	❌	❌	✅	2
Isotonic Hawkes Processes	✅	❌	✅	❌	❌	❌	❌	2
K-Means Clustering with Distributed Dimensions	✅	❌	✅	❌	❌	❌	✅	3
L1-regularized Neural Networks are Improperly Learnable in Polynomial Time	✅	❌	✅	✅	❌	❌	✅	4
Large-Margin Softmax Loss for Convolutional Neural Networks	❌	❌	✅	✅	❌	❌	✅	3
Learning Convolutional Neural Networks for Graphs	✅	❌	✅	✅	✅	❌	✅	5
Learning End-to-end Video Classification with Rank-Pooling	❌	❌	✅	✅	✅	❌	✅	4
Learning Granger Causality for Hawkes Processes	✅	❌	✅	❌	❌	❌	✅	3
Learning Mixtures of Plackett-Luce Models	✅	❌	❌	❌	✅	✅	✅	4
Learning Physical Intuition of Block Towers by Example	❌	✅	❌	✅	❌	❌	✅	3
Learning Population-Level Diffusions with Generative RNNs	❌	✅	✅	❌	❌	❌	✅	3
Learning Representations for Counterfactual Inference	✅	❌	✅	✅	❌	❌	✅	4
Learning Simple Algorithms from Examples	❌	❌	❌	❌	❌	❌	✅	1
Learning Sparse Combinatorial Representations via Two-stage Submodular Maximization	✅	❌	✅	❌	❌	❌	✅	3
Learning and Inference via Maximum Inner Product Search	✅	❌	✅	❌	❌	❌	✅	3
Learning from Multiway Data: Simple and Efficient Tensor Regression	✅	❌	✅	✅	❌	❌	✅	4
Learning privately from multiparty data	✅	❌	✅	❌	❌	❌	✅	3
Learning to Filter with Predictive State Inference Machines	✅	❌	❌	✅	❌	❌	✅	3
Learning to Generate with Memory	❌	✅	✅	✅	❌	❌	✅	4
Linking losses for density ratio and class-probability estimation	❌	✅	✅	✅	❌	❌	✅	4
Loss factorization, weakly supervised learning and label noise robustness	✅	❌	❌	✅	❌	❌	✅	3
Low-Rank Matrix Approximation with Stability	✅	✅	✅	❌	❌	❌	✅	4
Low-rank Solutions of Linear Matrix Equations via Procrustes Flow	✅	❌	❌	❌	❌	❌	❌	1
Low-rank tensor completion: a Riemannian manifold preconditioning approach	❌	✅	✅	✅	✅	❌	✅	5
Markov Latent Feature Models	✅	❌	✅	✅	❌	❌	✅	4
Markov-modulated Marked Poisson Processes for Check-in Data	❌	❌	✅	❌	❌	❌	✅	2
Matrix Eigen-decomposition via Doubly Stochastic Riemannian Optimization	✅	❌	✅	❌	❌	❌	✅	3
Meta-Learning with Memory-Augmented Neural Networks	❌	❌	✅	❌	❌	❌	✅	2
Metadata-conscious anonymous messaging	✅	❌	✅	❌	❌	❌	✅	3
Meta–Gradient Boosted Decision Tree Model for Weight and Target Learning	✅	❌	❌	✅	❌	❌	✅	3
Minding the Gaps for Block Frank-Wolfe Optimization of Structured SVMs	✅	✅	✅	❌	❌	❌	✅	4
Minimizing the Maximal Loss: How and Why	✅	❌	❌	❌	❌	❌	✅	2
Minimum Regret Search for Single- and Multi-Task Optimization	❌	✅	❌	❌	❌	❌	✅	2
Mixing Rates for the Alternating Gibbs Sampler over Restricted Boltzmann Machines and Friends	❌	❌	❌	❌	❌	❌	❌	0
Mixture Proportion Estimation via Kernel Embeddings of Distributions	✅	✅	✅	❌	❌	❌	✅	4
Model-Free Imitation Learning with Policy Optimization	✅	❌	✅	❌	❌	❌	❌	2
Model-Free Trajectory Optimization for Reinforcement Learning	✅	❌	❌	❌	❌	❌	✅	2
Multi-Bias Non-linear Activation in Deep Neural Networks	❌	✅	✅	✅	❌	❌	✅	4
Multi-Player Bandits – a Musical Chairs Approach	✅	❌	❌	❌	❌	❌	✅	2
Near Optimal Behavior via Approximate State Abstraction	❌	✅	❌	❌	❌	❌	✅	2
Network Morphism	✅	❌	✅	✅	❌	❌	✅	4
Neural Variational Inference for Text Processing	❌	❌	✅	✅	❌	❌	✅	3
No Oops, You Won’t Do It Again: Mechanisms for Self-correction in Crowdsourcing	✅	❌	✅	✅	❌	❌	✅	4
No penalty no tears: Least squares in high-dimensional linear models	✅	❌	❌	✅	❌	❌	✅	3
No-Regret Algorithms for Heavy-Tailed Linear Bandits	✅	❌	❌	❌	❌	❌	✅	2
Noisy Activation Functions	✅	✅	✅	❌	❌	❌	✅	4
Non-negative Matrix Factorization under Heavy Noise	✅	❌	✅	❌	✅	❌	✅	4
Nonlinear Statistical Learning with Truncated Gaussian Graphical Models	✅	❌	✅	✅	❌	❌	✅	4
Nonparametric Canonical Correlation Analysis	✅	❌	✅	✅	✅	❌	✅	5
Normalization Propagation: A Parametric Technique for Removing Internal Covariate Shift in Deep Networks	❌	❌	✅	✅	✅	❌	✅	4
On Graduated Optimization for Stochastic Non-Convex Problems	✅	❌	✅	❌	❌	❌	✅	3
On collapsed representation of hierarchical Completely Random Measures	❌	❌	✅	❌	❌	❌	✅	2
On the Analysis of Complex Backup Strategies in Monte Carlo Tree Search	✅	❌	✅	❌	❌	❌	✅	3
On the Consistency of Feature Selection With Lasso for Non-linear Targets	❌	❌	❌	❌	❌	❌	✅	1
On the Iteration Complexity of Oblivious First-Order Optimization Algorithms	✅	❌	❌	❌	❌	❌	❌	1
On the Power and Limits of Distance-Based Learning	❌	❌	❌	❌	❌	❌	❌	0
On the Quality of the Initial Basin in Overspecified Neural Networks	❌	❌	❌	❌	❌	❌	❌	0
On the Statistical Limits of Convex Relaxations	❌	❌	❌	❌	❌	❌	❌	0
One-Shot Generalization in Deep Generative Models	❌	❌	✅	❌	❌	❌	✅	2
Online Learning with Feedback Graphs Without the Graphs	✅	❌	❌	❌	❌	❌	❌	1
Online Low-Rank Subspace Clustering by Basis Dictionary Pursuit	✅	❌	✅	❌	❌	❌	✅	3
Online Stochastic Linear Optimization under One-bit Feedback	✅	❌	❌	❌	❌	❌	✅	2
Opponent Modeling in Deep Reinforcement Learning	❌	✅	✅	✅	❌	❌	✅	4
Optimal Classification with Multivariate Losses	✅	❌	✅	❌	❌	❌	❌	2
Optimality of Belief Propagation for Crowdsourced Classification	❌	❌	✅	❌	❌	❌	✅	2
PAC Lower Bounds and Efficient Algorithms for The Max K-Armed Bandit Problem	✅	❌	❌	❌	❌	❌	❌	1
PAC learning of Probabilistic Automaton based on the Method of Moments	✅	❌	✅	❌	❌	✅	✅	4
PD-Sparse: A Primal and Dual Sparse Approach to Extreme Multiclass and Multilabel Classification	✅	❌	✅	❌	❌	❌	✅	3
PHOG: Probabilistic Model for Code	❌	❌	✅	❌	✅	❌	✅	3
Parallel and Distributed Block-Coordinate Frank-Wolfe Algorithms	✅	❌	✅	❌	✅	❌	✅	4
Parameter Estimation for Generalized Thurstone Choice Models	❌	❌	❌	❌	❌	❌	✅	1
Pareto Frontier Learning with Expensive Correlated Objectives	❌	❌	✅	❌	❌	❌	✅	2
Partition Functions from Rao-Blackwellized Tempered Sampling	✅	❌	✅	❌	❌	❌	✅	3
Persistence weighted Gaussian kernel for topological data analysis	❌	❌	❌	✅	❌	❌	✅	2
Persistent RNNs: Stashing Recurrent Weights On-Chip	❌	✅	❌	❌	✅	❌	✅	3
Pixel Recurrent Neural Networks	❌	❌	✅	✅	❌	❌	✅	3
Pliable Rejection Sampling	✅	❌	✅	❌	❌	❌	✅	3
Polynomial Networks and Factorization Machines: New Insights and Efficient Training Algorithms	✅	❌	✅	✅	❌	❌	✅	4
Power of Ordered Hypothesis Testing	❌	❌	✅	❌	❌	❌	✅	2
Preconditioning Kernel Matrices	✅	✅	✅	✅	✅	❌	✅	6
Predictive Entropy Search for Multi-objective Bayesian Optimization	❌	❌	✅	✅	❌	❌	✅	3
Pricing a Low-regret Seller	✅	❌	❌	❌	❌	❌	✅	2
Primal-Dual Rates and Certificates	✅	❌	✅	❌	❌	❌	✅	3
Principal Component Projection Without Principal Component Analysis	✅	❌	✅	❌	❌	❌	✅	3
Provable Algorithms for Inference in Topic Models	✅	✅	❌	❌	❌	❌	✅	3
Provable Non-convex Phase Retrieval with Outliers: Median Truncated Wirtinger Flow	✅	❌	❌	❌	❌	❌	✅	2
Quadratic Optimization with Orthogonality Constraints: Explicit Lojasiewicz Exponent and Linear Convergence of Line-Search Methods	✅	❌	✅	❌	❌	❌	✅	3
Recommendations as Treatments: Debiasing Learning and Evaluation	❌	✅	✅	✅	❌	❌	✅	4
Recovery guarantee of weighted low-rank approximation via alternating minimization	✅	❌	❌	❌	❌	❌	❌	1
Recurrent Orthogonal Networks and Long-Memory Tasks	❌	❌	✅	❌	❌	❌	✅	2
Recycling Randomness with Structure for Sublinear time Kernel Expansions	❌	❌	✅	❌	✅	✅	✅	4
Representational Similarity Learning with Application to Brain Networks	❌	❌	❌	✅	❌	❌	✅	2
Revisiting Semi-Supervised Learning with Graph Embeddings	✅	❌	✅	❌	❌	❌	✅	3
Rich Component Analysis	✅	❌	✅	❌	❌	❌	✅	3
Robust Monte Carlo Sampling using Riemannian Nosé-Poincaré Hamiltonian Dynamics	✅	❌	✅	✅	❌	❌	✅	4
Robust Principal Component Analysis with Side Information	✅	❌	✅	❌	❌	❌	✅	3
Robust Random Cut Forest Based Anomaly Detection on Streams	✅	❌	✅	✅	❌	❌	✅	4
SDCA without Duality, Regularization, and Individual Convexity	✅	❌	❌	❌	❌	❌	❌	1
SDNA: Stochastic Dual Newton Ascent for Empirical Risk Minimization	✅	❌	✅	❌	❌	❌	✅	3
Scalable Discrete Sampling as a Multi-Armed Bandit Problem	✅	❌	❌	❌	❌	❌	✅	2
Scalable Gradient-Based Tuning of Continuous Regularization Hyperparameters	❌	✅	✅	✅	❌	❌	✅	4
Sequence to Sequence Training of CTC-RNNs with Partial Windowing	✅	❌	✅	✅	✅	❌	✅	5
Shifting Regret, Mirror Descent, and Matrices	✅	❌	❌	❌	❌	❌	❌	1
Simultaneous Safe Screening of Features and Samples in Doubly Sparse Modeling	❌	✅	✅	❌	✅	❌	✅	4
Slice Sampling on Hamiltonian Trajectories	✅	❌	✅	❌	❌	❌	✅	3
Smooth Imitation Learning for Online Sequence Prediction	✅	✅	✅	❌	❌	❌	❌	3
Softened Approximate Policy Iteration for Markov Games	✅	❌	✅	❌	❌	❌	✅	3
Solving Ridge Regression using Sketched Preconditioned SVRG	✅	❌	✅	❌	❌	❌	✅	3
Sparse Nonlinear Regression: Parameter Estimation under Nonconvexity	✅	❌	✅	✅	❌	❌	✅	4
Sparse Parameter Recovery from Aggregated Data	❌	❌	✅	✅	❌	❌	❌	2
Speeding up k-means by approximating Euclidean distances via block vectors	✅	❌	✅	❌	❌	❌	✅	3
Square Root Graphical Models: Multivariate Generalizations of Univariate Exponential Families that Permit Positive Dependencies	❌	❌	✅	❌	✅	❌	✅	3
Stability of Controllers for Gaussian Process Forward Models	✅	❌	❌	❌	❌	❌	✅	2
Starting Small - Learning with Adaptive Sample Sizes	✅	❌	✅	✅	❌	❌	✅	4
Stochastic Block BFGS: Squeezing More Curvature out of Data	✅	✅	✅	❌	❌	❌	✅	4
Stochastic Discrete Clenshaw-Curtis Quadrature	✅	✅	❌	❌	✅	❌	✅	4
Stochastic Optimization for Multiview Representation Learning using Partial Least Squares	✅	❌	✅	✅	❌	❌	✅	4
Stochastic Quasi-Newton Langevin Monte Carlo	✅	❌	✅	✅	✅	❌	✅	5
Stochastic Variance Reduced Optimization for Nonconvex Sparse Learning	✅	❌	✅	❌	❌	❌	✅	3
Stochastic Variance Reduction for Nonconvex Optimization	✅	❌	✅	❌	❌	❌	✅	3
Stochastically Transitive Models for Pairwise Comparisons: Statistical and Computational Issues	❌	❌	❌	❌	❌	❌	❌	0
Stratified Sampling Meets Machine Learning	✅	❌	✅	❌	❌	❌	✅	3
Strongly-Typed Recurrent Neural Networks	❌	❌	✅	✅	✅	❌	✅	4
Structure Learning of Partitioned Markov Networks	❌	❌	✅	❌	❌	❌	❌	1
Structured Prediction Energy Networks	❌	❌	✅	✅	❌	❌	✅	3
Structured and Efficient Variational Deep Learning with Matrix Gaussian Posteriors	❌	❌	✅	✅	❌	❌	✅	3
Supervised and Semi-Supervised Text Categorization using LSTM for Region Embeddings	❌	✅	✅	✅	✅	❌	✅	5
Tensor Decomposition via Joint Matrix Schur Decomposition	❌	❌	✅	❌	❌	❌	❌	1
Texture Networks: Feed-forward Synthesis of Textures and Stylized Images	❌	❌	✅	❌	❌	❌	✅	2
The Arrow of Time in Multivariate Time Series	✅	✅	✅	❌	❌	❌	✅	4
The Information Sieve	❌	✅	✅	❌	❌	❌	✅	3
The Information-Theoretic Requirements of Subspace Clustering with Missing Data	✅	❌	❌	❌	❌	❌	❌	1
The Knowledge Gradient for Sequential Decision Making with Stochastic Binary Feedbacks	✅	❌	✅	❌	❌	❌	✅	3
The Label Complexity of Mixed-Initiative Classifier Training	✅	❌	❌	❌	❌	❌	✅	2
The Segmented iHMM: A Simple, Efficient Hierarchical Infinite HMM	❌	❌	✅	✅	❌	❌	✅	3
The Sum-Product Theorem: A Foundation for Learning Tractable Models	✅	❌	❌	❌	❌	❌	✅	2
The Teaching Dimension of Linear Learners	❌	❌	❌	❌	❌	✅	✅	2
The Variational Nystrom method for large-scale spectral problems	❌	❌	✅	❌	❌	❌	✅	2
The knockoff filter for FDR control in group-sparse and multitask regression	❌	❌	✅	❌	❌	✅	✅	3
Towards Faster Rates and Oracle Property for Low-Rank Matrix Estimation	✅	❌	✅	❌	❌	❌	✅	3
Tracking Slowly Moving Clairvoyant: Optimal Dynamic Regret of Online Learning with True and Noisy Gradient	✅	❌	❌	❌	❌	❌	❌	1
Train and Test Tightness of LP Relaxations in Structured Prediction	❌	❌	✅	✅	❌	❌	❌	2
Train faster, generalize better: Stability of stochastic gradient descent	❌	❌	✅	✅	❌	❌	✅	3
Training Deep Neural Networks via Direct Loss Minimization	✅	❌	✅	✅	❌	❌	✅	4
Training Neural Networks Without Gradients: A Scalable ADMM Approach	✅	❌	✅	✅	✅	❌	✅	5
Truthful Univariate Estimators	❌	❌	❌	❌	❌	❌	❌	0
Understanding and Improving Convolutional Neural Networks via Concatenated Rectified Linear Units	✅	❌	✅	✅	❌	❌	✅	4
Unitary Evolution Recurrent Neural Networks	❌	✅	✅	❌	❌	❌	✅	3
Unsupervised Deep Embedding for Clustering Analysis	❌	✅	✅	❌	❌	❌	✅	3
Uprooting and Rerooting Graphical Models	❌	❌	❌	❌	❌	✅	✅	2
Variable Elimination in the Fourier Domain	❌	❌	✅	❌	❌	❌	✅	2
Variance Reduction for Faster Non-Convex Optimization	✅	❌	✅	✅	❌	❌	✅	4
Variance-Reduced and Projection-Free Stochastic Optimization	✅	❌	✅	❌	❌	❌	✅	3
Variational Inference for Monte Carlo Objectives	✅	❌	✅	✅	❌	❌	❌	3
Why Most Decisions Are Easy in Tetris—And Perhaps in Other Sequential Decision Problems, As Well	❌	❌	❌	❌	❌	❌	✅	1
Why Regularized Auto-Encoders learn Sparse Representation?	❌	❌	✅	❌	❌	❌	✅	2
k-variates++: more pluses in the k-means++	✅	❌	❌	❌	❌	❌	✅	2
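
The final column of the table above is the number of ✅ marks in each row. A minimal Python sketch of that tally follows, assuming the rows are available as tab-separated text in the layout shown here. The names `parse_row` and `summarize` are hypothetical and are not part of the index's own pipeline. The table does not mark which papers are empirical, so the mean computed below runs over whatever rows are passed in and should not be expected to match the reported Documentation Score.

```python
from collections import Counter

# Column order as it appears in the table header.
VARIABLES = [
    "Pseudocode", "Open Source Code", "Open Datasets", "Dataset Splits",
    "Hardware Specification", "Software Dependencies", "Experiment Setup",
]


def parse_row(line):
    """Split one tab-separated row into (title, flags, reported_total)."""
    fields = line.rstrip("\n").split("\t")
    title, marks, total = fields[0], fields[1:8], int(fields[8])
    flags = {name: mark == "✅" for name, mark in zip(VARIABLES, marks)}
    return title, flags, total


def summarize(lines):
    """Return per-variable documentation rate, count histogram, and mean count."""
    rows = [parse_row(line) for line in lines]
    for title, flags, total in rows:
        # The reported total should equal the number of documented variables.
        assert sum(flags.values()) == total, title
    n = len(rows)
    rate = {name: sum(flags[name] for _, flags, _ in rows) / n
            for name in VARIABLES}
    histogram = Counter(total for _, _, total in rows)
    mean_documented = sum(total for _, _, total in rows) / n
    return rate, histogram, mean_documented


# Three rows copied from the table above.
sample = [
    "A Kernel Test of Goodness of Fit\t❌\t✅\t❌\t❌\t❌\t❌\t✅\t2",
    "Compressive Spectral Clustering\t✅\t✅\t✅\t❌\t✅\t✅\t✅\t6",
    "Truthful Univariate Estimators\t❌\t❌\t❌\t❌\t❌\t❌\t❌\t0",
]
rate, histogram, mean_documented = summarize(sample)
```

Running `summarize` over all rows of the table would give the share of listed papers documenting each variable and the distribution of per-paper counts, which correspond loosely to the two charts named near the top of the page (those charts are restricted to empirical papers).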