| Title | 1 | 2 | 3 | 4 | 5 | 6 | 7 | Total |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| A Birth-Death Process for Feature Allocation | ❌ | ❌ | ✅ | ✅ | ❌ | ❌ | ✅ | 3 |
| A Closer Look at Memorization in Deep Networks | ✅ | ❌ | ✅ | ❌ | ❌ | ❌ | ✅ | 3 |
| A Distributional Perspective on Reinforcement Learning | ✅ | ❌ | ✅ | ❌ | ❌ | ❌ | ✅ | 3 |
| A Divergence Bound for Hybrids of MCMC and Variational Inference and an Application to Langevin Dynamics and SGVI | ❌ | ❌ | ✅ | ❌ | ❌ | ❌ | ✅ | 2 |
| A Laplacian Framework for Option Discovery in Reinforcement Learning | ❌ | ✅ | ✅ | ❌ | ❌ | ❌ | ✅ | 3 |
| A Richer Theory of Convex Constrained Optimization with Reduced Projections and Improved Rates | ✅ | ❌ | ✅ | ❌ | ❌ | ❌ | ✅ | 3 |
| A Semismooth Newton Method for Fast, Generic Convex Programming | ✅ | ❌ | ❌ | ❌ | ❌ | ❌ | ✅ | 2 |
| A Simple Multi-Class Boosting Framework with Theoretical Guarantees and Empirical Proficiency | ❌ | ❌ | ✅ | ✅ | ❌ | ❌ | ✅ | 3 |
| A Simulated Annealing Based Inexact Oracle for Wasserstein Loss Minimization | ✅ | ❌ | ✅ | ❌ | ✅ | ❌ | ✅ | 4 |
| A Unified Maximum Likelihood Approach for Estimating Symmetric Properties of Discrete Distributions | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | 0 |
| A Unified Variance Reduction-Based Framework for Nonconvex Low-Rank Matrix Recovery | ✅ | ❌ | ✅ | ❌ | ❌ | ❌ | ✅ | 3 |
| A Unified View of Multi-Label Performance Measures | ✅ | ❌ | ✅ | ❌ | ❌ | ❌ | ✅ | 3 |
| Accelerating Eulerian Fluid Simulation With Convolutional Networks | ✅ | ✅ | ✅ | ❌ | ✅ | ❌ | ✅ | 5 |
| Active Heteroscedastic Regression | ✅ | ❌ | ✅ | ❌ | ❌ | ❌ | ❌ | 2 |
| Active Learning for Accurate Estimation of Linear Models | ✅ | ❌ | ✅ | ❌ | ❌ | ❌ | ❌ | 2 |
| Active Learning for Cost-Sensitive Classification | ✅ | ❌ | ✅ | ❌ | ❌ | ❌ | ✅ | 3 |
| Active Learning for Top-$K$ Rank Aggregation from Noisy Comparisons | ✅ | ✅ | ❌ | ❌ | ❌ | ❌ | ✅ | 3 |
| AdaNet: Adaptive Structural Learning of Artificial Neural Networks | ✅ | ❌ | ✅ | ✅ | ❌ | ❌ | ✅ | 4 |
| Adapting Kernel Representations Online Using Submodular Maximization | ✅ | ❌ | ✅ | ❌ | ❌ | ❌ | ✅ | 3 |
| Adaptive Consensus ADMM for Distributed Optimization | ✅ | ❌ | ✅ | ❌ | ✅ | ❌ | ✅ | 4 |
| Adaptive Feature Selection: Computationally Efficient Online Sparse Linear Regression under RIP | ✅ | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | 1 |
| Adaptive Multiple-Arm Identification | ✅ | ❌ | ❌ | ❌ | ❌ | ❌ | ✅ | 2 |
| Adaptive Neural Networks for Efficient Inference | ✅ | ❌ | ✅ | ✅ | ✅ | ✅ | ✅ | 6 |
| Adaptive Sampling Probabilities for Non-Smooth Optimization | ✅ | ❌ | ✅ | ❌ | ❌ | ❌ | ✅ | 3 |
| Adversarial Feature Matching for Text Generation | ❌ | ❌ | ✅ | ✅ | ✅ | ❌ | ✅ | 4 |
| Adversarial Variational Bayes: Unifying Variational Autoencoders and Generative Adversarial Networks | ✅ | ❌ | ✅ | ❌ | ❌ | ❌ | ✅ | 3 |
| Algebraic Variety Models for High-Rank Matrix Completion | ✅ | ❌ | ✅ | ❌ | ❌ | ❌ | ✅ | 3 |
| Algorithmic Stability and Hypothesis Complexity | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | 0 |
| Algorithms for $\ell_p$ Low-Rank Approximation | ✅ | ❌ | ✅ | ❌ | ❌ | ❌ | ❌ | 2 |
| An Adaptive Test of Independence with Analytic Kernel Embeddings | ❌ | ✅ | ✅ | ✅ | ❌ | ❌ | ✅ | 4 |
| An Alternative Softmax Operator for Reinforcement Learning | ✅ | ❌ | ✅ | ❌ | ❌ | ❌ | ✅ | 3 |
| An Analytical Formula of Population Gradient for two-layered ReLU network and its Applications in Convergence and Critical Point Analysis | ❌ | ✅ | ✅ | ❌ | ❌ | ❌ | ❌ | 2 |
| An Efficient, Sparsity-Preserving, Online Algorithm for Low-Rank Approximation | ✅ | ❌ | ✅ | ✅ | ❌ | ❌ | ✅ | 4 |
| An Infinite Hidden Markov Model With Similarity-Biased Transitions | ❌ | ✅ | ✅ | ✅ | ❌ | ❌ | ✅ | 4 |
| Analogical Inference for Multi-relational Embeddings | ❌ | ✅ | ✅ | ✅ | ❌ | ❌ | ✅ | 4 |
| Analysis and Optimization of Graph Decompositions by Lifted Multicuts | ❌ | ✅ | ✅ | ❌ | ❌ | ❌ | ✅ | 3 |
| Analytical Guarantees on Numerical Precision of Deep Neural Networks | ❌ | ❌ | ✅ | ❌ | ❌ | ❌ | ✅ | 2 |
| Approximate Newton Methods and Their Local Convergence | ✅ | ❌ | ❌ | ❌ | ❌ | ❌ | ✅ | 2 |
| Approximate Steepest Coordinate Descent | ✅ | ❌ | ✅ | ❌ | ❌ | ❌ | ✅ | 3 |
| Asymmetric Tri-training for Unsupervised Domain Adaptation | ✅ | ❌ | ✅ | ✅ | ❌ | ❌ | ✅ | 4 |
| Asynchronous Distributed Variational Gaussian Process for Regression | ✅ | ❌ | ✅ | ✅ | ✅ | ❌ | ✅ | 5 |
| Asynchronous Stochastic Gradient Descent with Delay Compensation | ✅ | ❌ | ✅ | ✅ | ✅ | ❌ | ✅ | 5 |
| Attentive Recurrent Comparators | ❌ | ❌ | ✅ | ✅ | ❌ | ❌ | ✅ | 3 |
| Automated Curriculum Learning for Neural Networks | ✅ | ❌ | ✅ | ❌ | ❌ | ❌ | ✅ | 3 |
| Automatic Discovery of the Statistical Types of Variables in a Dataset | ✅ | ✅ | ✅ | ❌ | ❌ | ❌ | ✅ | 4 |
| Averaged-DQN: Variance Reduction and Stabilization for Deep Reinforcement Learning | ✅ | ✅ | ✅ | ❌ | ❌ | ❌ | ✅ | 4 |
| Axiomatic Attribution for Deep Networks | ❌ | ✅ | ✅ | ❌ | ❌ | ❌ | ❌ | 2 |
| Batched High-dimensional Bayesian Optimization via Structural Kernel Learning | ❌ | ✅ | ❌ | ❌ | ❌ | ❌ | ✅ | 2 |
| Bayesian Boolean Matrix Factorisation | ✅ | ✅ | ✅ | ❌ | ❌ | ❌ | ✅ | 4 |
| Bayesian Models of Data Streams with Hierarchical Power Priors | ❌ | ✅ | ✅ | ✅ | ❌ | ❌ | ✅ | 4 |
| Bayesian Optimization with Tree-structured Dependencies | ❌ | ❌ | ✅ | ❌ | ✅ | ❌ | ✅ | 3 |
| Bayesian inference on random simple graphs with power law degree distributions | ❌ | ❌ | ✅ | ✅ | ❌ | ❌ | ✅ | 3 |
| Being Robust (in High Dimensions) Can Be Practical | ✅ | ❌ | ✅ | ❌ | ✅ | ❌ | ✅ | 4 |
| Beyond Filters: Compact Feature Map for Portable Deep Model | ✅ | ✅ | ✅ | ✅ | ✅ | ❌ | ✅ | 6 |
| Bidirectional Learning for Time-series Models with Hidden Units | ✅ | ❌ | ✅ | ❌ | ❌ | ❌ | ✅ | 3 |
| Boosted Fitted Q-Iteration | ✅ | ❌ | ✅ | ❌ | ❌ | ❌ | ✅ | 3 |
| Bottleneck Conditional Density Estimation | ❌ | ✅ | ✅ | ✅ | ❌ | ❌ | ✅ | 4 |
| Breaking Locality Accelerates Block Gauss-Seidel | ✅ | ❌ | ✅ | ❌ | ✅ | ✅ | ✅ | 5 |
| Canopy Fast Sampling with Cover Trees | ❌ | ❌ | ✅ | ❌ | ✅ | ❌ | ✅ | 3 |
| Capacity Releasing Diffusion for Speed and Locality | ✅ | ❌ | ✅ | ❌ | ❌ | ❌ | ❌ | 2 |
| ChoiceRank: Identifying Preferences from Node Traffic in Networks | ✅ | ✅ | ✅ | ❌ | ✅ | ❌ | ✅ | 5 |
| Clustering High Dimensional Dynamic Data Streams | ✅ | ❌ | ❌ | ❌ | ❌ | ❌ | ✅ | 2 |
| Clustering by Sum of Norms: Stochastic Incremental Algorithm, Convergence and Cluster Recovery | ✅ | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | 1 |
| Co-clustering through Optimal Transport | ✅ | ❌ | ✅ | ❌ | ❌ | ❌ | ✅ | 3 |
| Cognitive Psychology for Deep Neural Networks: A Shape Bias Case Study | ❌ | ❌ | ✅ | ❌ | ❌ | ❌ | ✅ | 2 |
| Coherence Pursuit: Fast, Simple, and Robust Subspace Recovery | ✅ | ❌ | ✅ | ❌ | ❌ | ❌ | ✅ | 3 |
| Coherent Probabilistic Forecasts for Hierarchical Time Series | ✅ | ❌ | ✅ | ✅ | ❌ | ❌ | ✅ | 4 |
| Collect at Once, Use Effectively: Making Non-interactive Locally Private Learning Possible | ✅ | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | 1 |
| Combined Group and Exclusive Sparsity for Deep Neural Networks | ✅ | ✅ | ✅ | ✅ | ❌ | ❌ | ❌ | 4 |
| Combining Model-Based and Model-Free Updates for Trajectory-Centric Reinforcement Learning | ✅ | ✅ | ✅ | ❌ | ✅ | ❌ | ✅ | 5 |
| Communication-efficient Algorithms for Distributed Stochastic Principal Component Analysis | ✅ | ❌ | ❌ | ❌ | ❌ | ❌ | ✅ | 2 |
| Composing Tree Graphical Models with Persistent Homology Features for Clustering Mixed-Type Data | ✅ | ❌ | ✅ | ❌ | ✅ | ❌ | ✅ | 4 |
| Compressed Sensing using Generative Models | ❌ | ✅ | ✅ | ❌ | ❌ | ❌ | ✅ | 3 |
| Conditional Accelerated Lazy Stochastic Gradient Descent | ✅ | ❌ | ❌ | ❌ | ❌ | ❌ | ✅ | 2 |
| Conditional Image Synthesis with Auxiliary Classifier GANs | ❌ | ❌ | ✅ | ❌ | ❌ | ❌ | ✅ | 2 |
| Confident Multiple Choice Learning | ✅ | ✅ | ✅ | ✅ | ❌ | ❌ | ✅ | 5 |
| Connected Subgraph Detection with Mirror Descent on SDPs | ✅ | ❌ | ✅ | ❌ | ✅ | ❌ | ✅ | 4 |
| Consistency Analysis for Binary Classification Revisited | ✅ | ❌ | ✅ | ✅ | ❌ | ❌ | ❌ | 3 |
| Consistent On-Line Off-Policy Evaluation | ✅ | ❌ | ✅ | ❌ | ❌ | ❌ | ✅ | 3 |
| Consistent k-Clustering | ✅ | ❌ | ✅ | ❌ | ❌ | ❌ | ❌ | 2 |
| Constrained Policy Optimization | ✅ | ✅ | ❌ | ❌ | ❌ | ❌ | ❌ | 2 |
| Contextual Decision Processes with low Bellman rank are PAC-Learnable | ✅ | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | 1 |
| Continual Learning Through Synaptic Intelligence | ❌ | ❌ | ✅ | ✅ | ❌ | ❌ | ✅ | 3 |
| Convergence Analysis of Proximal Gradient with Momentum for Nonconvex Optimization | ✅ | ❌ | ❌ | ❌ | ❌ | ❌ | ✅ | 2 |
| Convex Phase Retrieval without Lifting via PhaseMax | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | 0 |
| Convexified Convolutional Neural Networks | ✅ | ✅ | ✅ | ✅ | ❌ | ❌ | ✅ | 5 |
| Convolutional Sequence to Sequence Learning | ❌ | ✅ | ✅ | ✅ | ✅ | ❌ | ✅ | 5 |
| Coordinated Multi-Agent Imitation Learning | ✅ | ❌ | ❌ | ❌ | ❌ | ❌ | ✅ | 2 |
| Coresets for Vector Summarization with Applications to Network Graphs | ✅ | ❌ | ✅ | ❌ | ❌ | ❌ | ❌ | 2 |
| Cost-Optimal Learning of Causal Graphs | ✅ | ❌ | ✅ | ❌ | ❌ | ❌ | ✅ | 3 |
| Count-Based Exploration with Neural Density Models | ❌ | ❌ | ✅ | ❌ | ❌ | ❌ | ✅ | 2 |
| Counterfactual Data-Fusion for Online Reinforcement Learners | ❌ | ✅ | ❌ | ❌ | ❌ | ❌ | ✅ | 2 |
| Coupling Distributed and Symbolic Execution for Natural Language Queries | ❌ | ❌ | ✅ | ✅ | ✅ | ❌ | ✅ | 4 |
| Curiosity-driven Exploration by Self-supervised Prediction | ❌ | ❌ | ✅ | ❌ | ❌ | ❌ | ❌ | 1 |
| DARLA: Improving Zero-Shot Transfer in Reinforcement Learning | ❌ | ❌ | ✅ | ❌ | ❌ | ❌ | ✅ | 2 |
| Dance Dance Convolution | ❌ | ✅ | ✅ | ✅ | ✅ | ❌ | ✅ | 5 |
| Data-Efficient Policy Evaluation Through Behavior Policy Search | ✅ | ❌ | ❌ | ❌ | ❌ | ❌ | ✅ | 2 |
| Deciding How to Decide: Dynamic Routing in Artificial Neural Networks | ❌ | ✅ | ✅ | ❌ | ❌ | ❌ | ✅ | 3 |
| Decoupled Neural Interfaces using Synthetic Gradients | ❌ | ❌ | ✅ | ✅ | ❌ | ❌ | ✅ | 3 |
| Deep Bayesian Active Learning with Image Data | ❌ | ✅ | ✅ | ✅ | ❌ | ❌ | ✅ | 4 |
| Deep Decentralized Multi-task Multi-Agent Reinforcement Learning under Partial Observability | ❌ | ❌ | ✅ | ❌ | ❌ | ❌ | ✅ | 2 |
| Deep Generative Models for Relational Data with Side Information | ❌ | ❌ | ✅ | ❌ | ✅ | ❌ | ✅ | 3 |
| Deep IV: A Flexible Approach for Counterfactual Prediction | ❌ | ✅ | ✅ | ❌ | ❌ | ❌ | ❌ | 2 |
| Deep Latent Dirichlet Allocation with Topic-Layer-Adaptive Stochastic Gradient Riemannian MCMC | ✅ | ❌ | ✅ | ✅ | ❌ | ❌ | ✅ | 4 |
| Deep Spectral Clustering Learning | ✅ | ❌ | ✅ | ❌ | ✅ | ❌ | ✅ | 4 |
| Deep Tensor Convolution on Multicores | ✅ | ❌ | ✅ | ❌ | ✅ | ✅ | ✅ | 5 |
| Deep Transfer Learning with Joint Adaptation Networks | ❌ | ✅ | ✅ | ✅ | ❌ | ❌ | ✅ | 4 |
| Deep Value Networks Learn to Evaluate and Iteratively Refine Structured Outputs | ✅ | ✅ | ✅ | ✅ | ❌ | ❌ | ✅ | 5 |
| Deep Voice: Real-time Neural Text-to-Speech | ❌ | ❌ | ✅ | ❌ | ✅ | ❌ | ✅ | 3 |
| DeepBach: a Steerable Model for Bach Chorales Generation | ✅ | ✅ | ✅ | ✅ | ❌ | ❌ | ✅ | 5 |
| Deeply AggreVaTeD: Differentiable Imitation Learning for Sequential Prediction | ✅ | ❌ | ✅ | ✅ | ❌ | ❌ | ✅ | 4 |
| Deletion-Robust Submodular Maximization: Data Summarization with “the Right to be Forgotten” | ✅ | ❌ | ✅ | ✅ | ✅ | ❌ | ✅ | 5 |
| Delta Networks for Optimized Recurrent Network Computation | ❌ | ❌ | ✅ | ✅ | ✅ | ❌ | ✅ | 4 |
| Density Level Set Estimation on Manifolds with DBSCAN | ✅ | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | 1 |
| Depth-Width Tradeoffs in Approximating Natural Functions with Neural Networks | ❌ | ❌ | ❌ | ✅ | ❌ | ❌ | ✅ | 2 |
| Deriving Neural Architectures from Sequence and Graph Kernels | ❌ | ✅ | ✅ | ✅ | ❌ | ❌ | ✅ | 4 |
| Developing Bug-Free Machine Learning Systems With Formal Mathematics | ✅ | ✅ | ✅ | ❌ | ❌ | ❌ | ❌ | 3 |
| Device Placement Optimization with Reinforcement Learning | ❌ | ❌ | ✅ | ✅ | ✅ | ❌ | ✅ | 4 |
| Diameter-Based Active Learning | ✅ | ❌ | ❌ | ❌ | ❌ | ❌ | ✅ | 2 |
| Dictionary Learning Based on Sparse Distribution Tomography | ✅ | ❌ | ✅ | ❌ | ❌ | ❌ | ✅ | 3 |
| Differentiable Programs with Neural Libraries | ✅ | ❌ | ✅ | ❌ | ❌ | ❌ | ✅ | 3 |
| Differentially Private Chi-squared Test by Unit Circle Mechanism | ✅ | ❌ | ❌ | ❌ | ❌ | ❌ | ✅ | 2 |
| Differentially Private Clustering in High-Dimensional Euclidean Spaces | ✅ | ❌ | ✅ | ❌ | ❌ | ❌ | ✅ | 3 |
| Differentially Private Learning of Undirected Graphical Models Using Collective Graphical Models | ✅ | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | 1 |
| Differentially Private Ordinary Least Squares | ✅ | ❌ | ✅ | ❌ | ❌ | ❌ | ✅ | 3 |
| Differentially Private Submodular Maximization: Data Summarization in Disguise | ✅ | ❌ | ✅ | ❌ | ❌ | ❌ | ✅ | 3 |
| Discovering Discrete Latent Topics with Neural Variational Inference | ✅ | ❌ | ✅ | ❌ | ❌ | ❌ | ✅ | 3 |
| Dissipativity Theory for Nesterov’s Accelerated Method | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | 0 |
| Distributed Batch Gaussian Process Optimization | ✅ | ❌ | ✅ | ❌ | ✅ | ❌ | ✅ | 4 |
| Distributed Mean Estimation with Limited Communication | ❌ | ❌ | ✅ | ❌ | ❌ | ❌ | ✅ | 2 |
| Distributed and Provably Good Seedings for k-Means in Constant Rounds | ✅ | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | 1 |
| Doubly Accelerated Methods for Faster CCA and Generalized Eigendecomposition | ✅ | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | 1 |
| Doubly Greedy Primal-Dual Coordinate Descent for Sparse Empirical Risk Minimization | ✅ | ❌ | ✅ | ❌ | ❌ | ❌ | ✅ | 3 |
| Dropout Inference in Bayesian Neural Networks with Alpha-divergences | ✅ | ❌ | ✅ | ❌ | ❌ | ❌ | ✅ | 3 |
| Dual Iterative Hard Thresholding: From Non-convex Sparse Minimization to Non-smooth Concave Maximization | ✅ | ❌ | ✅ | ✅ | ❌ | ❌ | ✅ | 4 |
| Dual Supervised Learning | ✅ | ❌ | ✅ | ✅ | ✅ | ❌ | ✅ | 5 |
| Dueling Bandits with Weak Regret | ✅ | ❌ | ✅ | ❌ | ❌ | ❌ | ✅ | 3 |
| Dynamic Word Embeddings | ❌ | ❌ | ✅ | ❌ | ❌ | ❌ | ✅ | 2 |
| Efficient Distributed Learning with Sparsity | ✅ | ❌ | ❌ | ✅ | ❌ | ❌ | ❌ | 2 |
| Efficient Nonmyopic Active Search | ❌ | ❌ | ✅ | ❌ | ❌ | ❌ | ✅ | 2 |
| Efficient Online Bandit Multiclass Learning with $\tilde{O}(\sqrt{T})$ Regret | ✅ | ❌ | ✅ | ❌ | ❌ | ❌ | ✅ | 3 |
| Efficient Orthogonal Parametrisation of Recurrent Neural Networks Using Householder Reflections | ✅ | ✅ | ✅ | ✅ | ❌ | ❌ | ✅ | 5 |
| Efficient Regret Minimization in Non-Convex Games | ✅ | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | 1 |
| Efficient softmax approximation for GPUs | ❌ | ✅ | ✅ | ❌ | ✅ | ❌ | ✅ | 4 |
| Emulating the Expert: Inverse Optimization through Online Learning | ✅ | ❌ | ❌ | ❌ | ✅ | ✅ | ❌ | 3 |
| End-to-End Differentiable Adversarial Imitation Learning | ✅ | ❌ | ✅ | ❌ | ❌ | ❌ | ✅ | 3 |
| End-to-End Learning for Structured Prediction Energy Networks | ❌ | ❌ | ✅ | ✅ | ❌ | ❌ | ✅ | 3 |
| Enumerating Distinct Decision Trees | ✅ | ✅ | ✅ | ✅ | ✅ | ❌ | ✅ | 6 |
| Equivariance Through Parameter-Sharing | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | 0 |
| Estimating individual treatment effect: generalization bounds and algorithms | ✅ | ✅ | ✅ | ✅ | ❌ | ❌ | ✅ | 5 |
| Estimating the unseen from multiple populations | ✅ | ❌ | ✅ | ✅ | ❌ | ❌ | ✅ | 4 |
| Evaluating Bayesian Models with Posterior Dispersion Indices | ✅ | ✅ | ✅ | ❌ | ❌ | ❌ | ✅ | 4 |
| Evaluating the Variance of Likelihood-Ratio Gradient Estimators | ✅ | ❌ | ✅ | ✅ | ✅ | ❌ | ✅ | 5 |
| Exact Inference for Integer Latent-Variable Models | ✅ | ❌ | ❌ | ❌ | ❌ | ❌ | ✅ | 2 |
| Exact MAP Inference by Avoiding Fractional Vertices | ✅ | ❌ | ❌ | ❌ | ❌ | ❌ | ✅ | 2 |
| Exploiting Strong Convexity from Data with Primal-Dual First-Order Algorithms | ✅ | ❌ | ✅ | ❌ | ❌ | ❌ | ✅ | 3 |
| Failures of Gradient-Based Deep Learning | ❌ | ✅ | ❌ | ❌ | ❌ | ❌ | ✅ | 2 |
| Fairness in Reinforcement Learning | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | 0 |
| Fake News Mitigation via Point Process Based Intervention | ✅ | ❌ | ❌ | ❌ | ❌ | ❌ | ✅ | 2 |
| Fast Bayesian Intensity Estimation for the Permanental Process | ❌ | ❌ | ✅ | ❌ | ❌ | ❌ | ✅ | 2 |
| Fast k-Nearest Neighbour Search via Prioritized DCI | ✅ | ❌ | ✅ | ✅ | ❌ | ❌ | ✅ | 4 |
| Faster Greedy MAP Inference for Determinantal Point Processes | ✅ | ✅ | ✅ | ❌ | ✅ | ❌ | ✅ | 5 |
| Faster Principal Component Regression and Stable Matrix Chebyshev Approximation | ✅ | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | 1 |
| FeUdal Networks for Hierarchical Reinforcement Learning | ❌ | ❌ | ✅ | ❌ | ❌ | ❌ | ✅ | 2 |
| Follow the Compressed Leader: Faster Online Learning of Eigenvectors and Faster MMWU | ✅ | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | 1 |
| Follow the Moving Leader in Deep Learning | ✅ | ❌ | ✅ | ❌ | ❌ | ❌ | ✅ | 3 |
| Forest-type Regression with General Losses and Robust Forest | ✅ | ❌ | ✅ | ✅ | ❌ | ❌ | ✅ | 4 |
| Forward and Reverse Gradient-Based Hyperparameter Optimization | ✅ | ✅ | ✅ | ✅ | ✅ | ❌ | ✅ | 6 |
| Fractional Langevin Monte Carlo: Exploring Levy Driven Stochastic Differential Equations for Markov Chain Monte Carlo | ❌ | ❌ | ✅ | ❌ | ❌ | ❌ | ✅ | 2 |
| Frame-based Data Factorizations | ✅ | ❌ | ✅ | ✅ | ❌ | ❌ | ✅ | 4 |
| From Patches to Images: A Nonparametric Generative Model | ❌ | ✅ | ✅ | ❌ | ❌ | ❌ | ✅ | 3 |
| GSOS: Gauss-Seidel Operator Splitting Algorithm for Multi-Term Nonsmooth Convex Composite Optimization | ✅ | ❌ | ❌ | ❌ | ❌ | ❌ | ✅ | 2 |
| Generalization and Equilibrium in Generative Adversarial Nets (GANs) | ❌ | ✅ | ✅ | ❌ | ❌ | ❌ | ✅ | 3 |
| Geometry of Neural Network Loss Surfaces via Random Matrix Theory | ❌ | ❌ | ✅ | ❌ | ❌ | ❌ | ✅ | 2 |
| Global optimization of Lipschitz functions | ✅ | ❌ | ✅ | ✅ | ❌ | ✅ | ✅ | 5 |
| Globally Induced Forest: A Prepruning Compression Scheme | ✅ | ✅ | ✅ | ✅ | ❌ | ✅ | ✅ | 6 |
| Globally Optimal Gradient Descent for a ConvNet with Gaussian Inputs | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | 0 |
| Gradient Boosted Decision Trees for High Dimensional Sparse Output | ✅ | ❌ | ✅ | ❌ | ✅ | ❌ | ✅ | 4 |
| Gradient Coding: Avoiding Stragglers in Distributed Learning | ✅ | ❌ | ✅ | ❌ | ✅ | ❌ | ❌ | 3 |
| Gradient Projection Iterative Sketch for Large-Scale Constrained Least-Squares | ✅ | ❌ | ✅ | ❌ | ✅ | ✅ | ✅ | 5 |
| Gram-CTC: Automatic Unit Selection and Target Decomposition for Sequence Labelling | ❌ | ❌ | ✅ | ✅ | ❌ | ❌ | ✅ | 3 |
| Grammar Variational Autoencoder | ✅ | ✅ | ✅ | ❌ | ❌ | ❌ | ❌ | 3 |
| Graph-based Isometry Invariant Representation Learning | ❌ | ❌ | ✅ | ✅ | ✅ | ❌ | ✅ | 4 |
| Guarantees for Greedy Maximization of Non-submodular Functions with Applications | ✅ | ✅ | ✅ | ❌ | ❌ | ✅ | ✅ | 5 |
| Hierarchy Through Composition with Multitask LMDPs | ✅ | ❌ | ✅ | ❌ | ❌ | ❌ | ✅ | 3 |
| High Dimensional Bayesian Optimization with Elastic Gaussian Process | ✅ | ❌ | ✅ | ❌ | ✅ | ❌ | ✅ | 4 |
| High-Dimensional Structured Quantile Regression | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | ✅ | 1 |
| High-Dimensional Variance-Reduced Stochastic Gradient Expectation-Maximization Algorithm | ✅ | ❌ | ❌ | ❌ | ❌ | ❌ | ✅ | 2 |
| High-dimensional Non-Gaussian Single Index Models via Thresholded Score Function Estimation | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | ✅ | 1 |
| How Close Are the Eigenvectors of the Sample and Actual Covariance Matrices? | ❌ | ❌ | ✅ | ❌ | ❌ | ❌ | ❌ | 1 |
| How to Escape Saddle Points Efficiently | ✅ | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | 1 |
| Hyperplane Clustering via Dual Principal Component Pursuit | ❌ | ❌ | ✅ | ❌ | ✅ | ❌ | ✅ | 3 |
| Identification and Model Testing in Linear Structural Equation Models using Auxiliary Variables | ✅ | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | 1 |
| Identify the Nash Equilibrium in Static Games with Random Payoffs | ✅ | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | 1 |
| Identifying Best Interventions through Online Importance Sampling | ✅ | ❌ | ✅ | ❌ | ❌ | ❌ | ✅ | 3 |
| Image-to-Markup Generation with Coarse-to-Fine Attention | ❌ | ✅ | ✅ | ✅ | ✅ | ❌ | ✅ | 5 |
| Improved Variational Autoencoders for Text Modeling using Dilated Convolutions | ❌ | ❌ | ✅ | ✅ | ❌ | ❌ | ✅ | 3 |
| Improving Gibbs Sampler Scan Quality with DoGS | ✅ | ❌ | ✅ | ❌ | ✅ | ❌ | ✅ | 4 |
| Improving Stochastic Policy Gradients in Continuous Control with Deep Reinforcement Learning using the Beta Distribution | ❌ | ❌ | ✅ | ❌ | ❌ | ❌ | ✅ | 2 |
| Improving Viterbi is Hard: Better Runtimes Imply Faster Clique Algorithms | ✅ | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | 1 |
| Innovation Pursuit: A New Approach to the Subspace Clustering Problem | ✅ | ❌ | ✅ | ❌ | ❌ | ❌ | ✅ | 3 |
| Input Convex Neural Networks | ❌ | ✅ | ✅ | ❌ | ❌ | ❌ | ✅ | 3 |
| Input Switched Affine Networks: An RNN Architecture Designed for Interpretability | ❌ | ❌ | ✅ | ✅ | ❌ | ❌ | ❌ | 2 |
| Interactive Learning from Policy-Dependent Human Feedback | ✅ | ❌ | ❌ | ❌ | ❌ | ❌ | ✅ | 2 |
| Iterative Machine Teaching | ✅ | ❌ | ✅ | ❌ | ❌ | ❌ | ❌ | 2 |
| Joint Dimensionality Reduction and Metric Learning: A Geometric Take | ❌ | ✅ | ✅ | ❌ | ❌ | ❌ | ✅ | 3 |
| Just Sort It! A Simple and Effective Approach to Active Preference Learning | ✅ | ✅ | ✅ | ❌ | ✅ | ❌ | ❌ | 4 |
| Kernelized Support Tensor Machines | ✅ | ❌ | ✅ | ✅ | ❌ | ✅ | ✅ | 5 |
| Know-Evolve: Deep Temporal Reasoning for Dynamic Knowledge Graphs | ✅ | ❌ | ✅ | ❌ | ❌ | ❌ | ❌ | 2 |
| Language Modeling with Gated Convolutional Networks | ❌ | ❌ | ✅ | ❌ | ✅ | ❌ | ✅ | 3 |
| Large-Scale Evolution of Image Classifiers | ❌ | ✅ | ✅ | ✅ | ❌ | ❌ | ✅ | 4 |
| Latent Feature Lasso | ✅ | ❌ | ✅ | ❌ | ❌ | ❌ | ❌ | 2 |
| Latent Intention Dialogue Models | ❌ | ✅ | ✅ | ✅ | ❌ | ❌ | ✅ | 4 |
| Latent LSTM Allocation: Joint Clustering and Non-Linear Dynamic Modeling of Sequence Data | ✅ | ❌ | ✅ | ❌ | ❌ | ❌ | ✅ | 3 |
| Lazifying Conditional Gradient Algorithms | ✅ | ❌ | ❌ | ❌ | ❌ | ✅ | ❌ | 2 |
| Learned Optimizers that Scale and Generalize | ❌ | ❌ | ✅ | ❌ | ✅ | ❌ | ✅ | 3 |
| Learning Algorithms for Active Learning | ✅ | ❌ | ✅ | ✅ | ❌ | ❌ | ✅ | 4 |
| Learning Continuous Semantic Representations of Symbolic Expressions | ✅ | ✅ | ✅ | ✅ | ❌ | ❌ | ✅ | 5 |
| Learning Deep Architectures via Generalized Whitened Neural Networks | ✅ | ❌ | ✅ | ✅ | ❌ | ❌ | ✅ | 4 |
| Learning Deep Latent Gaussian Models with Markov Chain Monte Carlo | ✅ | ❌ | ✅ | ❌ | ✅ | ❌ | ✅ | 4 |
| Learning Determinantal Point Processes with Moments and Cycles | ✅ | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | 1 |
| Learning Discrete Representations via Information Maximizing Self-Augmented Training | ❌ | ✅ | ✅ | ❌ | ❌ | ❌ | ✅ | 3 |
| Learning Gradient Descent: Better Generalization and Longer Horizons | ❌ | ✅ | ✅ | ✅ | ❌ | ❌ | ✅ | 4 |
| Learning Hawkes Processes from Short Doubly-Censored Event Sequences | ✅ | ❌ | ✅ | ❌ | ❌ | ❌ | ✅ | 3 |
| Learning Hierarchical Features from Deep Generative Models | ❌ | ✅ | ✅ | ❌ | ❌ | ❌ | ❌ | 2 |
| Learning Important Features Through Propagating Activation Differences | ❌ | ✅ | ✅ | ❌ | ❌ | ❌ | ✅ | 3 |
| Learning Infinite Layer Networks Without the Kernel Trick | ✅ | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | 1 |
| Learning Latent Space Models with Angular Constraints | ❌ | ❌ | ✅ | ✅ | ✅ | ❌ | ✅ | 4 |
| Learning Sleep Stages from Radio Signals: A Conditional Adversarial Architecture | ✅ | ❌ | ✅ | ✅ | ❌ | ❌ | ❌ | 3 |
| Learning Stable Stochastic Nonlinear Dynamical Systems | ❌ | ❌ | ✅ | ❌ | ✅ | ❌ | ✅ | 3 |
| Learning Texture Manifolds with the Periodic Spatial GAN | ❌ | ✅ | ✅ | ❌ | ✅ | ❌ | ✅ | 4 |
| Learning from Clinical Judgments: Semi-Markov-Modulated Marked Hawkes Processes for Risk Prognosis | ✅ | ❌ | ❌ | ❌ | ❌ | ❌ | ✅ | 2 |
| Learning in POMDPs with Monte Carlo Tree Search | ✅ | ❌ | ❌ | ❌ | ❌ | ❌ | ✅ | 2 |
| Learning the Structure of Generative Models without Labeled Data | ✅ | ✅ | ✅ | ❌ | ❌ | ❌ | ✅ | 4 |
| Learning to Aggregate Ordinal Labels by Maximizing Separating Width | ✅ | ✅ | ✅ | ❌ | ✅ | ❌ | ✅ | 5 |
| Learning to Align the Source Code to the Compiled Object Code | ❌ | ✅ | ✅ | ✅ | ❌ | ❌ | ✅ | 4 |
| Learning to Detect Sepsis with a Multitask Gaussian Process RNN Classifier | ✅ | ✅ | ❌ | ✅ | ✅ | ❌ | ✅ | 5 |
| Learning to Discover Cross-Domain Relations with Generative Adversarial Networks | ❌ | ❌ | ✅ | ❌ | ✅ | ❌ | ✅ | 3 |
| Learning to Discover Sparse Graphical Models | ✅ | ❌ | ✅ | ✅ | ❌ | ❌ | ✅ | 4 |
| Learning to Generate Long-term Future via Hierarchical Prediction | ✅ | ❌ | ✅ | ✅ | ✅ | ❌ | ✅ | 5 |
| Learning to Learn without Gradient Descent by Gradient Descent | ❌ | ❌ | ✅ | ❌ | ❌ | ❌ | ✅ | 2 |
| Leveraging Node Attributes for Incomplete Relational Data | ❌ | ✅ | ✅ | ❌ | ✅ | ❌ | ✅ | 4 |
| Leveraging Union of Subspace Structure to Improve Constrained Clustering | ✅ | ❌ | ✅ | ❌ | ❌ | ❌ | ✅ | 3 |
| Local Bayesian Optimization of Motor Skills | ✅ | ❌ | ✅ | ❌ | ❌ | ❌ | ✅ | 3 |
| Local-to-Global Bayesian Network Structure Learning | ✅ | ❌ | ✅ | ❌ | ✅ | ❌ | ❌ | 3 |
| Logarithmic Time One-Against-Some | ✅ | ✅ | ✅ | ❌ | ❌ | ❌ | ✅ | 4 |
| Lost Relatives of the Gumbel Trick | ✅ | ✅ | ✅ | ❌ | ❌ | ❌ | ✅ | 4 |
| MEC: Memory-efficient Convolution for Deep Neural Network | ✅ | ❌ | ✅ | ❌ | ✅ | ❌ | ✅ | 4 |
| Magnetic Hamiltonian Monte Carlo | ✅ | ❌ | ❌ | ❌ | ❌ | ❌ | ✅ | 2 |
| Max-value Entropy Search for Efficient Bayesian Optimization | ✅ | ✅ | ✅ | ✅ | ✅ | ❌ | ✅ | 6 |
| Maximum Selection and Ranking under Noisy Comparisons | ✅ | ❌ | ❌ | ❌ | ❌ | ❌ | ✅ | 2 |
| McGan: Mean and Covariance Feature Matching GAN | ✅ | ❌ | ✅ | ❌ | ❌ | ❌ | ✅ | 3 |
| Measuring Sample Quality with Kernels | ❌ | ✅ | ✅ | ❌ | ✅ | ❌ | ✅ | 4 |
| Meritocratic Fairness for Cross-Population Selection | ✅ | ❌ | ❌ | ❌ | ❌ | ❌ | ✅ | 2 |
| Meta Networks | ✅ | ✅ | ✅ | ✅ | ❌ | ❌ | ✅ | 5 |
| Minimax Regret Bounds for Reinforcement Learning | ✅ | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | 1 |
| Minimizing Trust Leaks for Robust Sybil Detection | ❌ | ❌ | ✅ | ❌ | ❌ | ❌ | ✅ | 2 |
| Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks | ✅ | ✅ | ✅ | ✅ | ❌ | ❌ | ✅ | 5 |
| Model-Independent Online Learning for Influence Maximization | ✅ | ❌ | ✅ | ✅ | ❌ | ❌ | ✅ | 4 |
| Modular Multitask Reinforcement Learning with Policy Sketches | ✅ | ✅ | ❌ | ❌ | ❌ | ❌ | ✅ | 3 |
| Multi-Class Optimal Margin Distribution Machine | ✅ | ❌ | ✅ | ✅ | ✅ | ✅ | ✅ | 6 |
| Multi-fidelity Bayesian Optimisation with Continuous Approximations | ✅ | ❌ | ✅ | ✅ | ❌ | ❌ | ✅ | 4 |
| Multi-objective Bandits: Optimizing the Generalized Gini Index | ✅ | ❌ | ❌ | ❌ | ❌ | ❌ | ✅ | 2 |
| Multi-task Learning with Labeled and Unlabeled Tasks | ✅ | ❌ | ✅ | ✅ | ❌ | ❌ | ✅ | 4 |
| Multichannel End-to-end Speech Recognition | ❌ | ❌ | ✅ | ✅ | ❌ | ❌ | ✅ | 3 |
| Multilabel Classification with Group Testing and Codes | ✅ | ❌ | ✅ | ❌ | ❌ | ❌ | ❌ | 2 |
| Multilevel Clustering via Wasserstein Means | ✅ | ✅ | ✅ | ❌ | ✅ | ❌ | ✅ | 5 |
| Multiple Clustering Views from Multiple Uncertain Experts | ❌ | ❌ | ✅ | ✅ | ❌ | ❌ | ✅ | 3 |
| Multiplicative Normalizing Flows for Variational Bayesian Neural Networks | ✅ | ❌ | ✅ | ❌ | ❌ | ❌ | ✅ | 3 |
| Natasha: Faster Non-Convex Stochastic Optimization via Strongly Non-Convex Parameter | ✅ | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | 1 |
| Near-Optimal Design of Experiments via Regret Minimization | ✅ | ❌ | ❌ | ❌ | ❌ | ❌ | ✅ | 2 |
| Nearly Optimal Robust Matrix Completion | ✅ | ❌ | ❌ | ❌ | ❌ | ❌ | ✅ | 2 |
| Neural Audio Synthesis of Musical Notes with WaveNet Autoencoders | ❌ | ❌ | ✅ | ❌ | ❌ | ❌ | ✅ | 2 |
| Neural Episodic Control | ✅ | ❌ | ✅ | ✅ | ❌ | ❌ | ✅ | 4 |
| Neural Message Passing for Quantum Chemistry | ❌ | ❌ | ✅ | ✅ | ❌ | ❌ | ✅ | 3 |
| Neural Networks and Rational Functions | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | 0 |
| Neural Optimizer Search with Reinforcement Learning | ❌ | ❌ | ✅ | ✅ | ❌ | ❌ | ✅ | 3 |
| Neural Taylor Approximations: Convergence and Exploration in Rectifier Networks | ❌ | ❌ | ✅ | ❌ | ✅ | ❌ | ✅ | 3 |
| No Spurious Local Minima in Nonconvex Low Rank Problems: A Unified Geometric Analysis | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | 0 |
| Nonnegative Matrix Factorization for Time Series Recovery From a Few Temporal Aggregates | ✅ | ❌ | ✅ | ✅ | ❌ | ❌ | ✅ | 4 |
| Nonparanormal Information Estimation | ❌ | ✅ | ❌ | ❌ | ❌ | ❌ | ✅ | 2 |
| Nyström Method with Kernel K-means++ Samples as Landmarks | ❌ | ❌ | ✅ | ❌ | ❌ | ❌ | ✅ | 2 |
| On Approximation Guarantees for Greedy Low Rank Optimization | ✅ | ❌ | ✅ | ❌ | ❌ | ❌ | ❌ | 2 |
| On Calibration of Modern Neural Networks | ❌ | ❌ | ✅ | ✅ | ❌ | ❌ | ❌ | 2 |
| On Context-Dependent Clustering of Bandits | ✅ | ❌ | ✅ | ✅ | ❌ | ❌ | ✅ | 4 |
| On Kernelized Multi-armed Bandits | ✅ | ❌ | ✅ | ❌ | ❌ | ❌ | ✅ | 3 |
| On Mixed Memberships and Symmetric Nonnegative Matrix Factorizations | ✅ | ❌ | ✅ | ❌ | ❌ | ❌ | ✅ | 3 |
| On Relaxing Determinism in Arithmetic Circuits | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | 0 |
| On The Projection Operator to A Three-view Cardinality Constrained Set | ✅ | ❌ | ❌ | ❌ | ❌ | ❌ | ✅ | 2 |
| On orthogonality and learning recurrent networks with long term dependencies | ❌ | ❌ | ✅ | ✅ | ❌ | ❌ | ✅ | 3 |
| On the Expressive Power of Deep Neural Networks | ❌ | ❌ | ✅ | ❌ | ❌ | ❌ | ✅ | 2 |
| On the Iteration Complexity of Support Recovery via Hard Thresholding Pursuit | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | ✅ | 1 |
| On the Sampling Problem for Kernel Quadrature | ✅ | ❌ | ❌ | ❌ | ❌ | ❌ | ✅ | 2 |
| Online Learning to Rank in Stochastic Click Models | ✅ | ❌ | ✅ | ❌ | ❌ | ❌ | ✅ | 3 |
| Online Learning with Local Permutations and Delayed Feedback | ✅ | ❌ | ❌ | ❌ | ❌ | ❌ | ✅ | 2 |
| Online Partial Least Square Optimization: Dropping Convexity for Better Efficiency and Scalability | ❌ | ❌ | ✅ | ❌ | ❌ | ❌ | ✅ | 2 |
| Online and Linear-Time Attention by Enforcing Monotonic Alignments | ✅ | ✅ | ✅ | ✅ | ❌ | ❌ | ✅ | 5 |
| OptNet: Differentiable Optimization as a Layer in Neural Networks | ❌ | ✅ | ❌ | ❌ | ✅ | ❌ | ✅ | 3 |
| Optimal Algorithms for Smooth and Strongly Convex Distributed Optimization in Networks | ✅ | ❌ | ❌ | ❌ | ❌ | ❌ | ✅ | 2 |
| Optimal Densification for Fast and Accurate Minwise Hashing | ✅ | ✅ | ✅ | ❌ | ✅ | ❌ | ❌ | 4 |
| Optimal and Adaptive Off-policy Evaluation in Contextual Bandits | ❌ | ❌ | ✅ | ✅ | ❌ | ❌ | ✅ | 3 |
| Oracle Complexity of Second-Order Methods for Finite-Sum Problems | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | 0 |
| Ordinal Graphical Models: A Tale of Two Approaches | ❌ | ❌ | ✅ | ✅ | ❌ | ❌ | ✅ | 3 |
| Orthogonalized ALS: A Theoretically Principled Tensor Decomposition Algorithm for Practical Use | ✅ | ✅ | ✅ | ❌ | ❌ | ❌ | ✅ | 4 |
| Pain-Free Random Differential Privacy with Sensitivity Sampling | ✅ | ❌ | ❌ | ❌ | ❌ | ❌ | ✅ | 2 |
| Parallel Multiscale Autoregressive Density Estimation | ❌ | ❌ | ✅ | ✅ | ✅ | ❌ | ✅ | 4 |
| Parallel and Distributed Thompson Sampling for Large-scale Accelerated Exploration of Chemical Space | ✅ | ❌ | ✅ | ❌ | ❌ | ❌ | ✅ | 3 |
| Parseval Networks: Improving Robustness to Adversarial Examples | ✅ | ❌ | ✅ | ✅ | ❌ | ❌ | ✅ | 4 |
| Partitioned Tensor Factorizations for Learning Mixed Membership Models | ✅ | ✅ | ✅ | ❌ | ✅ | ❌ | ✅ | 5 |
| PixelCNN Models with Auxiliary Variables for Natural Image Modeling | ❌ | ❌ | ✅ | ❌ | ✅ | ❌ | ✅ | 3 |
| Post-Inference Prior Swapping | ✅ | ❌ | ✅ | ❌ | ❌ | ❌ | ❌ | 2 |
| Practical Gauss-Newton Optimisation for Deep Learning | ❌ | ❌ | ✅ | ✅ | ✅ | ❌ | ✅ | 4 |
| Prediction and Control with Temporal Segment Models | ❌ | ❌ | ✅ | ❌ | ❌ | ❌ | ✅ | 2 |
| Prediction under Uncertainty in Sparse Spectrum Gaussian Processes with Applications to Filtering and Control | ✅ | ❌ | ❌ | ❌ | ❌ | ❌ | ✅ | 2 |
| Preferential Bayesian Optimization | ✅ | ❌ | ✅ | ❌ | ❌ | ❌ | ✅ | 3 |
| Priv’IT: Private and Sample Efficient Identity Testing | ✅ | ❌ | ❌ | ❌ | ✅ | ❌ | ✅ | 3 |
| Probabilistic Path Hamiltonian Monte Carlo | ✅ | ✅ | ✅ | ❌ | ❌ | ✅ | ✅ | 5 |
| Probabilistic Submodular Maximization in Sub-Linear Time | ✅ | ❌ | ✅ | ❌ | ✅ | ❌ | ✅ | 4 |
| Programming with a Differentiable Forth Interpreter | ✅ | ❌ | ✅ | ❌ | ❌ | ❌ | ❌ | 2 |
| Projection-free Distributed Online Learning in Networks | ✅ | ❌ | ✅ | ❌ | ❌ | ❌ | ✅ | 3 |
| ProtoNN: Compressed and Accurate kNN for Resource-scarce Devices | ✅ | ✅ | ✅ | ✅ | ✅ | ❌ | ✅ | 6 |
| Provable Alternating Gradient Descent for Non-negative Matrix Factorization with Strong Correlations | ✅ | ✅ | ✅ | ❌ | ❌ | ❌ | ✅ | 4 |
| Provably Optimal Algorithms for Generalized Linear Contextual Bandits | ✅ | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | 1 |
| Prox-PDA: The Proximal Primal-Dual Algorithm for Fast Distributed Nonconvex Optimization and Learning Over Networks | ✅ | ❌ | ❌ | ❌ | ✅ | ❌ | ✅ | 3 |
| Random Feature Expansions for Deep Gaussian Processes | ❌ | ❌ | ✅ | ❌ | ✅ | ❌ | ✅ | 3 |
| Random Fourier Features for Kernel Ridge Regression: Approximation Bounds and Statistical Guarantees | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | 0 |
| Re-revisiting Learning on Hypergraphs: Confidence Interval and Subgradient Method | ✅ | ❌ | ✅ | ❌ | ❌ | ❌ | ✅ | 3 |
| Real-Time Adaptive Image Compression | ❌ | ❌ | ✅ | ❌ | ✅ | ❌ | ✅ | 3 |
| Recovery Guarantees for One-hidden-layer Neural Networks | ✅ | ❌ | ❌ | ❌ | ❌ | ❌ | ✅ | 2 |
| Recurrent Highway Networks | ❌ | ❌ | ✅ | ❌ | ❌ | ❌ | ❌ | 1 |
| Recursive Partitioning for Personalization using Observational Data | ✅ | ❌ | ✅ | ✅ | ❌ | ❌ | ✅ | 4 |
| Reduced Space and Faster Convergence in Imperfect-Information Games via Pruning | ❌ | ❌ | ✅ | ❌ | ❌ | ❌ | ✅ | 2 |
| Regret Minimization in Behaviorally-Constrained Zero-Sum Games | ✅ | ❌ | ✅ | ❌ | ❌ | ❌ | ✅ | 3 |
| Regularising Non-linear Models Using Feature Side-information | ❌ | ❌ | ✅ | ✅ | ❌ | ❌ | ✅ | 3 |
| Reinforcement Learning with Deep Energy-Based Policies | ✅ | ✅ | ❌ | ❌ | ❌ | ❌ | ✅ | 3 |
| Relative Fisher Information and Natural Gradient for Learning Large Modular Models | ❌ | ✅ | ✅ | ❌ | ❌ | ❌ | ✅ | 3 |
| Resource-efficient Machine Learning in 2 KB RAM for the Internet of Things | ✅ | ✅ | ✅ | ✅ | ✅ | ❌ | ✅ | 6 |
| Risk Bounds for Transferring Representations With and Without Fine-Tuning | ❌ | ❌ | ✅ | ❌ | ❌ | ❌ | ✅ | 2 |
| Robust Adversarial Reinforcement Learning | ✅ | ❌ | ✅ | ❌ | ❌ | ❌ | ✅ | 3 |
| Robust Budget Allocation via Continuous Submodular Functions | ✅ | ✅ | ✅ | ❌ | ❌ | ✅ | ✅ | 5 |
| Robust Gaussian Graphical Model Estimation with Arbitrary Corruption | ❌ | ❌ | ✅ | ✅ | ❌ | ❌ | ✅ | 3 |
| Robust Guarantees of Stochastic Greedy Algorithms | ✅ | ❌ | ✅ | ❌ | ❌ | ❌ | ✅ | 3 |
| Robust Probabilistic Modeling with Bayesian Data Reweighting | ❌ | ❌ | ✅ | ❌ | ❌ | ❌ | ✅ | 2 |
| Robust Structured Estimation with Single-Index Models | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | ✅ | 1 |
| Robust Submodular Maximization: A Non-Uniform Partitioning Approach | ✅ | ❌ | ✅ | ❌ | ❌ | ❌ | ✅ | 3 |
| RobustFill: Neural Program Learning under Noisy I/O | ❌ | ❌ | ✅ | ❌ | ✅ | ❌ | ✅ | 3 |
| Rule-Enhanced Penalized Regression by Column Generation using Rectangular Maximum Agreement | ✅ | ❌ | ✅ | ❌ | ✅ | ✅ | ✅ | 5 |
| SARAH: A Novel Method for Machine Learning Problems Using Stochastic Recursive Gradient | ✅ | ❌ | ✅ | ❌ | ❌ | ❌ | ✅ | 3 |
| SPLICE: Fully Tractable Hierarchical Extension of ICA with Pooling | ❌ | ❌ | ✅ | ✅ | ❌ | ❌ | ✅ | 3 |
| Safety-Aware Algorithms for Adversarial Contextual Bandit | ✅ | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | 1 |
| Scalable Bayesian Rule Lists | ✅ | ✅ | ✅ | ✅ | ❌ | ❌ | ✅ | 5 |
| Scalable Generative Models for Multi-label Learning with Missing Labels | ❌ | ❌ | ✅ | ❌ | ✅ | ❌ | ✅ | 3 |
| Scalable Multi-Class Gaussian Process Classification using Expectation Propagation | ❌ | ✅ | ✅ | ❌ | ❌ | ❌ | ✅ | 3 |
| Scaling Up Sparse Support Vector Machines by Simultaneous Feature and Sample Reduction | ✅ | ❌ | ✅ | ❌ | ✅ | ❌ | ✅ | 4 |
| Schema Networks: Zero-shot Transfer with a Generative Causal Model of Intuitive Physics | ✅ | ❌ | ❌ | ❌ | ❌ | ❌ | ✅ | 2 |
| Second-Order Kernel Online Convex Optimization with Adaptive Sketching | ✅ | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | 1 |
| Selective Inference for Sparse High-Order Interaction Models | ❌ | ❌ | ✅ | ❌ | ❌ | ❌ | ✅ | 2 |
| Self-Paced Co-training | ✅ | ❌ | ✅ | ✅ | ❌ | ❌ | ✅ | 4 |
| Semi-Supervised Classification Based on Classification from Positive and Unlabeled Data | ❌ | ❌ | ✅ | ✅ | ✅ | ❌ | ✅ | 4 |
| Sequence Modeling via Segmentations | ✅ | ❌ | ✅ | ❌ | ❌ | ❌ | ✅ | 3 |
| Sequence Tutor: Conservative Fine-Tuning of Sequence Generation Models with KL-control | ❌ | ✅ | ✅ | ❌ | ❌ | ❌ | ✅ | 3 |
| Sequence to Better Sequence: Continuous Revision of Combinatorial Structures | ✅ | ❌ | ✅ | ❌ | ❌ | ❌ | ❌ | 2 |
| Sharp Minima Can Generalize For Deep Nets | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | 0 |
| Simultaneous Learning of Trees and Representations for Extreme Classification and Density Estimation | ✅ | ✅ | ✅ | ✅ | ❌ | ❌ | ✅ | 5 |
| Sketched Ridge Regression: Optimization Perspective, Statistical Perspective, and Model Averaging | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | ✅ | 1 |
| Sliced Wasserstein Kernel for Persistence Diagrams | ✅ | ❌ | ✅ | ✅ | ✅ | ❌ | ✅ | 5 |
| Soft-DTW: a Differentiable Loss Function for Time-Series | ✅ | ✅ | ✅ | ✅ | ❌ | ❌ | ✅ | 5 |
| Source-Target Similarity Modelings for Multi-Source Transfer Gaussian Process Regression | ❌ | ❌ | ✅ | ❌ | ❌ | ❌ | ✅ | 2 |
| Sparse + Group-Sparse Dirty Models: Statistical Guarantees without Unreasonable Conditions and a Case for Non-Convexity | ❌ | ❌ | ✅ | ✅ | ❌ | ❌ | ✅ | 3 |
| Spectral Learning from a Single Trajectory under Finite-State Policies | ✅ | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | 1 |
| Spherical Structured Feature Maps for Kernel Approximation | ✅ | ❌ | ✅ | ❌ | ❌ | ❌ | ✅ | 3 |
| SplitNet: Learning to Semantically Split Deep Networks for Parameter Reduction and Model Parallelization | ✅ | ✅ | ✅ | ✅ | ✅ | ❌ | ✅ | 6 |
| Stabilising Experience Replay for Deep Multi-Agent Reinforcement Learning | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | ✅ | 1 |
| State-Frequency Memory Recurrent Neural Networks | ❌ | ❌ | ✅ | ✅ | ❌ | ❌ | ✅ | 3 |
| Statistical Inference for Incomplete Ranking Data: The Case of Rank-Dependent Coarsening | ❌ | ❌ | ✅ | ❌ | ❌ | ❌ | ❌ | 1 |
| StingyCD: Safely Avoiding Wasteful Updates in Coordinate Descent | ✅ | ❌ | ✅ | ❌ | ❌ | ❌ | ✅ | 3 |
| Stochastic Adaptive Quasi-Newton Methods for Minimizing Expected Values | ✅ | ❌ | ❌ | ❌ | ✅ | ✅ | ✅ | 4 |
| Stochastic Bouncy Particle Sampler | ✅ | ✅ | ✅ | ❌ | ❌ | ❌ | ✅ | 4 |
| Stochastic Convex Optimization: Faster Local Growth Implies Faster Global Convergence | ✅ | ❌ | ✅ | ❌ | ❌ | ❌ | ✅ | 3 |
| Stochastic DCA for the Large-sum of Non-convex Functions Problem and its Application to Group Variable Selection in Classification | ✅ | ❌ | ✅ | ✅ | ✅ | ❌ | ✅ | 5 |
| Stochastic Generative Hashing | ✅ | ✅ | ✅ | ❌ | ✅ | ❌ | ✅ | 5 |
| Stochastic Gradient MCMC Methods for Hidden Markov Models | ✅ | ❌ | ✅ | ❌ | ❌ | ❌ | ✅ | 3 |
| Stochastic Gradient Monomial Gamma Sampler | ✅ | ❌ | ✅ | ✅ | ❌ | ❌ | ✅ | 4 |
| Stochastic Modified Equations and Adaptive Stochastic Gradient Algorithms | ✅ | ❌ | ✅ | ❌ | ❌ | ❌ | ✅ | 3 |
| Stochastic Variance Reduction Methods for Policy Evaluation | ✅ | ❌ | ✅ | ❌ | ❌ | ❌ | ✅ | 3 |
| Strong NP-Hardness for Sparse Optimization with Concave Penalty Functions | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | 0 |
| Strongly-Typed Agents are Guaranteed to Interact Safely | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | 0 |
| Sub-sampled Cubic Regularization for Non-convex Optimization | ✅ | ❌ | ✅ | ❌ | ❌ | ❌ | ❌ | 2 |
| Tensor Balancing on Statistical Manifold | ❌ | ✅ | ✅ | ❌ | ✅ | ✅ | ✅ | 5 |
| Tensor Belief Propagation | ✅ | ❌ | ✅ | ❌ | ✅ | ✅ | ✅ | 5 |
| Tensor Decomposition via Simultaneous Power Iteration | ✅ | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | 1 |
| Tensor Decomposition with Smoothness | ❌ | ❌ | ✅ | ❌ | ❌ | ❌ | ✅ | 2 |
| Tensor-Train Recurrent Neural Networks for Video Classification | ❌ | ✅ | ✅ | ✅ | ✅ | ❌ | ✅ | 5 |
| The Loss Surface of Deep and Wide Neural Networks | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | 0 |
| The Predictron: End-To-End Learning and Planning | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | ✅ | 1 |
| The Price of Differential Privacy for Online Learning | ✅ | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | 1 |
| The Sample Complexity of Online One-Class Collaborative Filtering | ✅ | ❌ | ✅ | ❌ | ❌ | ❌ | ❌ | 2 |
| The Shattered Gradients Problem: If resnets are the answer, then what is the question? | ❌ | ❌ | ✅ | ❌ | ❌ | ❌ | ✅ | 2 |
| The Statistical Recurrent Unit | ❌ | ✅ | ✅ | ✅ | ❌ | ❌ | ✅ | 4 |
| Theoretical Properties for Neural Networks with Weight Matrices of Low Displacement Rank | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | 0 |
| Tight Bounds for Approximate Carathéodory and Beyond | ✅ | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | 1 |
| Toward Controlled Generation of Text | ✅ | ❌ | ✅ | ✅ | ❌ | ❌ | ✅ | 4 |
| Toward Efficient and Accurate Covariance Matrix Estimation on Compressed Data | ✅ | ❌ | ✅ | ❌ | ✅ | ❌ | ✅ | 4 |
| Towards K-means-friendly Spaces: Simultaneous Deep Learning and Clustering | ✅ | ✅ | ✅ | ❌ | ❌ | ❌ | ✅ | 4 |
| Tunable Efficient Unitary Neural Networks (EUNN) and their application to RNNs | ✅ | ✅ | ✅ | ✅ | ❌ | ❌ | ✅ | 5 |
| Uncertainty Assessment and False Discovery Rate Control in High-Dimensional Granger Causal Inference | ❌ | ❌ | ✅ | ❌ | ❌ | ❌ | ❌ | 1 |
| Uncorrelation and Evenness: a New Diversity-Promoting Regularizer | ❌ | ❌ | ✅ | ✅ | ❌ | ❌ | ✅ | 3 |
| Uncovering Causality from Multivariate Hawkes Integrated Cumulants | ✅ | ✅ | ✅ | ❌ | ❌ | ❌ | ✅ | 4 |
| Understanding Black-box Predictions via Influence Functions | ❌ | ✅ | ✅ | ✅ | ❌ | ❌ | ✅ | 4 |
| Understanding Synthetic Gradients and Decoupled Neural Interfaces | ❌ | ❌ | ✅ | ❌ | ❌ | ❌ | ✅ | 2 |
| Understanding the Representation and Computation of Multilayer Perceptrons: A Case Study in Speech Recognition | ❌ | ❌ | ✅ | ✅ | ❌ | ❌ | ✅ | 3 |
| Uniform Convergence Rates for Kernel Density Estimation | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | 0 |
| Uniform Deviation Bounds for k-Means Clustering | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | 0 |
| Unifying Task Specification in Reinforcement Learning | ✅ | ❌ | ❌ | ❌ | ❌ | ❌ | ✅ | 2 |
| Unimodal Probability Distributions for Deep Ordinal Classification | ❌ | ✅ | ✅ | ✅ | ❌ | ❌ | ✅ | 4 |
| Unsupervised Learning by Predicting Noise | ✅ | ❌ | ✅ | ✅ | ❌ | ❌ | ✅ | 4 |
| Variants of RMSProp and Adagrad with Logarithmic Regret Bounds | ✅ | ❌ | ✅ | ❌ | ❌ | ❌ | ✅ | 3 |
| Variational Boosting: Iteratively Refining Posterior Approximations | ✅ | ✅ | ✅ | ❌ | ❌ | ❌ | ✅ | 4 |
| Variational Dropout Sparsifies Deep Neural Networks | ❌ | ✅ | ✅ | ❌ | ❌ | ❌ | ✅ | 3 |
| Variational Inference for Sparse and Undirected Models | ✅ | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | 1 |
| Variational Policy for Guiding Point Processes | ✅ | ❌ | ✅ | ❌ | ❌ | ❌ | ✅ | 3 |
| Video Pixel Networks | ❌ | ❌ | ✅ | ✅ | ❌ | ❌ | ✅ | 3 |
| Warped Convolutions: Efficient Invariance to Spatial Transformations | ✅ | ❌ | ✅ | ✅ | ❌ | ❌ | ✅ | 4 |
| Wasserstein Generative Adversarial Networks | ✅ | ❌ | ✅ | ❌ | ❌ | ❌ | ✅ | 3 |
| When can Multi-Site Datasets be Pooled for Regression? Hypothesis Tests, $\ell_2$-consistency and Neuroscience Applications | ❌ | ✅ | ✅ | ❌ | ❌ | ❌ | ✅ | 3 |
| Why is Posterior Sampling Better than Optimism for Reinforcement Learning? | ✅ | ❌ | ❌ | ❌ | ❌ | ❌ | ✅ | 2 |
| World of Bits: An Open-Domain Platform for Web-Based Agents | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | ✅ | 1 |
| Zero-Inflated Exponential Family Embeddings | ❌ | ❌ | ✅ | ✅ | ❌ | ❌ | ✅ | 3 |
| Zero-Shot Task Generalization with Multi-Task Deep Reinforcement Learning | ✅ | ❌ | ❌ | ❌ | ❌ | ❌ | ✅ | 2 |
| ZipML: Training Linear Models with End-to-End Low Precision, and a Little Bit of Deep Learning | ❌ | ❌ | ✅ | ✅ | ❌ | ❌ | ✅ | 3 |
| Zonotope Hit-and-run for Efficient Sampling from Projection DPPs | ✅ | ❌ | ✅ | ❌ | ❌ | ❌ | ✅ | 3 |
| iSurvive: An Interpretable, Event-time Prediction Model for mHealth | ✅ | ✅ | ❌ | ✅ | ❌ | ❌ | ❌ | 3 |
| meProp: Sparsified Back Propagation for Accelerated Deep Learning with Reduced Overfitting | ❌ | ❌ | ✅ | ✅ | ✅ | ❌ | ✅ | 4 |
| “Convex Until Proven Guilty”: Dimension-Free Acceleration of Gradient Descent on Non-Convex Functions | ✅ | ❌ | ✅ | ❌ | ❌ | ❌ | ✅ | 3 |
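
The seven mark columns are not labeled in the source; what is verifiable from the data itself is that the final column equals the number of ✅ marks in its row, for every row above. A minimal Python sketch of that consistency check (the function name and parsing details are illustrative, assuming the pipe-delimited row format used in this table):

```python
# Minimal sketch: verify that the last cell of a pipe-delimited row
# equals the number of "✅" marks in that row. The assumed row format is
# "| Title | mark | ... | mark | total |" as in the table above.
def row_total_is_consistent(row: str) -> bool:
    cells = [c.strip() for c in row.strip().strip("|").split("|")]
    marks, total = cells[1:-1], int(cells[-1])  # cells[0] is the title
    return sum(m == "✅" for m in marks) == total

# Example against the first row of the table.
assert row_total_is_consistent(
    "| A Birth-Death Process for Feature Allocation | ❌ | ❌ | ✅ | ✅ | ❌ | ❌ | ✅ | 3 |"
)
```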