International Conference on Machine Learning (ICML) - 2017

Conference Proceedings:

Key: PC - Pseudocode, OSC - Open Source Code, OSD - Open Datasets, DS - Dataset Splits, HS - Hardware Specification, SD - Software Dependencies, ES - Experiment Setup

A Birth-Death Process for Feature Allocation 3
A Closer Look at Memorization in Deep Networks 3
A Distributional Perspective on Reinforcement Learning 3
A Divergence Bound for Hybrids of MCMC and Variational Inference and an Application to Langevin Dynamics and SGVI 2
A Laplacian Framework for Option Discovery in Reinforcement Learning 3
A Richer Theory of Convex Constrained Optimization with Reduced Projections and Improved Rates 3
A Semismooth Newton Method for Fast, Generic Convex Programming 2
A Simple Multi-Class Boosting Framework with Theoretical Guarantees and Empirical Proficiency 3
A Simulated Annealing Based Inexact Oracle for Wasserstein Loss Minimization 4
A Unified Maximum Likelihood Approach for Estimating Symmetric Properties of Discrete Distributions 0
A Unified Variance Reduction-Based Framework for Nonconvex Low-Rank Matrix Recovery 3
A Unified View of Multi-Label Performance Measures 3
Accelerating Eulerian Fluid Simulation With Convolutional Networks 5
Active Heteroscedastic Regression 2
Active Learning for Accurate Estimation of Linear Models 2
Active Learning for Cost-Sensitive Classification 3
Active Learning for Top-$K$ Rank Aggregation from Noisy Comparisons 3
AdaNet: Adaptive Structural Learning of Artificial Neural Networks 4
Adapting Kernel Representations Online Using Submodular Maximization 3
Adaptive Consensus ADMM for Distributed Optimization 4
Adaptive Feature Selection: Computationally Efficient Online Sparse Linear Regression under RIP 1
Adaptive Multiple-Arm Identification 2
Adaptive Neural Networks for Efficient Inference 6
Adaptive Sampling Probabilities for Non-Smooth Optimization 3
Adversarial Feature Matching for Text Generation 4
Adversarial Variational Bayes: Unifying Variational Autoencoders and Generative Adversarial Networks 3
Algebraic Variety Models for High-Rank Matrix Completion 3
Algorithmic Stability and Hypothesis Complexity 0
Algorithms for $\ell_p$ Low-Rank Approximation 2
An Adaptive Test of Independence with Analytic Kernel Embeddings 4
An Alternative Softmax Operator for Reinforcement Learning 3
An Analytical Formula of Population Gradient for two-layered ReLU network and its Applications in Convergence and Critical Point Analysis 2
An Efficient, Sparsity-Preserving, Online Algorithm for Low-Rank Approximation 4
An Infinite Hidden Markov Model With Similarity-Biased Transitions 4
Analogical Inference for Multi-relational Embeddings 4
Analysis and Optimization of Graph Decompositions by Lifted Multicuts 3
Analytical Guarantees on Numerical Precision of Deep Neural Networks 2
Approximate Newton Methods and Their Local Convergence 2
Approximate Steepest Coordinate Descent 3
Asymmetric Tri-training for Unsupervised Domain Adaptation 4
Asynchronous Distributed Variational Gaussian Process for Regression 5
Asynchronous Stochastic Gradient Descent with Delay Compensation 5
Attentive Recurrent Comparators 3
Automated Curriculum Learning for Neural Networks 3
Automatic Discovery of the Statistical Types of Variables in a Dataset 4
Averaged-DQN: Variance Reduction and Stabilization for Deep Reinforcement Learning 4
Axiomatic Attribution for Deep Networks 2
Batched High-dimensional Bayesian Optimization via Structural Kernel Learning 2
Bayesian Boolean Matrix Factorisation 4
Bayesian Models of Data Streams with Hierarchical Power Priors 4
Bayesian Optimization with Tree-structured Dependencies 3
Bayesian inference on random simple graphs with power law degree distributions 3
Being Robust (in High Dimensions) Can Be Practical 4
Beyond Filters: Compact Feature Map for Portable Deep Model 6
Bidirectional Learning for Time-series Models with Hidden Units 3
Boosted Fitted Q-Iteration 3
Bottleneck Conditional Density Estimation 4
Breaking Locality Accelerates Block Gauss-Seidel 5
Canopy Fast Sampling with Cover Trees 3
Capacity Releasing Diffusion for Speed and Locality 2
ChoiceRank: Identifying Preferences from Node Traffic in Networks 5
Clustering High Dimensional Dynamic Data Streams 2
Clustering by Sum of Norms: Stochastic Incremental Algorithm, Convergence and Cluster Recovery 1
Co-clustering through Optimal Transport 3
Cognitive Psychology for Deep Neural Networks: A Shape Bias Case Study 2
Coherence Pursuit: Fast, Simple, and Robust Subspace Recovery 3
Coherent Probabilistic Forecasts for Hierarchical Time Series 4
Collect at Once, Use Effectively: Making Non-interactive Locally Private Learning Possible 1
Combined Group and Exclusive Sparsity for Deep Neural Networks 4
Combining Model-Based and Model-Free Updates for Trajectory-Centric Reinforcement Learning 5
Communication-efficient Algorithms for Distributed Stochastic Principal Component Analysis 2
Composing Tree Graphical Models with Persistent Homology Features for Clustering Mixed-Type Data 4
Compressed Sensing using Generative Models 3
Conditional Accelerated Lazy Stochastic Gradient Descent 2
Conditional Image Synthesis with Auxiliary Classifier GANs 2
Confident Multiple Choice Learning 5
Connected Subgraph Detection with Mirror Descent on SDPs 4
Consistency Analysis for Binary Classification Revisited 3
Consistent On-Line Off-Policy Evaluation 3
Consistent k-Clustering 2
Constrained Policy Optimization 2
Contextual Decision Processes with low Bellman rank are PAC-Learnable 1
Continual Learning Through Synaptic Intelligence 3
Convergence Analysis of Proximal Gradient with Momentum for Nonconvex Optimization 2
Convex Phase Retrieval without Lifting via PhaseMax 0
Convexified Convolutional Neural Networks 5
Convolutional Sequence to Sequence Learning 5
Coordinated Multi-Agent Imitation Learning 2
Coresets for Vector Summarization with Applications to Network Graphs 2
Cost-Optimal Learning of Causal Graphs 3
Count-Based Exploration with Neural Density Models 2
Counterfactual Data-Fusion for Online Reinforcement Learners 2
Coupling Distributed and Symbolic Execution for Natural Language Queries 4
Curiosity-driven Exploration by Self-supervised Prediction 1
DARLA: Improving Zero-Shot Transfer in Reinforcement Learning 2
Dance Dance Convolution 5
Data-Efficient Policy Evaluation Through Behavior Policy Search 2
Deciding How to Decide: Dynamic Routing in Artificial Neural Networks 3
Decoupled Neural Interfaces using Synthetic Gradients 3
Deep Bayesian Active Learning with Image Data 4
Deep Decentralized Multi-task Multi-Agent Reinforcement Learning under Partial Observability 2
Deep Generative Models for Relational Data with Side Information 3
Deep IV: A Flexible Approach for Counterfactual Prediction 2
Deep Latent Dirichlet Allocation with Topic-Layer-Adaptive Stochastic Gradient Riemannian MCMC 4
Deep Spectral Clustering Learning 4
Deep Tensor Convolution on Multicores 5
Deep Transfer Learning with Joint Adaptation Networks 4
Deep Value Networks Learn to Evaluate and Iteratively Refine Structured Outputs 5
Deep Voice: Real-time Neural Text-to-Speech 3
DeepBach: a Steerable Model for Bach Chorales Generation 5
Deeply AggreVaTeD: Differentiable Imitation Learning for Sequential Prediction 4
Deletion-Robust Submodular Maximization: Data Summarization with “the Right to be Forgotten” 5
Delta Networks for Optimized Recurrent Network Computation 4
Density Level Set Estimation on Manifolds with DBSCAN 1
Depth-Width Tradeoffs in Approximating Natural Functions with Neural Networks 2
Deriving Neural Architectures from Sequence and Graph Kernels 4
Developing Bug-Free Machine Learning Systems With Formal Mathematics 3
Device Placement Optimization with Reinforcement Learning 4
Diameter-Based Active Learning 2
Dictionary Learning Based on Sparse Distribution Tomography 3
Differentiable Programs with Neural Libraries 3
Differentially Private Chi-squared Test by Unit Circle Mechanism 2
Differentially Private Clustering in High-Dimensional Euclidean Spaces 3
Differentially Private Learning of Undirected Graphical Models Using Collective Graphical Models 1
Differentially Private Ordinary Least Squares 3
Differentially Private Submodular Maximization: Data Summarization in Disguise 3
Discovering Discrete Latent Topics with Neural Variational Inference 3
Dissipativity Theory for Nesterov’s Accelerated Method 0
Distributed Batch Gaussian Process Optimization 4
Distributed Mean Estimation with Limited Communication 2
Distributed and Provably Good Seedings for k-Means in Constant Rounds 1
Doubly Accelerated Methods for Faster CCA and Generalized Eigendecomposition 1
Doubly Greedy Primal-Dual Coordinate Descent for Sparse Empirical Risk Minimization 3
Dropout Inference in Bayesian Neural Networks with Alpha-divergences 3
Dual Iterative Hard Thresholding: From Non-convex Sparse Minimization to Non-smooth Concave Maximization 4
Dual Supervised Learning 5
Dueling Bandits with Weak Regret 3
Dynamic Word Embeddings 2
Efficient Distributed Learning with Sparsity 2
Efficient Nonmyopic Active Search 2
Efficient Online Bandit Multiclass Learning with $\tilde{O}(\sqrt{T})$ Regret 3
Efficient Orthogonal Parametrisation of Recurrent Neural Networks Using Householder Reflections 5
Efficient Regret Minimization in Non-Convex Games 1
Efficient softmax approximation for GPUs 4
Emulating the Expert: Inverse Optimization through Online Learning 3
End-to-End Differentiable Adversarial Imitation Learning 3
End-to-End Learning for Structured Prediction Energy Networks 3
Enumerating Distinct Decision Trees 6
Equivariance Through Parameter-Sharing 0
Estimating individual treatment effect: generalization bounds and algorithms 5
Estimating the unseen from multiple populations 4
Evaluating Bayesian Models with Posterior Dispersion Indices 4
Evaluating the Variance of Likelihood-Ratio Gradient Estimators 5
Exact Inference for Integer Latent-Variable Models 2
Exact MAP Inference by Avoiding Fractional Vertices 2
Exploiting Strong Convexity from Data with Primal-Dual First-Order Algorithms 3
Failures of Gradient-Based Deep Learning 2
Fairness in Reinforcement Learning 0
Fake News Mitigation via Point Process Based Intervention 2
Fast Bayesian Intensity Estimation for the Permanental Process 2
Fast k-Nearest Neighbour Search via Prioritized DCI 4
Faster Greedy MAP Inference for Determinantal Point Processes 5
Faster Principal Component Regression and Stable Matrix Chebyshev Approximation 1
FeUdal Networks for Hierarchical Reinforcement Learning 2
Follow the Compressed Leader: Faster Online Learning of Eigenvectors and Faster MMWU 1
Follow the Moving Leader in Deep Learning 3
Forest-type Regression with General Losses and Robust Forest 4
Forward and Reverse Gradient-Based Hyperparameter Optimization 6
Fractional Langevin Monte Carlo: Exploring Levy Driven Stochastic Differential Equations for Markov Chain Monte Carlo 2
Frame-based Data Factorizations 4
From Patches to Images: A Nonparametric Generative Model 3
GSOS: Gauss-Seidel Operator Splitting Algorithm for Multi-Term Nonsmooth Convex Composite Optimization 2
Generalization and Equilibrium in Generative Adversarial Nets (GANs) 3
Geometry of Neural Network Loss Surfaces via Random Matrix Theory 2
Global optimization of Lipschitz functions 5
Globally Induced Forest: A Prepruning Compression Scheme 6
Globally Optimal Gradient Descent for a ConvNet with Gaussian Inputs 0
Gradient Boosted Decision Trees for High Dimensional Sparse Output 4
Gradient Coding: Avoiding Stragglers in Distributed Learning 3
Gradient Projection Iterative Sketch for Large-Scale Constrained Least-Squares 5
Gram-CTC: Automatic Unit Selection and Target Decomposition for Sequence Labelling 3
Grammar Variational Autoencoder 3
Graph-based Isometry Invariant Representation Learning 4
Guarantees for Greedy Maximization of Non-submodular Functions with Applications 5
Hierarchy Through Composition with Multitask LMDPs 3
High Dimensional Bayesian Optimization with Elastic Gaussian Process 4
High-Dimensional Structured Quantile Regression 1
High-Dimensional Variance-Reduced Stochastic Gradient Expectation-Maximization Algorithm 2
High-dimensional Non-Gaussian Single Index Models via Thresholded Score Function Estimation 1
How Close Are the Eigenvectors of the Sample and Actual Covariance Matrices? 1
How to Escape Saddle Points Efficiently 1
Hyperplane Clustering via Dual Principal Component Pursuit 3
Identification and Model Testing in Linear Structural Equation Models using Auxiliary Variables 1
Identify the Nash Equilibrium in Static Games with Random Payoffs 1
Identifying Best Interventions through Online Importance Sampling 3
Image-to-Markup Generation with Coarse-to-Fine Attention 5
Improved Variational Autoencoders for Text Modeling using Dilated Convolutions 3
Improving Gibbs Sampler Scan Quality with DoGS 4
Improving Stochastic Policy Gradients in Continuous Control with Deep Reinforcement Learning using the Beta Distribution 2
Improving Viterbi is Hard: Better Runtimes Imply Faster Clique Algorithms 1
Innovation Pursuit: A New Approach to the Subspace Clustering Problem 3
Input Convex Neural Networks 3
Input Switched Affine Networks: An RNN Architecture Designed for Interpretability 2
Interactive Learning from Policy-Dependent Human Feedback 2
Iterative Machine Teaching 2
Joint Dimensionality Reduction and Metric Learning: A Geometric Take 3
Just Sort It! A Simple and Effective Approach to Active Preference Learning 4
Kernelized Support Tensor Machines 5
Know-Evolve: Deep Temporal Reasoning for Dynamic Knowledge Graphs 2
Language Modeling with Gated Convolutional Networks 3
Large-Scale Evolution of Image Classifiers 4
Latent Feature Lasso 2
Latent Intention Dialogue Models 4
Latent LSTM Allocation: Joint Clustering and Non-Linear Dynamic Modeling of Sequence Data 3
Lazifying Conditional Gradient Algorithms 2
Learned Optimizers that Scale and Generalize 3
Learning Algorithms for Active Learning 4
Learning Continuous Semantic Representations of Symbolic Expressions 5
Learning Deep Architectures via Generalized Whitened Neural Networks 4
Learning Deep Latent Gaussian Models with Markov Chain Monte Carlo 4
Learning Determinantal Point Processes with Moments and Cycles 1
Learning Discrete Representations via Information Maximizing Self-Augmented Training 3
Learning Gradient Descent: Better Generalization and Longer Horizons 4
Learning Hawkes Processes from Short Doubly-Censored Event Sequences 3
Learning Hierarchical Features from Deep Generative Models 2
Learning Important Features Through Propagating Activation Differences 3
Learning Infinite Layer Networks Without the Kernel Trick 1
Learning Latent Space Models with Angular Constraints 4
Learning Sleep Stages from Radio Signals: A Conditional Adversarial Architecture 3
Learning Stable Stochastic Nonlinear Dynamical Systems 3
Learning Texture Manifolds with the Periodic Spatial GAN 4
Learning from Clinical Judgments: Semi-Markov-Modulated Marked Hawkes Processes for Risk Prognosis 2
Learning in POMDPs with Monte Carlo Tree Search 2
Learning the Structure of Generative Models without Labeled Data 4
Learning to Aggregate Ordinal Labels by Maximizing Separating Width 5
Learning to Align the Source Code to the Compiled Object Code 4
Learning to Detect Sepsis with a Multitask Gaussian Process RNN Classifier 5
Learning to Discover Cross-Domain Relations with Generative Adversarial Networks 3
Learning to Discover Sparse Graphical Models 4
Learning to Generate Long-term Future via Hierarchical Prediction 5
Learning to Learn without Gradient Descent by Gradient Descent 2
Leveraging Node Attributes for Incomplete Relational Data 4
Leveraging Union of Subspace Structure to Improve Constrained Clustering 3
Local Bayesian Optimization of Motor Skills 3
Local-to-Global Bayesian Network Structure Learning 3
Logarithmic Time One-Against-Some 4
Lost Relatives of the Gumbel Trick 4
MEC: Memory-efficient Convolution for Deep Neural Network 4
Magnetic Hamiltonian Monte Carlo 2
Max-value Entropy Search for Efficient Bayesian Optimization 6
Maximum Selection and Ranking under Noisy Comparisons 2
McGan: Mean and Covariance Feature Matching GAN 3
Measuring Sample Quality with Kernels 4
Meritocratic Fairness for Cross-Population Selection 2
Meta Networks 5
Minimax Regret Bounds for Reinforcement Learning 1
Minimizing Trust Leaks for Robust Sybil Detection 2
Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks 5
Model-Independent Online Learning for Influence Maximization 4
Modular Multitask Reinforcement Learning with Policy Sketches 3
Multi-Class Optimal Margin Distribution Machine 6
Multi-fidelity Bayesian Optimisation with Continuous Approximations 4
Multi-objective Bandits: Optimizing the Generalized Gini Index 2
Multi-task Learning with Labeled and Unlabeled Tasks 4
Multichannel End-to-end Speech Recognition 3
Multilabel Classification with Group Testing and Codes 2
Multilevel Clustering via Wasserstein Means 5
Multiple Clustering Views from Multiple Uncertain Experts 3
Multiplicative Normalizing Flows for Variational Bayesian Neural Networks 3
Natasha: Faster Non-Convex Stochastic Optimization via Strongly Non-Convex Parameter 1
Near-Optimal Design of Experiments via Regret Minimization 2
Nearly Optimal Robust Matrix Completion 2
Neural Audio Synthesis of Musical Notes with WaveNet Autoencoders 2
Neural Episodic Control 4
Neural Message Passing for Quantum Chemistry 3
Neural Networks and Rational Functions 0
Neural Optimizer Search with Reinforcement Learning 3
Neural Taylor Approximations: Convergence and Exploration in Rectifier Networks 3
No Spurious Local Minima in Nonconvex Low Rank Problems: A Unified Geometric Analysis 0
Nonnegative Matrix Factorization for Time Series Recovery From a Few Temporal Aggregates 4
Nonparanormal Information Estimation 2
Nyström Method with Kernel K-means++ Samples as Landmarks 2
On Approximation Guarantees for Greedy Low Rank Optimization 2
On Calibration of Modern Neural Networks 2
On Context-Dependent Clustering of Bandits 4
On Kernelized Multi-armed Bandits 3
On Mixed Memberships and Symmetric Nonnegative Matrix Factorizations 3
On Relaxing Determinism in Arithmetic Circuits 0
On The Projection Operator to A Three-view Cardinality Constrained Set 2
On orthogonality and learning recurrent networks with long term dependencies 3
On the Expressive Power of Deep Neural Networks 2
On the Iteration Complexity of Support Recovery via Hard Thresholding Pursuit 1
On the Sampling Problem for Kernel Quadrature 2
Online Learning to Rank in Stochastic Click Models 3
Online Learning with Local Permutations and Delayed Feedback 2
Online Partial Least Square Optimization: Dropping Convexity for Better Efficiency and Scalability 2
Online and Linear-Time Attention by Enforcing Monotonic Alignments 5
OptNet: Differentiable Optimization as a Layer in Neural Networks 3
Optimal Algorithms for Smooth and Strongly Convex Distributed Optimization in Networks 2
Optimal Densification for Fast and Accurate Minwise Hashing 4
Optimal and Adaptive Off-policy Evaluation in Contextual Bandits 3
Oracle Complexity of Second-Order Methods for Finite-Sum Problems 0
Ordinal Graphical Models: A Tale of Two Approaches 3
Orthogonalized ALS: A Theoretically Principled Tensor Decomposition Algorithm for Practical Use 4
Pain-Free Random Differential Privacy with Sensitivity Sampling 2
Parallel Multiscale Autoregressive Density Estimation 4
Parallel and Distributed Thompson Sampling for Large-scale Accelerated Exploration of Chemical Space 3
Parseval Networks: Improving Robustness to Adversarial Examples 4
Partitioned Tensor Factorizations for Learning Mixed Membership Models 5
PixelCNN Models with Auxiliary Variables for Natural Image Modeling 3
Post-Inference Prior Swapping 2
Practical Gauss-Newton Optimisation for Deep Learning 4
Prediction and Control with Temporal Segment Models 2
Prediction under Uncertainty in Sparse Spectrum Gaussian Processes with Applications to Filtering and Control 2
Preferential Bayesian Optimization 3
Priv’IT: Private and Sample Efficient Identity Testing 3
Probabilistic Path Hamiltonian Monte Carlo 5
Probabilistic Submodular Maximization in Sub-Linear Time 4
Programming with a Differentiable Forth Interpreter 2
Projection-free Distributed Online Learning in Networks 3
ProtoNN: Compressed and Accurate kNN for Resource-scarce Devices 6
Provable Alternating Gradient Descent for Non-negative Matrix Factorization with Strong Correlations 4
Provably Optimal Algorithms for Generalized Linear Contextual Bandits 1
Prox-PDA: The Proximal Primal-Dual Algorithm for Fast Distributed Nonconvex Optimization and Learning Over Networks 3
Random Feature Expansions for Deep Gaussian Processes 3
Random Fourier Features for Kernel Ridge Regression: Approximation Bounds and Statistical Guarantees 0
Re-revisiting Learning on Hypergraphs: Confidence Interval and Subgradient Method 3
Real-Time Adaptive Image Compression 3
Recovery Guarantees for One-hidden-layer Neural Networks 2
Recurrent Highway Networks 1
Recursive Partitioning for Personalization using Observational Data 4
Reduced Space and Faster Convergence in Imperfect-Information Games via Pruning 2
Regret Minimization in Behaviorally-Constrained Zero-Sum Games 3
Regularising Non-linear Models Using Feature Side-information 3
Reinforcement Learning with Deep Energy-Based Policies 3
Relative Fisher Information and Natural Gradient for Learning Large Modular Models 3
Resource-efficient Machine Learning in 2 KB RAM for the Internet of Things 6
Risk Bounds for Transferring Representations With and Without Fine-Tuning 2
Robust Adversarial Reinforcement Learning 3
Robust Budget Allocation via Continuous Submodular Functions 5
Robust Gaussian Graphical Model Estimation with Arbitrary Corruption 3
Robust Guarantees of Stochastic Greedy Algorithms 3
Robust Probabilistic Modeling with Bayesian Data Reweighting 2
Robust Structured Estimation with Single-Index Models 1
Robust Submodular Maximization: A Non-Uniform Partitioning Approach 3
RobustFill: Neural Program Learning under Noisy I/O 3
Rule-Enhanced Penalized Regression by Column Generation using Rectangular Maximum Agreement 5
SARAH: A Novel Method for Machine Learning Problems Using Stochastic Recursive Gradient 3
SPLICE: Fully Tractable Hierarchical Extension of ICA with Pooling 3
Safety-Aware Algorithms for Adversarial Contextual Bandit 1
Scalable Bayesian Rule Lists 5
Scalable Generative Models for Multi-label Learning with Missing Labels 3
Scalable Multi-Class Gaussian Process Classification using Expectation Propagation 3
Scaling Up Sparse Support Vector Machines by Simultaneous Feature and Sample Reduction 4
Schema Networks: Zero-shot Transfer with a Generative Causal Model of Intuitive Physics 2
Second-Order Kernel Online Convex Optimization with Adaptive Sketching 1
Selective Inference for Sparse High-Order Interaction Models 2
Self-Paced Co-training 4
Semi-Supervised Classification Based on Classification from Positive and Unlabeled Data 4
Sequence Modeling via Segmentations 3
Sequence Tutor: Conservative Fine-Tuning of Sequence Generation Models with KL-control 3
Sequence to Better Sequence: Continuous Revision of Combinatorial Structures 2
Sharp Minima Can Generalize For Deep Nets 0
Simultaneous Learning of Trees and Representations for Extreme Classification and Density Estimation 5
Sketched Ridge Regression: Optimization Perspective, Statistical Perspective, and Model Averaging 1
Sliced Wasserstein Kernel for Persistence Diagrams 5
Soft-DTW: a Differentiable Loss Function for Time-Series 5
Source-Target Similarity Modelings for Multi-Source Transfer Gaussian Process Regression 2
Sparse + Group-Sparse Dirty Models: Statistical Guarantees without Unreasonable Conditions and a Case for Non-Convexity 3
Spectral Learning from a Single Trajectory under Finite-State Policies 1
Spherical Structured Feature Maps for Kernel Approximation 3
SplitNet: Learning to Semantically Split Deep Networks for Parameter Reduction and Model Parallelization 6
Stabilising Experience Replay for Deep Multi-Agent Reinforcement Learning 1
State-Frequency Memory Recurrent Neural Networks 3
Statistical Inference for Incomplete Ranking Data: The Case of Rank-Dependent Coarsening 1
StingyCD: Safely Avoiding Wasteful Updates in Coordinate Descent 3
Stochastic Adaptive Quasi-Newton Methods for Minimizing Expected Values 4
Stochastic Bouncy Particle Sampler 4
Stochastic Convex Optimization: Faster Local Growth Implies Faster Global Convergence 3
Stochastic DCA for the Large-sum of Non-convex Functions Problem and its Application to Group Variable Selection in Classification 5
Stochastic Generative Hashing 5
Stochastic Gradient MCMC Methods for Hidden Markov Models 3
Stochastic Gradient Monomial Gamma Sampler 4
Stochastic Modified Equations and Adaptive Stochastic Gradient Algorithms 3
Stochastic Variance Reduction Methods for Policy Evaluation 3
Strong NP-Hardness for Sparse Optimization with Concave Penalty Functions 0
Strongly-Typed Agents are Guaranteed to Interact Safely 0
Sub-sampled Cubic Regularization for Non-convex Optimization 2
Tensor Balancing on Statistical Manifold 5
Tensor Belief Propagation 5
Tensor Decomposition via Simultaneous Power Iteration 1
Tensor Decomposition with Smoothness 2
Tensor-Train Recurrent Neural Networks for Video Classification 5
The Loss Surface of Deep and Wide Neural Networks 0
The Predictron: End-To-End Learning and Planning 1
The Price of Differential Privacy for Online Learning 1
The Sample Complexity of Online One-Class Collaborative Filtering 2
The Shattered Gradients Problem: If resnets are the answer, then what is the question? 2
The Statistical Recurrent Unit 4
Theoretical Properties for Neural Networks with Weight Matrices of Low Displacement Rank 0
Tight Bounds for Approximate Carathéodory and Beyond 1
Toward Controlled Generation of Text 4
Toward Efficient and Accurate Covariance Matrix Estimation on Compressed Data 4
Towards K-means-friendly Spaces: Simultaneous Deep Learning and Clustering 4
Tunable Efficient Unitary Neural Networks (EUNN) and their application to RNNs 5
Uncertainty Assessment and False Discovery Rate Control in High-Dimensional Granger Causal Inference 1
Uncorrelation and Evenness: a New Diversity-Promoting Regularizer 3
Uncovering Causality from Multivariate Hawkes Integrated Cumulants 4
Understanding Black-box Predictions via Influence Functions 4
Understanding Synthetic Gradients and Decoupled Neural Interfaces 2
Understanding the Representation and Computation of Multilayer Perceptrons: A Case Study in Speech Recognition 3
Uniform Convergence Rates for Kernel Density Estimation 0
Uniform Deviation Bounds for k-Means Clustering 0
Unifying Task Specification in Reinforcement Learning 2
Unimodal Probability Distributions for Deep Ordinal Classification 4
Unsupervised Learning by Predicting Noise 4
Variants of RMSProp and Adagrad with Logarithmic Regret Bounds 3
Variational Boosting: Iteratively Refining Posterior Approximations 4
Variational Dropout Sparsifies Deep Neural Networks 3
Variational Inference for Sparse and Undirected Models 1
Variational Policy for Guiding Point Processes 3
Video Pixel Networks 3
Warped Convolutions: Efficient Invariance to Spatial Transformations 4
Wasserstein Generative Adversarial Networks 3
When can Multi-Site Datasets be Pooled for Regression? Hypothesis Tests, $\ell_2$-consistency and Neuroscience Applications 3
Why is Posterior Sampling Better than Optimism for Reinforcement Learning? 2
World of Bits: An Open-Domain Platform for Web-Based Agents 1
Zero-Inflated Exponential Family Embeddings 3
Zero-Shot Task Generalization with Multi-Task Deep Reinforcement Learning 2
ZipML: Training Linear Models with End-to-End Low Precision, and a Little Bit of Deep Learning 3
Zonotope Hit-and-run for Efficient Sampling from Projection DPPs 3
iSurvive: An Interpretable, Event-time Prediction Model for mHealth 3
meProp: Sparsified Back Propagation for Accelerated Deep Learning with Reduced Overfitting 4
“Convex Until Proven Guilty”: Dimension-Free Acceleration of Gradient Descent on Non-Convex Functions 3