International Conference on Machine Learning (ICML) - 2016

Conference Proceedings:

Key: PC - Pseudocode, OSC - Open Source Code, OSD - Open Datasets, DS - Dataset Splits, HS - Hardware Specification, SD - Software Dependencies, ES - Experiment Setup

A Box-Constrained Approach for Hard Permutation Problems 4
A Comparative Analysis and Study of Multiview CNN Models for Joint Object Categorization and Pose Estimation 2
A Convex Atomic-Norm Approach to Multiple Sequence Alignment and Motif Discovery 2
A Convolutional Attention Network for Extreme Summarization of Source Code 5
A Deep Learning Approach to Unsupervised Ensemble Learning 3
A Distributed Variational Inference Framework for Unifying Parallel Sparse Gaussian Process Regression Models 3
A Kernel Test of Goodness of Fit 2
A Kernelized Stein Discrepancy for Goodness-of-fit Tests 2
A Kronecker-factored approximate Fisher matrix for convolution layers 3
A Neural Autoregressive Approach to Collaborative Filtering 5
A New PAC-Bayesian Perspective on Domain Adaptation 3
A Random Matrix Approach to Echo-State Neural Networks 2
A Self-Correcting Variable-Metric Algorithm for Stochastic Optimization 6
A Simple and Provable Algorithm for Sparse Diagonal CCA 4
A Simple and Strongly-Local Flow-Based Method for Cut Improvement 2
A Subspace Learning Approach for High Dimensional Matrix Decomposition with Efficient Column/Row Sampling 2
A Superlinearly-Convergent Proximal Newton-type Method for the Optimization of Finite Sums 3
A Theory of Generative ConvNet 4
A Variational Analysis of Stochastic Gradient Algorithms 2
A ranking approach to global optimization 2
ADIOS: Architectures Deep In Output Space 5
Accurate Robust and Efficient Error Estimation for Decision Trees 2
Actively Learning Hemimetrics with Applications to Eliciting User Preferences 3
Adaptive Algorithms for Online Convex Optimization with Long-term Constraints 3
Adaptive Sampling for SGD by Exploiting Side Information 4
Additive Approximations in High Dimensional Nonparametric Regression via the SALSA 4
Algorithms for Optimizing the Ratio of Submodular Functions 2
An optimal algorithm for the Thresholding Bandit Problem 2
Analysis of Deep Neural Networks with Extended Data Jacobian Matrix 3
Analysis of Variational Bayesian Factorizations for Sparse and Low-Rank Estimation 1
Anytime Exploration for Multi-armed Bandits using Confidence Information 3
Anytime optimal algorithms in stochastic multi-armed bandits 2
Ask Me Anything: Dynamic Memory Networks for Natural Language Processing 3
Associative Long Short-Term Memory 2
Asymmetric Multi-task Learning Based on Task Relatedness and Loss 3
Asynchronous Methods for Deep Reinforcement Learning 4
Augmenting Supervised Neural Networks with Unsupervised Objectives for Large-scale Image Classification 5
Autoencoding beyond pixels using a learned similarity metric 4
Automatic Construction of Nonparametric Relational Regression Models for Multiple Time Series 2
Auxiliary Deep Generative Models 5
BASC: Applying Bayesian Optimization to the Search for Global Minima on Potential Energy Surfaces 4
BISTRO: An Efficient Relaxation-Based Method for Contextual Bandits 1
Barron and Cover’s Theory in Supervised Learning and its Application to Lasso 1
Bayesian Poisson Tucker Decomposition for Learning the Structure of International Relations 2
Benchmarking Deep Reinforcement Learning for Continuous Control 3
Beyond CCA: Moment Matching for Multi-View Models 3
Beyond Parity Constraints: Fourier Analysis of Hash Functions for Inference 2
Bidirectional Helmholtz Machines 5
Binary embeddings with structured hashed projections 2
Black-Box Alpha Divergence Minimization 4
Black-box Optimization with a Politician 3
Boolean Matrix Factorization and Noisy Completion via Message Passing 4
Bounded Off-Policy Evaluation with Missing Data for Course Recommendation and Curriculum Design 2
Clustering High Dimensional Categorical Data via Topographical Features 3
Collapsed Variational Inference for Sum-Product Networks 5
Community Recovery in Graphs with Locality 4
Complex Embeddings for Simple Link Prediction 3
Compressive Spectral Clustering 6
Computationally Efficient Nyström Approximation using Fast Transforms 3
Conditional Bernoulli Mixtures for Multi-label Classification 5
Conditional Dependence via Shannon Capacity: Axioms, Estimators and Applications 2
Conservative Bandits 2
Contextual Combinatorial Cascading Bandits 3
Continuous Deep Q-Learning with Model-based Acceleration 2
Control of Memory, Active Perception, and Action in Minecraft 1
Controlling the distance to a Kemeny consensus without computing it 2
Convergence of Stochastic Gradient Descent for PCA 1
Convolutional Rectifier Networks as Generalized Tensor Decompositions 0
Copeland Dueling Bandit Problem: Regret Lower Bound, Optimal Algorithm, and Computationally Efficient Algorithm 3
Correcting Forecasts with Multifactor Neural Attention 2
Correlation Clustering and Biclustering with Locally Bounded Errors 1
Cross-Graph Learning of Multi-Relational Associations 4
CryptoNets: Applying Neural Networks to Encrypted Data with High Throughput and Accuracy 3
Cumulative Prospect Theory Meets Reinforcement Learning: Prediction and Control 3
DCM Bandits: Learning to Rank with Multiple Clicks 2
DR-ABC: Approximate Bayesian Computation with Kernel-Based Distribution Regression 5
Data-Efficient Off-Policy Policy Evaluation for Reinforcement Learning 2
Data-driven Rank Breaking for Efficient Rank Aggregation 2
Dealbreaker: A Nonlinear Latent Variable Model for Educational Data 3
Deconstructing the Ladder Network Architecture 2
Deep Gaussian Processes for Regression using Approximate Expectation Propagation 3
Deep Speech 2: End-to-End Speech Recognition in English and Mandarin 3
Deep Structured Energy Based Models for Anomaly Detection 2
Dictionary Learning for Massive Matrix Factorization 6
Differential Geometric Regularization for Supervised Learning of Classifiers 4
Differentially Private Chi-Squared Hypothesis Testing: Goodness of Fit and Independence Testing 2
Differentially Private Policy Evaluation 2
Dirichlet Process Mixture Model for Correcting Technical Variation in Single-Cell Gene Expression Data 2
Discrete Deep Feature Extraction: A Theory and New Architectures 4
Discrete Distribution Estimation under Local Privacy 1
Discriminative Embeddings of Latent Variable Models for Structured Data 5
Distributed Clustering of Linear Bandits in Peer to Peer Networks 1
Diversity-Promoting Bayesian Learning of Latent Variable Models 3
Domain Adaptation with Conditional Transferable Components 3
Doubly Decomposing Nonparametric Tensor Regression 3
Doubly Robust Off-policy Value Evaluation for Reinforcement Learning 3
Dropout as a Bayesian Approximation: Representing Model Uncertainty in Deep Learning 4
Dropout distillation 3
Dueling Network Architectures for Deep Reinforcement Learning 3
Dynamic Capacity Networks 4
Dynamic Memory Networks for Visual and Textual Question Answering 4
Early and Reliable Event Detection Using Proximity Space Representation 6
Efficient Algorithms for Adversarial Contextual Learning 1
Efficient Algorithms for Large-scale Generalized Eigenvector Computation and Canonical Correlation Analysis 3
Efficient Learning with a Family of Nonconvex Regularizers by Redistributing Nonconvexity 5
Efficient Multi-Instance Learning for Activity Recognition from Time Series Data Using an Auto-Regressive Hidden Markov Model 3
Efficient Private Empirical Risk Minimization for High-dimensional Learning 1
Energetic Natural Gradient Descent 2
Ensuring Rapid Mixing and Low Bias for Asynchronous Gibbs Sampling 4
Epigraph projections for fast general convex programming 3
Estimating Accuracy from Unlabeled Data: A Bayesian Approach 4
Estimating Cosmological Parameters from the Dark Matter Distribution 3
Estimating Maximum Expected Value through Gaussian Approximation 2
Estimating Structured Vector Autoregressive Models 3
Estimation from Indirect Supervision with Linear Moments 2
Evasion and Hardening of Tree Ensemble Classifiers 5
Even Faster Accelerated Coordinate Descent Using Non-Uniform Sampling 2
Exact Exponent in Optimal Rates for Crowdsourcing 0
Experimental Design on a Budget for Sparse Linear Models and Applications 5
Exploiting Cyclic Symmetry in Convolutional Neural Networks 4
Expressiveness of Rectifier Networks 2
Extended and Unscented Kitchen Sinks 3
Extreme F-measure Maximization using Sparse Probability Estimates 5
Factored Temporal Sigmoid Belief Networks for Sequence Learning 2
False Discovery Rate Control and Statistical Quality Assessment of Annotators in Crowdsourced Ranking 4
Fast Algorithms for Segmented Regression 4
Fast Constrained Submodular Maximization: Personalized Data Summarization 3
Fast DPP Sampling for Nystrom with Application to Kernel Methods 4
Fast Parameter Inference in Nonlinear Dynamical Systems using Iterative Gradient Matching 3
Fast Rate Analysis of Some Stochastic Optimization Algorithms 1
Fast Stochastic Algorithms for SVD and PCA: Convergence Properties and Convexity 1
Fast k-Nearest Neighbour Search via Dynamic Continuous Indexing 3
Fast k-means with accurate bounds 3
Fast methods for estimating the Numerical rank of large matrices 4
Faster Convex Optimization: Simulated Annealing with an Efficient Universal Barrier 1
Faster Eigenvector Computation via Shift-and-Invert Preconditioning 0
Fixed Point Quantization of Deep Convolutional Networks 2
ForecastICU: A Prognostic Decision Support System for Timely Prediction of Intensive Care Unit Admission 2
From Softmax to Sparsemax: A Sparse Model of Attention and Multi-Label Classification 4
Gaussian process nonparametric tensor estimator and its minimax optimality 2
Gaussian quadrature for matrix inverse forms with applications 3
Generalization Properties and Implicit Regularization for Multiple Passes SGM 6
Generalization and Exploration via Randomized Value Functions 2
Generalized Direct Change Estimation in Ising Model Structure 2
Generative Adversarial Text to Image Synthesis 4
Geometric Mean Metric Learning 6
Gossip Dual Averaging for Decentralized Optimization of Pairwise Functions 3
Graying the black box: Understanding DQNs 2
Greedy Column Subset Selection: New Bounds and Distributed Algorithms 3
Gromov-Wasserstein Averaging of Kernel and Distance Matrices 5
Group Equivariant Convolutional Networks 3
Guided Cost Learning: Deep Inverse Optimal Control via Policy Optimization 2
Hawkes Processes with Stochastic Excitations 1
Heteroscedastic Sequences: Beyond Gaussianity 2
Hierarchical Compound Poisson Factorization 4
Hierarchical Decision Making In Electricity Grid Management 4
Hierarchical Span-Based Conditional Random Fields for Labeling and Segmenting Events in Wearable Sensor Data Streams 4
Hierarchical Variational Models 4
Horizontally Scalable Submodular Maximization 3
How to Fake Multiply by a Gaussian Matrix 5
Hyperparameter optimization with approximate gradient 5
Importance Sampling Tree for Large-scale Empirical Expectation 3
Improved SVRG for Non-Strongly-Convex or Sum-of-Non-Convex Objectives 3
Inference Networks for Sequential Monte Carlo in Graphical Models 1
Interacting Particle Markov Chain Monte Carlo 3
Interactive Bayesian Hierarchical Clustering 2
Isotonic Hawkes Processes 2
K-Means Clustering with Distributed Dimensions 3
L1-regularized Neural Networks are Improperly Learnable in Polynomial Time 4
Large-Margin Softmax Loss for Convolutional Neural Networks 3
Learning Convolutional Neural Networks for Graphs 5
Learning End-to-end Video Classification with Rank-Pooling 4
Learning Granger Causality for Hawkes Processes 3
Learning Mixtures of Plackett-Luce Models 4
Learning Physical Intuition of Block Towers by Example 3
Learning Population-Level Diffusions with Generative RNNs 3
Learning Representations for Counterfactual Inference 4
Learning Simple Algorithms from Examples 1
Learning Sparse Combinatorial Representations via Two-stage Submodular Maximization 3
Learning and Inference via Maximum Inner Product Search 3
Learning from Multiway Data: Simple and Efficient Tensor Regression 4
Learning privately from multiparty data 3
Learning to Filter with Predictive State Inference Machines 3
Learning to Generate with Memory 4
Linking losses for density ratio and class-probability estimation 4
Loss factorization, weakly supervised learning and label noise robustness 3
Low-Rank Matrix Approximation with Stability 4
Low-rank Solutions of Linear Matrix Equations via Procrustes Flow 1
Low-rank tensor completion: a Riemannian manifold preconditioning approach 5
Markov Latent Feature Models 4
Markov-modulated Marked Poisson Processes for Check-in Data 2
Matrix Eigen-decomposition via Doubly Stochastic Riemannian Optimization 3
Meta-Learning with Memory-Augmented Neural Networks 2
Metadata-conscious anonymous messaging 3
Meta–Gradient Boosted Decision Tree Model for Weight and Target Learning 3
Minding the Gaps for Block Frank-Wolfe Optimization of Structured SVMs 4
Minimizing the Maximal Loss: How and Why 2
Minimum Regret Search for Single- and Multi-Task Optimization 2
Mixing Rates for the Alternating Gibbs Sampler over Restricted Boltzmann Machines and Friends 0
Mixture Proportion Estimation via Kernel Embeddings of Distributions 4
Model-Free Imitation Learning with Policy Optimization 2
Model-Free Trajectory Optimization for Reinforcement Learning 2
Multi-Bias Non-linear Activation in Deep Neural Networks 4
Multi-Player Bandits – a Musical Chairs Approach 2
Near Optimal Behavior via Approximate State Abstraction 2
Network Morphism 4
Neural Variational Inference for Text Processing 3
No Oops, You Won’t Do It Again: Mechanisms for Self-correction in Crowdsourcing 4
No penalty no tears: Least squares in high-dimensional linear models 3
No-Regret Algorithms for Heavy-Tailed Linear Bandits 2
Noisy Activation Functions 4
Non-negative Matrix Factorization under Heavy Noise 4
Nonlinear Statistical Learning with Truncated Gaussian Graphical Models 4
Nonparametric Canonical Correlation Analysis 5
Normalization Propagation: A Parametric Technique for Removing Internal Covariate Shift in Deep Networks 4
On Graduated Optimization for Stochastic Non-Convex Problems 3
On collapsed representation of hierarchical Completely Random Measures 2
On the Analysis of Complex Backup Strategies in Monte Carlo Tree Search 3
On the Consistency of Feature Selection With Lasso for Non-linear Targets 1
On the Iteration Complexity of Oblivious First-Order Optimization Algorithms 1
On the Power and Limits of Distance-Based Learning 0
On the Quality of the Initial Basin in Overspecified Neural Networks 0
On the Statistical Limits of Convex Relaxations 0
One-Shot Generalization in Deep Generative Models 2
Online Learning with Feedback Graphs Without the Graphs 1
Online Low-Rank Subspace Clustering by Basis Dictionary Pursuit 3
Online Stochastic Linear Optimization under One-bit Feedback 2
Opponent Modeling in Deep Reinforcement Learning 4
Optimal Classification with Multivariate Losses 2
Optimality of Belief Propagation for Crowdsourced Classification 2
PAC Lower Bounds and Efficient Algorithms for The Max K-Armed Bandit Problem 1
PAC learning of Probabilistic Automaton based on the Method of Moments 4
PD-Sparse: A Primal and Dual Sparse Approach to Extreme Multiclass and Multilabel Classification 3
PHOG: Probabilistic Model for Code 3
Parallel and Distributed Block-Coordinate Frank-Wolfe Algorithms 4
Parameter Estimation for Generalized Thurstone Choice Models 1
Pareto Frontier Learning with Expensive Correlated Objectives 2
Partition Functions from Rao-Blackwellized Tempered Sampling 3
Persistence weighted Gaussian kernel for topological data analysis 2
Persistent RNNs: Stashing Recurrent Weights On-Chip 3
Pixel Recurrent Neural Networks 3
Pliable Rejection Sampling 3
Polynomial Networks and Factorization Machines: New Insights and Efficient Training Algorithms 4
Power of Ordered Hypothesis Testing 2
Preconditioning Kernel Matrices 6
Predictive Entropy Search for Multi-objective Bayesian Optimization 3
Pricing a Low-regret Seller 2
Primal-Dual Rates and Certificates 3
Principal Component Projection Without Principal Component Analysis 3
Provable Algorithms for Inference in Topic Models 3
Provable Non-convex Phase Retrieval with Outliers: Median Truncated Wirtinger Flow 2
Quadratic Optimization with Orthogonality Constraints: Explicit Lojasiewicz Exponent and Linear Convergence of Line-Search Methods 3
Recommendations as Treatments: Debiasing Learning and Evaluation 4
Recovery guarantee of weighted low-rank approximation via alternating minimization 1
Recurrent Orthogonal Networks and Long-Memory Tasks 2
Recycling Randomness with Structure for Sublinear time Kernel Expansions 4
Representational Similarity Learning with Application to Brain Networks 2
Revisiting Semi-Supervised Learning with Graph Embeddings 3
Rich Component Analysis 3
Robust Monte Carlo Sampling using Riemannian Nosé-Poincaré Hamiltonian Dynamics 4
Robust Principal Component Analysis with Side Information 3
Robust Random Cut Forest Based Anomaly Detection on Streams 4
SDCA without Duality, Regularization, and Individual Convexity 1
SDNA: Stochastic Dual Newton Ascent for Empirical Risk Minimization 3
Scalable Discrete Sampling as a Multi-Armed Bandit Problem 2
Scalable Gradient-Based Tuning of Continuous Regularization Hyperparameters 4
Sequence to Sequence Training of CTC-RNNs with Partial Windowing 5
Shifting Regret, Mirror Descent, and Matrices 1
Simultaneous Safe Screening of Features and Samples in Doubly Sparse Modeling 4
Slice Sampling on Hamiltonian Trajectories 3
Smooth Imitation Learning for Online Sequence Prediction 3
Softened Approximate Policy Iteration for Markov Games 3
Solving Ridge Regression using Sketched Preconditioned SVRG 3
Sparse Nonlinear Regression: Parameter Estimation under Nonconvexity 4
Sparse Parameter Recovery from Aggregated Data 2
Speeding up k-means by approximating Euclidean distances via block vectors 3
Square Root Graphical Models: Multivariate Generalizations of Univariate Exponential Families that Permit Positive Dependencies 3
Stability of Controllers for Gaussian Process Forward Models 2
Starting Small - Learning with Adaptive Sample Sizes 4
Stochastic Block BFGS: Squeezing More Curvature out of Data 4
Stochastic Discrete Clenshaw-Curtis Quadrature 4
Stochastic Optimization for Multiview Representation Learning using Partial Least Squares 4
Stochastic Quasi-Newton Langevin Monte Carlo 5
Stochastic Variance Reduced Optimization for Nonconvex Sparse Learning 3
Stochastic Variance Reduction for Nonconvex Optimization 3
Stochastically Transitive Models for Pairwise Comparisons: Statistical and Computational Issues 0
Stratified Sampling Meets Machine Learning 3
Strongly-Typed Recurrent Neural Networks 4
Structure Learning of Partitioned Markov Networks 1
Structured Prediction Energy Networks 3
Structured and Efficient Variational Deep Learning with Matrix Gaussian Posteriors 3
Supervised and Semi-Supervised Text Categorization using LSTM for Region Embeddings 5
Tensor Decomposition via Joint Matrix Schur Decomposition 1
Texture Networks: Feed-forward Synthesis of Textures and Stylized Images 2
The Arrow of Time in Multivariate Time Series 4
The Information Sieve 3
The Information-Theoretic Requirements of Subspace Clustering with Missing Data 1
The Knowledge Gradient for Sequential Decision Making with Stochastic Binary Feedbacks 3
The Label Complexity of Mixed-Initiative Classifier Training 2
The Segmented iHMM: A Simple, Efficient Hierarchical Infinite HMM 3
The Sum-Product Theorem: A Foundation for Learning Tractable Models 2
The Teaching Dimension of Linear Learners 2
The Variational Nystrom method for large-scale spectral problems 2
The knockoff filter for FDR control in group-sparse and multitask regression 3
Towards Faster Rates and Oracle Property for Low-Rank Matrix Estimation 3
Tracking Slowly Moving Clairvoyant: Optimal Dynamic Regret of Online Learning with True and Noisy Gradient 1
Train and Test Tightness of LP Relaxations in Structured Prediction 2
Train faster, generalize better: Stability of stochastic gradient descent 3
Training Deep Neural Networks via Direct Loss Minimization 4
Training Neural Networks Without Gradients: A Scalable ADMM Approach 5
Truthful Univariate Estimators 0
Understanding and Improving Convolutional Neural Networks via Concatenated Rectified Linear Units 4
Unitary Evolution Recurrent Neural Networks 3
Unsupervised Deep Embedding for Clustering Analysis 3
Uprooting and Rerooting Graphical Models 2
Variable Elimination in the Fourier Domain 2
Variance Reduction for Faster Non-Convex Optimization 4
Variance-Reduced and Projection-Free Stochastic Optimization 3
Variational Inference for Monte Carlo Objectives 3
Why Most Decisions Are Easy in Tetris—And Perhaps in Other Sequential Decision Problems, As Well 1
Why Regularized Auto-Encoders learn Sparse Representation? 2
k-variates++: more pluses in the k-means++ 2