| Title | 1 | 2 | 3 | 4 | 5 | 6 | 7 | Total |
|---|---|---|---|---|---|---|---|---|
| A Baseline for Detecting Misclassified and Out-of-Distribution Examples in Neural Networks | ❌ | ✅ | ✅ | ✅ | ✅ | ❌ | ✅ | 5 |
| A Compare-Aggregate Model for Matching Text Sequences | ❌ | ✅ | ✅ | ✅ | ❌ | ❌ | ✅ | 4 |
| A Compositional Object-Based Approach to Learning Physical Dynamics | ❌ | ❌ | ❌ | ✅ | ❌ | ❌ | ✅ | 2 |
| A Differentiable Physics Engine for Deep Learning in Robotics | ❌ | ❌ | ❌ | ❌ | ✅ | ❌ | ✅ | 2 |
| A Learned Representation For Artistic Style | ❌ | ✅ | ✅ | ❌ | ❌ | ❌ | ✅ | 3 |
| A Structured Self-attentive Sentence Embedding | ❌ | ❌ | ✅ | ✅ | ❌ | ❌ | ✅ | 3 |
| A Simple but Tough-to-Beat Baseline for Sentence Embeddings | ✅ | ✅ | ✅ | ✅ | ❌ | ❌ | ✅ | 5 |
| A Theoretical Framework for Robustness of (Deep) Classifiers against Adversarial Samples | ❌ | ❌ | ✅ | ✅ | ❌ | ❌ | ❌ | 2 |
| A recurrent neural network without chaos | ❌ | ❌ | ✅ | ✅ | ❌ | ❌ | ✅ | 3 |
| Adaptive Feature Abstraction for Translating Video to Language | ❌ | ❌ | ✅ | ✅ | ❌ | ❌ | ❌ | 2 |
| Adversarial Feature Learning | ❌ | ❌ | ✅ | ❌ | ✅ | ❌ | ✅ | 3 |
| Adversarial Machine Learning at Scale | ✅ | ❌ | ✅ | ✅ | ❌ | ❌ | ✅ | 4 |
| Adversarial Training Methods for Semi-Supervised Text Classification | ❌ | ✅ | ✅ | ✅ | ❌ | ❌ | ✅ | 4 |
| Adversarial examples in the physical world | ❌ | ❌ | ✅ | ✅ | ❌ | ❌ | ✅ | 3 |
| Adversarially Learned Inference | ✅ | ✅ | ✅ | ✅ | ❌ | ✅ | ✅ | 6 |
| Amortised MAP Inference for Image Super-resolution | ❌ | ❌ | ✅ | ✅ | ❌ | ❌ | ✅ | 3 |
| An Actor-Critic Algorithm for Sequence Prediction | ✅ | ✅ | ✅ | ✅ | ❌ | ✅ | ✅ | 6 |
| An Information-Theoretic Framework for Fast and Robust Unsupervised Learning via Neural Population Infomax | ✅ | ❌ | ✅ | ❌ | ✅ | ✅ | ✅ | 5 |
| Attend, Adapt and Transfer: Attentive Deep Architecture for Adaptive Transfer from multiple sources in the same domain | ❌ | ❌ | ✅ | ❌ | ❌ | ❌ | ✅ | 2 |
| Autoencoding Variational Inference For Topic Models | ✅ | ✅ | ✅ | ❌ | ❌ | ❌ | ✅ | 4 |
| Automated Generation of Multilingual Clusters for the Evaluation of Distributed Representations | ❌ | ❌ | ✅ | ❌ | ❌ | ❌ | ❌ | 1 |
| Automatic Rule Extraction from Long Short Term Memory Networks | ❌ | ❌ | ✅ | ✅ | ❌ | ❌ | ✅ | 3 |
| Batch Policy Gradient Methods for Improving Neural Conversation Models | ✅ | ❌ | ✅ | ❌ | ❌ | ❌ | ✅ | 3 |
| Bidirectional Attention Flow for Machine Comprehension | ❌ | ✅ | ✅ | ✅ | ✅ | ❌ | ✅ | 5 |
| Bit-Pragmatic Deep Neural Network Computing | ❌ | ❌ | ✅ | ❌ | ❌ | ❌ | ✅ | 2 |
| Calibrating Energy-based Generative Adversarial Networks | ❌ | ✅ | ✅ | ❌ | ❌ | ❌ | ✅ | 3 |
| Capacity and Trainability in Recurrent Neural Networks | ❌ | ❌ | ✅ | ✅ | ❌ | ❌ | ✅ | 3 |
| Categorical Reparameterization with Gumbel-Softmax | ❌ | ❌ | ✅ | ❌ | ✅ | ❌ | ✅ | 3 |
| Central Moment Discrepancy (CMD) for Domain-Invariant Representation Learning | ❌ | ✅ | ✅ | ❌ | ❌ | ❌ | ✅ | 3 |
| Charged Point Normalization: An Efficient Solution to the Saddle Point Problem | ❌ | ❌ | ✅ | ❌ | ✅ | ❌ | ✅ | 3 |
| Combining policy gradient and Q-learning | ❌ | ❌ | ✅ | ❌ | ❌ | ❌ | ✅ | 2 |
| Compositional Kernel Machines | ❌ | ❌ | ✅ | ✅ | ❌ | ❌ | ✅ | 3 |
| DSD: Dense-Sparse-Dense Training for Deep Neural Networks | ✅ | ✅ | ✅ | ✅ | ❌ | ❌ | ✅ | 5 |
| Data Noising as Smoothing in Neural Network Language Models | ✅ | ✅ | ✅ | ✅ | ❌ | ❌ | ✅ | 5 |
| Dataset Augmentation in Feature Space | ❌ | ❌ | ✅ | ✅ | ❌ | ❌ | ✅ | 3 |
| Decomposing Motion and Content for Natural Video Sequence Prediction | ❌ | ❌ | ✅ | ❌ | ✅ | ❌ | ✅ | 3 |
| Deep Biaffine Attention for Neural Dependency Parsing | ❌ | ❌ | ✅ | ✅ | ❌ | ❌ | ✅ | 3 |
| Deep Information Propagation | ❌ | ❌ | ✅ | ❌ | ❌ | ❌ | ✅ | 2 |
| Deep Learning with Dynamic Computation Graphs | ❌ | ✅ | ✅ | ✅ | ✅ | ❌ | ✅ | 5 |
| Deep Learning with Sets and Point Clouds | ❌ | ❌ | ✅ | ✅ | ❌ | ❌ | ✅ | 3 |
| Deep Multi-task Representation Learning: A Tensor Factorisation Approach | ❌ | ✅ | ✅ | ✅ | ❌ | ❌ | ✅ | 4 |
| Deep Predictive Coding Networks for Video Prediction and Unsupervised Learning | ✅ | ✅ | ✅ | ✅ | ❌ | ❌ | ✅ | 5 |
| Deep Probabilistic Programming | ✅ | ✅ | ✅ | ❌ | ✅ | ❌ | ✅ | 5 |
| Deep Variational Bayes Filters: Unsupervised Learning of State Space Models from Raw Data | ❌ | ✅ | ✅ | ✅ | ❌ | ❌ | ✅ | 4 |
| Deep Variational Information Bottleneck | ❌ | ❌ | ✅ | ✅ | ❌ | ❌ | ✅ | 3 |
| DeepCoder: Learning to Write Programs | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | ✅ | 1 |
| DeepDSL: A Compilation-based Domain-Specific Language for Deep Learning | ❌ | ✅ | ✅ | ❌ | ✅ | ✅ | ✅ | 5 |
| Delving into Transferable Adversarial Examples and Black-box Attacks | ❌ | ❌ | ✅ | ❌ | ❌ | ❌ | ✅ | 2 |
| Density estimation using Real NVP | ❌ | ❌ | ✅ | ✅ | ❌ | ❌ | ✅ | 3 |
| Designing Neural Network Architectures using Reinforcement Learning | ✅ | ✅ | ✅ | ✅ | ❌ | ❌ | ✅ | 5 |
| Development of JavaScript-based deep learning platform and application to distributed training | ✅ | ✅ | ✅ | ❌ | ✅ | ✅ | ✅ | 6 |
| Dialogue Learning With Human-in-the-Loop | ❌ | ✅ | ✅ | ✅ | ❌ | ❌ | ✅ | 4 |
| Diet Networks: Thin Parameters for Fat Genomics | ❌ | ✅ | ✅ | ✅ | ❌ | ✅ | ✅ | 5 |
| Discovering objects and their relations from entangled scene representations | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | ✅ | 1 |
| Discrete Variational Autoencoders | ❌ | ❌ | ✅ | ❌ | ❌ | ❌ | ✅ | 2 |
| Distributed Second-Order Optimization using Kronecker-Factored Approximations | ❌ | ❌ | ✅ | ❌ | ✅ | ❌ | ✅ | 3 |
| Do Deep Convolutional Nets Really Need to be Deep and Convolutional? | ❌ | ❌ | ✅ | ✅ | ❌ | ❌ | ✅ | 3 |
| Dropout with Expectation-linear Regularization | ❌ | ❌ | ✅ | ✅ | ❌ | ❌ | ✅ | 3 |
| Dynamic Coattention Networks For Question Answering | ❌ | ❌ | ✅ | ✅ | ❌ | ❌ | ✅ | 3 |
| EPOpt: Learning Robust Neural Network Policies Using Model Ensembles | ✅ | ✅ | ✅ | ❌ | ❌ | ❌ | ✅ | 4 |
| Efficient Representation of Low-Dimensional Manifolds using Deep Networks | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | 0 |
| Efficient Softmax Approximation for GPUs | ❌ | ✅ | ✅ | ❌ | ❌ | ❌ | ✅ | 3 |
| Efficient Vector Representation for Documents through Corruption | ❌ | ✅ | ✅ | ✅ | ✅ | ❌ | ✅ | 5 |
| Emergence of foveal image sampling from learning to attend in visual scenes | ❌ | ❌ | ✅ | ❌ | ✅ | ❌ | ✅ | 3 |
| End-to-end Optimized Image Compression | ❌ | ❌ | ✅ | ❌ | ❌ | ❌ | ✅ | 2 |
| Energy-based Generative Adversarial Networks | ❌ | ❌ | ✅ | ❌ | ❌ | ❌ | ✅ | 2 |
| Entropy-SGD: Biasing Gradient Descent Into Wide Valleys | ✅ | ❌ | ✅ | ✅ | ❌ | ❌ | ✅ | 4 |
| Episodic Exploration for Deep Deterministic Policies for StarCraft Micromanagement | ✅ | ❌ | ❌ | ❌ | ❌ | ❌ | ✅ | 2 |
| Exploring Sparsity in Recurrent Neural Networks | ✅ | ❌ | ❌ | ✅ | ✅ | ✅ | ✅ | 5 |
| Exponential Machines | ✅ | ✅ | ✅ | ✅ | ✅ | ❌ | ✅ | 6 |
| Extrapolation and learning equations | ❌ | ❌ | ✅ | ✅ | ❌ | ❌ | ✅ | 3 |
| Filter Shaping for Convolutional Neural Networks | ❌ | ❌ | ✅ | ✅ | ❌ | ❌ | ✅ | 3 |
| Fast Chirplet Transform to Enhance CNN Machine Listening - Validation on Animal calls and Speech | ✅ | ✅ | ✅ | ✅ | ❌ | ❌ | ✅ | 5 |
| Faster CNNs with Direct Sparse Convolutions and Guided Pruning | ✅ | ✅ | ✅ | ❌ | ✅ | ✅ | ✅ | 6 |
| Fine-grained Analysis of Sentence Embeddings Using Auxiliary Prediction Tasks | ❌ | ❌ | ❌ | ✅ | ✅ | ❌ | ✅ | 3 |
| FractalNet: Ultra-Deep Neural Networks without Residuals | ❌ | ❌ | ✅ | ✅ | ❌ | ❌ | ✅ | 3 |
| Frustratingly Short Attention Spans in Neural Language Modeling | ❌ | ❌ | ✅ | ✅ | ❌ | ❌ | ✅ | 3 |
| Gated Multimodal Units for Information Fusion | ❌ | ✅ | ✅ | ✅ | ✅ | ❌ | ✅ | 5 |
| Generalizable Features From Unsupervised Learning | ❌ | ❌ | ❌ | ✅ | ❌ | ❌ | ✅ | 2 |
| Generalizing Skills with Semi-Supervised Reinforcement Learning | ✅ | ✅ | ❌ | ❌ | ❌ | ❌ | ✅ | 3 |
| Generating Interpretable Images with Controllable Structure | ❌ | ❌ | ✅ | ✅ | ❌ | ❌ | ✅ | 3 |
| Generative Models and Model Criticism via Optimized Maximum Mean Discrepancy | ❌ | ✅ | ✅ | ❌ | ❌ | ✅ | ✅ | 4 |
| Generative Multi-Adversarial Networks | ❌ | ✅ | ✅ | ❌ | ✅ | ❌ | ✅ | 4 |
| Geometry of Polysemy | ✅ | ❌ | ✅ | ❌ | ❌ | ❌ | ✅ | 3 |
| Hadamard Product for Low-rank Bilinear Pooling | ❌ | ✅ | ✅ | ✅ | ❌ | ❌ | ✅ | 4 |
| Hierarchical Multiscale Recurrent Neural Networks | ❌ | ❌ | ✅ | ✅ | ❌ | ❌ | ✅ | 3 |
| Highway and Residual Networks learn Unrolled Iterative Estimation | ❌ | ❌ | ✅ | ✅ | ✅ | ❌ | ✅ | 4 |
| HolStep: A Machine Learning Dataset for Higher-order Logic Theorem Proving | ❌ | ✅ | ✅ | ❌ | ✅ | ❌ | ❌ | 3 |
| HyperNetworks | ❌ | ❌ | ✅ | ✅ | ❌ | ❌ | ✅ | 3 |
| Hyperband: Bandit-Based Configuration Evaluation for Hyperparameter Optimization | ✅ | ❌ | ✅ | ✅ | ✅ | ❌ | ✅ | 5 |
| Identity Matters in Deep Learning | ❌ | ❌ | ✅ | ❌ | ✅ | ❌ | ✅ | 3 |
| Improving Generative Adversarial Networks with Denoising Feature Matching | ❌ | ❌ | ✅ | ✅ | ❌ | ✅ | ✅ | 4 |
| Improving Neural Language Models with a Continuous Cache | ❌ | ❌ | ✅ | ✅ | ❌ | ❌ | ✅ | 3 |
| Improving Policy Gradient by Exploring Under-appreciated Rewards | ❌ | ❌ | ✅ | ❌ | ❌ | ❌ | ✅ | 2 |
| Incorporating long-range consistency in CNN-based texture generation | ❌ | ✅ | ❌ | ❌ | ❌ | ❌ | ✅ | 2 |
| Incremental Network Quantization: Towards Lossless CNNs with Low-precision Weights | ✅ | ❌ | ✅ | ✅ | ❌ | ❌ | ✅ | 4 |
| Inductive Bias of Deep Convolutional Networks through Pooling Geometry | ❌ | ✅ | ❌ | ✅ | ❌ | ❌ | ✅ | 3 |
| Introspection: Accelerating Neural Network Training By Learning Weight Evolution | ❌ | ❌ | ✅ | ❌ | ❌ | ❌ | ✅ | 2 |
| LR-GAN: Layered Recursive Generative Adversarial Networks for Image Generation | ✅ | ❌ | ✅ | ❌ | ❌ | ❌ | ✅ | 3 |
| Latent Sequence Decompositions | ❌ | ❌ | ✅ | ✅ | ❌ | ❌ | ✅ | 3 |
| Learning Continuous Semantic Representations of Symbolic Expressions | ❌ | ✅ | ✅ | ✅ | ❌ | ❌ | ✅ | 4 |
| Learning Curve Prediction with Bayesian Neural Networks | ❌ | ❌ | ✅ | ✅ | ❌ | ❌ | ✅ | 3 |
| Learning End-to-End Goal-Oriented Dialog | ❌ | ❌ | ✅ | ✅ | ❌ | ❌ | ✅ | 3 |
| Learning Features of Music From Scratch | ❌ | ❌ | ✅ | ❌ | ❌ | ❌ | ✅ | 2 |
| Learning Graphical State Transitions | ✅ | ✅ | ✅ | ❌ | ❌ | ❌ | ✅ | 4 |
| Learning Invariant Feature Spaces to Transfer Skills with Reinforcement Learning | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | ✅ | 1 |
| Learning Invariant Representations Of Planar Curves | ❌ | ❌ | ✅ | ✅ | ❌ | ❌ | ✅ | 3 |
| Learning Recurrent Representations for Hierarchical Behavior Modeling | ❌ | ✅ | ✅ | ❌ | ❌ | ❌ | ✅ | 3 |
| Learning Visual Servoing with Deep Features and Fitted Q-Iteration | ✅ | ✅ | ✅ | ✅ | ❌ | ❌ | ✅ | 5 |
| Learning a Natural Language Interface with Neural Programmer | ❌ | ✅ | ✅ | ✅ | ✅ | ❌ | ✅ | 5 |
| Learning and Policy Search in Stochastic Dynamical Systems with Bayesian Neural Networks | ✅ | ✅ | ✅ | ❌ | ❌ | ❌ | ✅ | 4 |
| Learning in Implicit Generative Models | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | 0 |
| Learning through Dialogue Interactions by Asking Questions | ❌ | ✅ | ✅ | ✅ | ❌ | ❌ | ❌ | 3 |
| Learning to Act by Predicting the Future | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | ✅ | 1 |
| Learning to Compose Words into Sentences with Reinforcement Learning | ❌ | ❌ | ✅ | ✅ | ❌ | ❌ | ✅ | 3 |
| Learning to Discover Sparse Graphical Models | ✅ | ❌ | ✅ | ❌ | ❌ | ❌ | ✅ | 3 |
| Learning to Draw Samples: With Application to Amortized MLE for Generative Adversarial Learning | ✅ | ✅ | ✅ | ❌ | ❌ | ❌ | ✅ | 4 |
| Learning to Generate Samples from Noise through Infusion Training | ❌ | ❌ | ✅ | ✅ | ❌ | ❌ | ✅ | 3 |
| Learning to Navigate in Complex Environments | ❌ | ❌ | ✅ | ❌ | ❌ | ❌ | ✅ | 2 |
| Learning to Optimize | ✅ | ❌ | ❌ | ❌ | ❌ | ❌ | ✅ | 2 |
| Learning to Perform Physics Experiments via Deep Reinforcement Learning | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | ✅ | 1 |
| Learning to Play in a Day: Faster Deep Reinforcement Learning by Optimality Tightening | ✅ | ❌ | ✅ | ❌ | ✅ | ❌ | ✅ | 4 |
| Learning to Query, Reason, and Answer Questions On Ambiguous Texts | ❌ | ❌ | ❌ | ✅ | ❌ | ❌ | ✅ | 2 |
| Learning to Remember Rare Events | ✅ | ✅ | ✅ | ❌ | ❌ | ❌ | ✅ | 4 |
| Learning to Repeat: Fine Grained Action Repetition for Deep Reinforcement Learning | ✅ | ❌ | ✅ | ❌ | ❌ | ❌ | ✅ | 3 |
| Learning to superoptimize programs | ✅ | ❌ | ✅ | ❌ | ❌ | ❌ | ✅ | 3 |
| Lie-Access Neural Turing Machines | ❌ | ✅ | ❌ | ❌ | ❌ | ❌ | ✅ | 2 |
| Lifelong Perceptual Programming By Example | ✅ | ❌ | ✅ | ✅ | ❌ | ❌ | ✅ | 4 |
| Loss-aware Binarization of Deep Networks | ✅ | ❌ | ✅ | ✅ | ✅ | ❌ | ✅ | 5 |
| Lossy Image Compression with Compressive Autoencoders | ❌ | ❌ | ✅ | ❌ | ❌ | ✅ | ✅ | 3 |
| Machine Comprehension Using Match-LSTM and Answer Pointer | ❌ | ✅ | ✅ | ✅ | ❌ | ❌ | ✅ | 4 |
| Making Neural Programming Architectures Generalize via Recursion | ✅ | ❌ | ❌ | ❌ | ❌ | ❌ | ✅ | 2 |
| Maximum Entropy Flow Networks | ✅ | ❌ | ❌ | ❌ | ❌ | ❌ | ✅ | 2 |
| Metacontrol for Adaptive Imagination-Based Optimization | ✅ | ❌ | ✅ | ❌ | ❌ | ❌ | ✅ | 3 |
| Mode Regularized Generative Adversarial Networks | ✅ | ❌ | ✅ | ❌ | ❌ | ❌ | ✅ | 3 |
| Modular Multitask Reinforcement Learning with Policy Sketches | ✅ | ✅ | ❌ | ❌ | ❌ | ❌ | ✅ | 3 |
| Modularized Morphing of Neural Networks | ✅ | ❌ | ✅ | ✅ | ❌ | ❌ | ✅ | 4 |
| Mollifying Networks | ✅ | ❌ | ✅ | ❌ | ❌ | ❌ | ✅ | 3 |
| Multi-Agent Cooperation and the Emergence of (Natural) Language | ❌ | ❌ | ✅ | ❌ | ❌ | ❌ | ✅ | 2 |
| Multi-view Recurrent Neural Acoustic Word Embeddings | ❌ | ✅ | ✅ | ✅ | ❌ | ❌ | ✅ | 4 |
| Multilayer Recurrent Network Models of Primate Retinal Ganglion Cell Responses | ❌ | ❌ | ✅ | ✅ | ❌ | ❌ | ✅ | 3 |
| Multiplicative LSTM for sequence modelling | ❌ | ✅ | ✅ | ✅ | ❌ | ❌ | ✅ | 4 |
| Neural Architecture Search with Reinforcement Learning | ❌ | ❌ | ✅ | ✅ | ❌ | ❌ | ✅ | 3 |
| Neural Data Filter for Bootstrapping Stochastic Gradient Descent | ✅ | ❌ | ✅ | ✅ | ✅ | ❌ | ✅ | 5 |
| Neural Functional Programming | ✅ | ❌ | ❌ | ❌ | ❌ | ❌ | ✅ | 2 |
| Neural Photo Editing with Introspective Adversarial Networks | ❌ | ✅ | ✅ | ✅ | ❌ | ❌ | ✅ | 4 |
| Neural Program Lattices | ❌ | ❌ | ❌ | ✅ | ❌ | ❌ | ✅ | 2 |
| Neural Taylor Approximations: Convergence and Exploration in Rectifier Networks | ❌ | ❌ | ✅ | ❌ | ✅ | ❌ | ✅ | 3 |
| Neuro-Symbolic Program Synthesis | ❌ | ❌ | ❌ | ✅ | ❌ | ❌ | ✅ | 2 |
| Nonparametric Neural Networks | ✅ | ❌ | ✅ | ✅ | ❌ | ❌ | ✅ | 4 |
| Nonparametrically Learning Activation Functions in Deep Neural Nets | ✅ | ❌ | ✅ | ❌ | ✅ | ❌ | ✅ | 4 |
| Normalizing the Normalizers: Comparing and Extending Network Normalization Schemes | ❌ | ❌ | ✅ | ✅ | ❌ | ❌ | ✅ | 3 |
| Offline bilingual word vectors, orthogonal transformations and the inverted softmax | ❌ | ❌ | ✅ | ❌ | ❌ | ❌ | ✅ | 2 |
| On Detecting Adversarial Perturbations | ❌ | ❌ | ✅ | ✅ | ❌ | ❌ | ✅ | 3 |
| On Large-Batch Training for Deep Learning: Generalization Gap and Sharp Minima | ❌ | ✅ | ✅ | ❌ | ❌ | ❌ | ✅ | 3 |
| On Robust Concepts and Small Neural Nets | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | 0 |
| On the Quantitative Analysis of Decoder-Based Generative Models | ❌ | ✅ | ✅ | ✅ | ❌ | ❌ | ✅ | 4 |
| Online Bayesian Transfer Learning for Sequential Data Modeling | ✅ | ❌ | ❌ | ✅ | ❌ | ❌ | ✅ | 3 |
| Online Structure Learning for Sum-Product Networks with Gaussian Leaves | ✅ | ✅ | ✅ | ✅ | ✅ | ❌ | ✅ | 6 |
| Optimal Binary Autoencoding with Pairwise Correlations | ✅ | ✅ | ✅ | ❌ | ❌ | ❌ | ✅ | 4 |
| Optimization as a Model for Few-Shot Learning | ✅ | ✅ | ✅ | ✅ | ❌ | ❌ | ✅ | 5 |
| Outrageously Large Neural Networks: The Sparsely-Gated Mixture-of-Experts Layer | ❌ | ❌ | ✅ | ✅ | ✅ | ❌ | ✅ | 4 |
| Paleo: A Performance Model for Deep Neural Networks | ❌ | ✅ | ✅ | ❌ | ✅ | ✅ | ✅ | 5 |
| Paying More Attention to Attention: Improving the Performance of Convolutional Neural Networks via Attention Transfer | ❌ | ✅ | ✅ | ✅ | ✅ | ❌ | ✅ | 5 |
| Perception Updating Networks: On architectural constraints for interpretable video generative models | ✅ | ❌ | ✅ | ✅ | ❌ | ❌ | ✅ | 4 |
| PixelCNN++: Improving the PixelCNN with Discretized Logistic Mixture Likelihood and Other Modifications | ❌ | ✅ | ✅ | ❌ | ❌ | ❌ | ✅ | 3 |
| PixelVAE: A Latent Variable Model for Natural Images | ❌ | ✅ | ✅ | ❌ | ❌ | ❌ | ✅ | 3 |
| Pointer Sentinel Mixture Models | ❌ | ❌ | ✅ | ✅ | ✅ | ❌ | ✅ | 4 |
| Predicting Medications from Diagnostic Codes with Recurrent Neural Networks | ❌ | ❌ | ❌ | ✅ | ❌ | ❌ | ✅ | 2 |
| Program Synthesis for Character Level Language Modeling | ❌ | ❌ | ✅ | ✅ | ✅ | ❌ | ✅ | 4 |
| Programming With a Differentiable Forth Interpreter | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | ✅ | 1 |
| Pruning Convolutional Neural Networks for Resource Efficient Inference | ❌ | ❌ | ✅ | ✅ | ✅ | ✅ | ✅ | 5 |
| Pruning Filters for Efficient ConvNets | ❌ | ❌ | ✅ | ✅ | ✅ | ✅ | ✅ | 5 |
| Q-Prop: Sample-Efficient Policy Gradient with An Off-Policy Critic | ✅ | ✅ | ✅ | ❌ | ❌ | ❌ | ✅ | 4 |
| Quasi-Recurrent Neural Networks | ❌ | ❌ | ✅ | ✅ | ✅ | ❌ | ✅ | 4 |
| Query-Reduction Networks for Question Answering | ❌ | ✅ | ✅ | ✅ | ✅ | ❌ | ✅ | 5 |
| Reasoning with Memory Augmented Neural Networks for Language Comprehension | ❌ | ✅ | ✅ | ✅ | ❌ | ❌ | ✅ | 4 |
| Recurrent Batch Normalization | ❌ | ❌ | ✅ | ✅ | ❌ | ❌ | ✅ | 3 |
| Recurrent Environment Simulators | ✅ | ❌ | ✅ | ❌ | ❌ | ❌ | ✅ | 3 |
| Recurrent Hidden Semi-Markov Model | ✅ | ❌ | ✅ | ✅ | ✅ | ❌ | ✅ | 5 |
| Recurrent Mixture Density Network for Spatiotemporal Visual Attention | ❌ | ❌ | ✅ | ✅ | ✅ | ❌ | ✅ | 4 |
| Recurrent Normalization Propagation | ❌ | ❌ | ✅ | ✅ | ✅ | ❌ | ✅ | 4 |
| Recursive Regression with Neural Networks: Approximating the HJI PDE Solution | ✅ | ❌ | ❌ | ❌ | ❌ | ❌ | ✅ | 2 |
| Regularizing CNNs with Locally Constrained Decorrelations | ✅ | ❌ | ✅ | ✅ | ✅ | ❌ | ✅ | 5 |
| Reinforcement Learning through Asynchronous Advantage Actor-Critic on a GPU | ❌ | ✅ | ✅ | ❌ | ✅ | ✅ | ✅ | 5 |
| Reinforcement Learning with Unsupervised Auxiliary Tasks | ❌ | ❌ | ✅ | ❌ | ❌ | ❌ | ✅ | 2 |
| RenderGAN: Generating Realistic Labeled Data | ❌ | ✅ | ❌ | ✅ | ❌ | ❌ | ✅ | 3 |
| Revisiting Classifier Two-Sample Tests | ❌ | ✅ | ✅ | ❌ | ❌ | ❌ | ✅ | 3 |
| SGDR: Stochastic Gradient Descent with Warm Restarts | ❌ | ✅ | ✅ | ❌ | ❌ | ❌ | ✅ | 3 |
| Sample Efficient Actor-Critic with Experience Replay | ✅ | ❌ | ❌ | ❌ | ❌ | ❌ | ✅ | 2 |
| SampleRNN: An Unconditional End-to-End Neural Audio Generation Model | ❌ | ✅ | ✅ | ✅ | ✅ | ❌ | ✅ | 5 |
| Semi-Supervised Classification with Graph Convolutional Networks | ✅ | ✅ | ✅ | ✅ | ✅ | ❌ | ✅ | 6 |
| Semi-supervised Knowledge Transfer for Deep Learning from Private Training Data | ❌ | ✅ | ✅ | ✅ | ❌ | ❌ | ✅ | 4 |
| Semi-supervised deep learning by metric embedding | ❌ | ✅ | ✅ | ✅ | ❌ | ✅ | ✅ | 5 |
| Shift Aggregate Extract Networks | ✅ | ❌ | ✅ | ✅ | ✅ | ❌ | ✅ | 5 |
| Short and Deep: Sketching and Neural Networks | ❌ | ❌ | ✅ | ✅ | ❌ | ❌ | ✅ | 3 |
| Sigma Delta Quantized Networks | ✅ | ✅ | ✅ | ❌ | ❌ | ❌ | ✅ | 4 |
| Snapshot Ensembles: Train 1, Get M for Free | ❌ | ✅ | ✅ | ✅ | ❌ | ✅ | ✅ | 5 |
| Soft Weight-Sharing for Neural Network Compression | ✅ | ✅ | ✅ | ❌ | ❌ | ❌ | ✅ | 4 |
| Song From PI: A Musically Plausible Network for Pop Music Generation | ❌ | ❌ | ✅ | ❌ | ❌ | ❌ | ✅ | 2 |
| Sparsely-Connected Neural Networks: Towards Efficient VLSI Implementation of Deep Neural Networks | ✅ | ❌ | ✅ | ✅ | ❌ | ❌ | ✅ | 4 |
| Steerable CNNs | ❌ | ❌ | ✅ | ❌ | ❌ | ❌ | ❌ | 1 |
| Stick-Breaking Variational Autoencoders | ❌ | ✅ | ✅ | ✅ | ✅ | ❌ | ✅ | 5 |
| Stochastic Neural Networks for Hierarchical Reinforcement Learning | ✅ | ✅ | ✅ | ❌ | ❌ | ❌ | ✅ | 4 |
| Structured Attention Networks | ✅ | ✅ | ✅ | ✅ | ❌ | ❌ | ✅ | 5 |
| Support Regularized Sparse Coding and Its Fast Encoder | ✅ | ✅ | ✅ | ❌ | ❌ | ❌ | ✅ | 4 |
| Surprise-Based Intrinsic Motivation for Deep Reinforcement Learning | ✅ | ❌ | ✅ | ❌ | ✅ | ❌ | ✅ | 4 |
| Symmetry-Breaking Convergence Analysis of Certain Two-layered Neural Networks with ReLU nonlinearity | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | 0 |
| Temporal Ensembling for Semi-Supervised Learning | ✅ | ✅ | ✅ | ❌ | ❌ | ❌ | ✅ | 4 |
| The Concrete Distribution: A Continuous Relaxation of Discrete Random Variables | ❌ | ❌ | ✅ | ✅ | ❌ | ❌ | ✅ | 3 |
| The Neural Noisy Channel | ✅ | ❌ | ✅ | ✅ | ❌ | ❌ | ✅ | 4 |
| Third Person Imitation Learning | ✅ | ✅ | ❌ | ❌ | ❌ | ❌ | ✅ | 3 |
| Tighter bounds lead to improved classifiers | ✅ | ❌ | ✅ | ✅ | ❌ | ❌ | ✅ | 4 |
| TopicRNN: A Recurrent Neural Network with Long-Range Semantic Dependency | ❌ | ❌ | ✅ | ✅ | ✅ | ❌ | ✅ | 4 |
| Topology and Geometry of Half-Rectified Network Optimization | ✅ | ✅ | ✅ | ❌ | ❌ | ❌ | ✅ | 4 |
| Towards Deep Interpretability (MUS-ROVER II): Learning Hierarchical Representations of Tonal Music | ✅ | ❌ | ✅ | ❌ | ❌ | ❌ | ✅ | 3 |
| Towards Principled Methods for Training Generative Adversarial Networks | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | 0 |
| Towards a Neural Statistician | ✅ | ❌ | ✅ | ✅ | ❌ | ❌ | ✅ | 4 |
| Towards an automatic Turing test: Learning to evaluate dialogue responses | ❌ | ❌ | ✅ | ✅ | ✅ | ❌ | ✅ | 4 |
| Towards the Limit of Network Quantization | ✅ | ❌ | ✅ | ✅ | ❌ | ❌ | ✅ | 4 |
| Tracking the World State with Recurrent Entity Networks | ❌ | ✅ | ✅ | ✅ | ❌ | ❌ | ✅ | 4 |
| Trained Ternary Quantization | ❌ | ❌ | ✅ | ✅ | ❌ | ❌ | ✅ | 3 |
| Training Agent for First-Person Shooter Game with Actor-Critic Curriculum Learning | ❌ | ❌ | ❌ | ❌ | ✅ | ❌ | ✅ | 2 |
| Training Compressed Fully-Connected Networks with a Density-Diversity Penalty | ✅ | ❌ | ✅ | ✅ | ❌ | ❌ | ✅ | 4 |
| Training deep neural-networks using a noise adaptation layer | ❌ | ✅ | ✅ | ✅ | ❌ | ❌ | ✅ | 4 |
| Transfer Learning for Sequence Tagging with Hierarchical Recurrent Networks | ❌ | ✅ | ✅ | ✅ | ❌ | ❌ | ✅ | 4 |
| Transfer of View-manifold Learning to Similarity Perception of Novel Objects | ❌ | ❌ | ✅ | ✅ | ❌ | ❌ | ✅ | 3 |
| Tree-structured decoding with doubly-recurrent neural networks | ❌ | ❌ | ✅ | ✅ | ❌ | ❌ | ✅ | 3 |
| Trusting SVM for Piecewise Linear CNNs | ✅ | ✅ | ✅ | ✅ | ✅ | ❌ | ✅ | 6 |
| Tuning Recurrent Neural Networks with Reinforcement Learning | ❌ | ✅ | ❌ | ❌ | ❌ | ❌ | ✅ | 2 |
| Tying Word Vectors and Word Classifiers: A Loss Framework for Language Modeling | ❌ | ❌ | ✅ | ✅ | ❌ | ❌ | ✅ | 3 |
| Understanding Trainable Sparse Coding with Matrix Factorization | ❌ | ✅ | ✅ | ❌ | ❌ | ❌ | ✅ | 3 |
| Understanding deep learning requires rethinking generalization | ❌ | ❌ | ✅ | ✅ | ❌ | ❌ | ✅ | 3 |
| Unrolled Generative Adversarial Networks | ❌ | ✅ | ✅ | ❌ | ❌ | ❌ | ✅ | 3 |
| Unsupervised Cross-Domain Image Generation | ❌ | ❌ | ✅ | ❌ | ❌ | ❌ | ✅ | 2 |
| Unsupervised Perceptual Rewards for Imitation Learning | ✅ | ❌ | ✅ | ❌ | ❌ | ❌ | ✅ | 3 |
| Variable Computation in Recurrent Neural Networks | ❌ | ❌ | ✅ | ✅ | ❌ | ❌ | ✅ | 3 |
| Variational Lossy Autoencoder | ❌ | ❌ | ✅ | ✅ | ❌ | ❌ | ✅ | 3 |
| Variational Recurrent Adversarial Deep Domain Adaptation | ❌ | ❌ | ✅ | ✅ | ❌ | ❌ | ✅ | 3 |
| Visualizing Deep Neural Network Decisions: Prediction Difference Analysis | ✅ | ✅ | ✅ | ✅ | ❌ | ❌ | ✅ | 5 |
| What does it take to generate natural textures? | ❌ | ❌ | ✅ | ❌ | ❌ | ❌ | ✅ | 2 |
| Why Deep Neural Networks for Function Approximation? | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | 0 |
| Words or Characters? Fine-grained Gating for Reading Comprehension | ❌ | ✅ | ✅ | ✅ | ❌ | ❌ | ✅ | 4 |
| Zoneout: Regularizing RNNs by Randomly Preserving Hidden Activations | ❌ | ✅ | ✅ | ❌ | ❌ | ❌ | ✅ | 3 |
| beta-VAE: Learning Basic Visual Concepts with a Constrained Variational Framework | ❌ | ❌ | ✅ | ❌ | ❌ | ❌ | ✅ | 2 |