Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al.: K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026.

International Conference on Learning Representations (ICLR) - 2020

[Chart: Documentation Rate of Empirical Papers by Reproducibility Variable]

[Chart: Distribution of Empirical Papers by Number of Documented Variables]

Venue  Year  Papers  Reproducibility Score  Documentation Score  % Empirical  % Industry  Website
ICLR   2020  687     0.56                   3.79                 97.67%       55.89%

Reproducibility Score: based on Gundersen et al. (2025). See Methods for details.
Documentation Score: the average score over the seven reproducibility variables for empirical research papers. See Methods for details.
% Empirical: percentage of papers that are empirical research rather than theoretical research.
% Industry: percentage of empirical research papers with at least one author from industry.
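
As a rough illustration of how the Documentation Score and the distribution by number of documented variables might be derived from the per-paper rows, here is a minimal Python sketch. It assumes that the number at the end of each row is the count of documented variables (0 to 7) and that the score is the plain mean of those counts; the authoritative definition is in the Methods. The helper name `parse_row` is hypothetical. The listing does not mark which papers are empirical, so the sketch averages over whatever rows it is given.

```python
from collections import Counter
from statistics import mean


def parse_row(line):
    """Split a 'Paper title N' row into (title, N).

    Assumption: the last space-separated token is the number of
    documented reproducibility variables for that paper.
    """
    title, _, count = line.rpartition(" ")
    return title, int(count)


# Three rows copied from the listing, used only as sample input.
rows = [
    "A Baseline for Few-Shot Image Classification 4",
    "A Closer Look at Deep Policy Gradients 3",
    "A Fair Comparison of Graph Neural Networks for Graph Classification 5",
]

parsed = [parse_row(r) for r in rows]
counts = [n for _, n in parsed]

# Assumed definition: mean number of documented variables per paper.
documentation_score = mean(counts)

# Number of papers for each count of documented variables.
distribution = Counter(counts)
```

Splitting on the last space keeps titles that themselves contain digits (for example "Training BERT in 76 minutes") intact, since only the final token is read as the count.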
Pseudocode
Open Source Code
Open Datasets
Dataset Splits
Hardware Specification
Software Dependencies
Experiment Setup
A Baseline for Few-Shot Image Classification 4
A Closer Look at Deep Policy Gradients 3
A Closer Look at the Optimization Landscapes of Generative Adversarial Networks 3
A Constructive Prediction of the Generalization Error Across Scales 3
A FRAMEWORK FOR ROBUSTNESS CERTIFICATION OF SMOOTHED CLASSIFIERS USING F-DIVERGENCES 3
A Fair Comparison of Graph Neural Networks for Graph Classification 5
A Function Space View of Bounded Norm Infinite Width ReLU Nets: The Multivariate Case 0
A Generalized Training Approach for Multiagent Learning 3
A Latent Morphology Model for Open-Vocabulary Neural Machine Translation 5
A Learning-based Iterative Method for Solving Vehicle Routing Problems 3
A Meta-Transfer Objective for Learning to Disentangle Causal Mechanisms 4
A Mutual Information Maximization Perspective of Language Representation Learning 3
A Neural Dirichlet Process Mixture Model for Task-Free Continual Learning 3
A Probabilistic Formulation of Unsupervised Text Style Transfer 4
A Signal Propagation Perspective for Pruning Neural Networks at Initialization 4
A Stochastic Derivative Free Optimization Method with Momentum 4
A Target-Agnostic Attack on Deep Models: Exploiting Security Vulnerabilities of Transfer Learning 6
A Theoretical Analysis of the Number of Shots in Few-Shot Learning 4
A Theory of Usable Information under Computational Constraints 3
A closer look at the approximation capabilities of neural networks 0
A critical analysis of self-supervision, or what we can learn from a single image 4
AE-OT: A NEW GENERATIVE MODEL BASED ON EXTENDED SEMI-DISCRETE OPTIMAL TRANSPORT 3
ALBERT: A Lite BERT for Self-supervised Learning of Language Representations 5
AMRL: Aggregated Memory For Reinforcement Learning 2
Abductive Commonsense Reasoning 4
Abstract Diagrammatic Reasoning with Multiplex Graph Networks 4
Accelerating SGD with momentum for over-parameterized learning 4
Action Semantics Network: Considering the Effects of Actions in Multiagent Systems 3
Actor-Critic Provably Finds Nash Equilibria of Linear-Quadratic Mean-Field Games 1
Adaptive Correlated Monte Carlo for Contextual Categorical Sequence Generation 5
Adaptive Structural Fingerprints for Graph Attention Networks 4
Additive Powers-of-Two Quantization: An Efficient Non-uniform Discretization for Neural Networks 5
Adjustable Real-time Style Transfer 4
AdvectiveNet: An Eulerian-Lagrangian Fluidic Reservoir for Point Cloud Processing 5
Adversarial AutoAugment 5
Adversarial Lipschitz Regularization 3
Adversarial Policies: Attacking Deep Reinforcement Learning 4
Adversarial Training and Provable Defenses: Bridging the Gap 7
Adversarially Robust Representations with Smooth Encoders 3
Adversarially robust transfer learning 4
An Exponential Learning Rate Schedule for Deep Learning 2
An Inductive Bias for Distances: Neural Nets that Respect the Triangle Inequality 4
Analysis of Video Feature Learning in Two-Stream CNNs on the Example of Zebrafish Swim Bout Classification 6
And the Bit Goes Down: Revisiting the Quantization of Neural Networks 5
Are Pre-trained Language Models Aware of Phrases? Simple but Strong Baselines for Grammar Induction 5
Are Transformers universal approximators of sequence-to-sequence functions? 4
AssembleNet: Searching for Multi-Stream Neural Connectivity in Video Architectures 4
Asymptotics of Wide Networks from Feynman Diagrams 2
At Stability's Edge: How to Adjust Hyperparameters to Preserve Minima Selection in Asynchronous Training of Neural Networks? 2
AtomNAS: Fine-Grained End-to-End Neural Architecture Search 6
AugMix: A Simple Data Processing Method to Improve Robustness and Uncertainty 4
Augmenting Genetic Algorithms with Deep Neural Networks for Exploring the Chemical Space 3
Augmenting Non-Collaborative Dialog Systems with Explicit Semantic and Strategic Dialog History 3
AutoQ: Automated Kernel-Wise Neural Network Quantization 3
Automated Relational Meta-learning 4
Automated curriculum generation through setter-solver interactions 3
Automatically Discovering and Learning New Visual Categories with Ranking Statistics 6
B-Spline CNNs on Lie groups 3
BERTScore: Evaluating Text Generation with BERT 5
BREAKING CERTIFIED DEFENSES: SEMANTIC ADVERSARIAL EXAMPLES WITH SPOOFED ROBUSTNESS CERTIFICATES 3
BackPACK: Packing more into Backprop 4
Batch-shaping for learning conditional channel gated networks 5
BatchEnsemble: an Alternative Approach to Efficient Ensemble and Lifelong Learning 4
BayesOpt Adversarial Attack 4
Bayesian Meta Sampling for Fast Uncertainty Adaptation 5
Behaviour Suite for Reinforcement Learning 3
Beyond Linearization: On Quadratic and Higher-Order Approximation of Wide Neural Networks 0
BinaryDuo: Reducing Gradient Mismatch in Binary Activation Network by Coupling Binary Activations 5
Biologically inspired sleep algorithm for increased generalization and adversarial robustness in deep neural networks 3
Black-Box Adversarial Attack with Transferable Model-based Embedding 6
Black-box Off-policy Estimation for Infinite-Horizon Reinforcement Learning 3
BlockSwap: Fisher-guided Block Substitution for Network Compression on a Budget 5
Bounds on Over-Parameterization for Guaranteed Existence of Descent Paths in Shallow ReLU Networks 0
Bridging Mode Connectivity in Loss Landscapes and Adversarial Robustness 5
Budgeted Training: Rethinking Deep Neural Network Training Under Resource Constraints 4
Building Deep Equivariant Capsule Networks 4
CAQL: Continuous Action Q-Learning 4
CATER: A diagnostic dataset for Compositional Actions & TEmporal Reasoning 4
CLEVRER: Collision Events for Video Representation and Reasoning 3
CLN2INV: Learning Loop Invariants with Continuous Logic Networks 5
CM3: Cooperative Multi-goal Multi-stage Multi-agent Reinforcement Learning 4
Can gradient clipping mitigate label noise? 3
Capsules with Inverted Dot-Product Attention Routing 4
Causal Discovery with Reinforcement Learning 3
Certified Defenses for Adversarial Patches 4
Certified Robustness for Top-k Predictions against Adversarial Perturbations via Randomized Smoothing 4
Chameleon: Adaptive Code Optimization for Expedited Deep Neural Network Compilation 5
Classification-Based Anomaly Detection for General Data 4
Co-Attentive Equivariant Neural Networks: Focusing Equivariance On Transformations Co-Occurring in Data 3
CoPhy: Counterfactual Learning of Physical Dynamics 3
Coherent Gradients: An Approach to Understanding Generalization in Gradient Descent-based Optimization 2
Combining Q-Learning and Search with Amortized Value Estimates 5
Comparing Rewinding and Fine-tuning in Neural Network Pruning 6
Composing Task-Agnostic Policies with Deep Reinforcement Learning 3
Composition-based Multi-Relational Graph Convolutional Networks 4
Compositional Language Continual Learning 4
Compositional languages emerge in a neural iterated learning model 4
Compression based bound for non-compressed network: unified generalization error analysis of large compressible deep neural network 2
Compressive Transformers for Long-Range Sequence Modelling 6
Computation Reallocation for Object Detection 5
Conditional Learning of Fair Representations 3
Conservative Uncertainty Estimation By Fitting Prior Networks 3
Consistency Regularization for Generative Adversarial Networks 5
Continual Learning with Adaptive Weights (CLAW) 4
Continual Learning with Bayesian Neural Networks for Non-Stationary Data 3
Continual learning with hypernetworks 4
Contrastive Learning of Structured World Models 4
Contrastive Representation Distillation 5
Controlling generative models with continuous factors of variations 5
Convergence of Gradient Methods on Bilinear Zero-Sum Games 1
Convolutional Conditional Neural Processes 4
Counterfactuals uncover the modular structure of deep generative models 3
Critical initialisation in continuous approximations of binary neural networks 2
Cross-Domain Few-Shot Classification via Learned Feature-Wise Transformation 5
Cross-Lingual Ability of Multilingual BERT: An Empirical Study 4
Cross-lingual Alignment vs Joint Training: A Comparative Study and A Simple Unified Framework 4
Curriculum Loss: Robust Learning and Generalization against Label Corruption 3
Curvature Graph Network 4
Cyclical Stochastic Gradient MCMC for Bayesian Deep Learning 4
DBA: Distributed Backdoor Attacks against Federated Learning 3
DD-PPO: Learning Near-Perfect PointGoal Navigators from 2.5 Billion Frames 7
DDSP: Differentiable Digital Signal Processing 3
Data-Independent Neural Pruning via Coresets 5
Data-dependent Gaussian Prior Objective for Language Generation 3
DeFINE: Deep Factorized Input Token Embeddings for Neural Sequence Modeling 5
Decentralized Deep Learning with Arbitrary Communication Compression 6
Decoding As Dynamic Programming For Recurrent Autoregressive Models 3
Decoupling Representation and Classifier for Long-Tailed Recognition 4
Deep 3D Pan via local adaptive "t-shaped" convolutions with global and local adaptive dilations 3
Deep Audio Priors Emerge From Harmonic Convolutional Networks 4
Deep Batch Active Learning by Diverse, Uncertain Gradient Lower Bounds 4
Deep Double Descent: Where Bigger Models and More Data Hurt 3
Deep Graph Matching Consensus 5
Deep Imitative Models for Flexible Inference, Planning, and Control 3
Deep Learning For Symbolic Mathematics 3
Deep Learning of Determinantal Point Processes via Proper Spectral Sub-gradient 5
Deep Network Classification by Scattering and Homotopy Dictionary Learning 4
Deep Orientation Uncertainty Learning based on a Bingham Loss 4
Deep Semi-Supervised Anomaly Detection 5
Deep Symbolic Superoptimization Without Human Knowledge 5
Deep neuroethology of a virtual rodent 3
Deep probabilistic subsampling for task-adaptive compressed sensing 5
DeepHoyer: Learning Sparser Neural Network with Differentiable Scale-Invariant Sparsity Measures 5
DeepSphere: a graph-based spherical CNN 3
DeepV2D: Video to Depth with Differentiable Structure from Motion 4
Defending Against Physically Realizable Attacks on Image Classification 5
Deformable Kernels: Adapting Effective Receptive Fields for Object Deformation 3
Demystifying Inter-Class Disentanglement 3
Denoising and Regularization via Exploiting the Structural Bias of Convolutional Generators 3
Depth-Adaptive Transformer 4
Depth-Width Trade-offs for ReLU Networks via Sharkovsky's Theorem 1
Detecting Extrapolation with Local Ensembles 5
Detecting and Diagnosing Adversarial Images with Class-Conditional Capsule Reconstructions 3
DiffTaichi: Differentiable Programming for Physical Simulation 4
Difference-Seeking Generative Adversarial Network--Unseen Sample Generation 4
Differentiable Reasoning over a Virtual Knowledge Base 5
Differentiable learning of numerical rules in knowledge graphs 4
Differentially Private Meta-Learning 4
Differentiation of Blackbox Combinatorial Solvers 4
Directional Message Passing for Molecular Graphs 4
Disagreement-Regularized Imitation Learning 2
Discovering Motor Programs by Recomposing Demonstrations 3
Discrepancy Ratio: Evaluating Model Performance When Even Experts Disagree on the Truth 3
Discriminative Particle Filter Reinforcement Learning for Complex Partial observations 7
Disentanglement by Nonlinear ICA with General Incompressible-flow Networks (GIN) 2
Disentangling Factors of Variations Using Few Labels 4
Disentangling neural mechanisms for perceptual grouping 3
Distance-Based Learning from Errors for Confidence Calibration 3
Distributed Bandit Learning: Near-Optimal Regret with Efficient Communication 1
Distributionally Robust Neural Networks 5
Diverse Trajectory Forecasting with Determinantal Point Processes 3
DivideMix: Learning with Noisy Labels as Semi-supervised Learning 6
Domain Adaptive Multibranch Networks 3
Don't Use Large Mini-batches, Use Local SGD 6
Double Neural Counterfactual Regret Minimization 4
Doubly Robust Bias Reduction in Infinite Horizon Off-Policy Estimation 3
Drawing Early-Bird Tickets: Toward More Efficient Training of Deep Networks 5
Dream to Control: Learning Behaviors by Latent Imagination 5
DropEdge: Towards Deep Graph Convolutional Networks on Node Classification 5
Duration-of-Stay Storage Assignment under Uncertainty 5
Dynamic Model Pruning with Feedback 5
Dynamic Sparse Training: Find Efficient Sparse Network From Scratch With Trainable Masked Layers 2
Dynamic Time Lag Regression: Predicting What & When 4
Dynamical Distance Learning for Semi-Supervised and Unsupervised Skill Discovery 3
Dynamically Pruned Message Passing Networks for Large-scale Knowledge Graph Reasoning 6
Dynamics-Aware Embeddings 2
Dynamics-Aware Unsupervised Discovery of Skills 4
ELECTRA: Pre-training Text Encoders as Discriminators Rather Than Generators 4
EMPIR: Ensembles of Mixed Precision Deep Networks for Increased Robustness Against Adversarial Attacks 3
ES-MAML: Simple Hessian-Free Meta Learning 3
Economy Statistical Recurrent Units For Inferring Nonlinear Granger Causality 4
Editable Neural Networks 6
Effect of Activation Functions on the Training of Overparametrized Neural Nets 2
Efficient Probabilistic Logic Reasoning with Graph Neural Networks 6
Efficient Riemannian Optimization on the Stiefel Manifold via the Cayley Transform 5
Efficient and Information-Preserving Future Frame Prediction and Beyond 3
Emergence of functional and structural properties of the head direction system by optimization of recurrent neural networks 1
Emergent Tool Use From Multi-Agent Autocurricula 2
Empirical Bayes Transductive Meta-Learning with Synthetic Gradients 5
Empirical Studies on the Properties of Linear Regions in Deep Neural Networks 2
Enabling Deep Spiking Neural Networks with Hybrid Conversion and Spike Timing Dependent Backpropagation 6
Encoding word order in complex embeddings 6
End to End Trainable Active Contours via Differentiable Rendering 5
Energy-based models for atomic-resolution protein conformations 6
Enhancing Adversarial Defense by k-Winners-Take-All 3
Enhancing Transformation-Based Defenses Against Adversarial Attacks with a Distribution Classifier 3
Ensemble Distribution Distillation 3
Environmental drivers of systematicity and generalization in a situated agent 1
Episodic Reinforcement Learning with Associative Memory 3
Escaping Saddle Points Faster with Stochastic Momentum 2
Estimating Gradients for Discrete Random Variables by Sampling without Replacement 4
Estimating counterfactual treatment outcomes over time through adversarially balanced representations 6
Evaluating The Search Phase of Neural Architecture Search 3
Evolutionary Population Curriculum for Scaling Multi-Agent Reinforcement Learning 4
Expected Information Maximization: Using the I-Projection for Mixture Density Estimation 5
Explain Your Move: Understanding Agent Actions Using Specific and Relevant Feature Attribution 4
Explanation by Progressive Exaggeration 2
Exploration in Reinforcement Learning with Deep Covering Options 3
Exploratory Not Explanatory: Counterfactual Analysis of Saliency Maps for Deep Reinforcement Learning 3
Exploring Model-based Planning with Policy Networks 4
Extreme Classification via Adversarial Softmax Approximation 4
Extreme Tensoring for Low-Memory Preconditioning 4
FEW-SHOT LEARNING ON GRAPHS VIA SUPER-CLASSES BASED ON GRAPH SPECTRAL MEASURES 4
FSNet: Compression of Deep Convolutional Neural Networks by Filter Summary 4
FSPool: Learning Set Representations with Featurewise Sort Pooling 5
Fair Resource Allocation in Federated Learning 7
Fantastic Generalization Measures and Where to Find Them 3
Fast Neural Network Adaptation via Parameter Remapping and Architecture Search 6
Fast Task Inference with Variational Intrinsic Successor Features 3
Fast is better than free: Revisiting adversarial training 5
FasterSeg: Searching for Faster Real-time Semantic Segmentation 6
Feature Interaction Interpretability: A Case for Explaining Ad-Recommendation Systems via Neural Interaction Detection 6
Federated Adversarial Domain Adaptation 5
Federated Learning with Matched Averaging 5
Few-shot Text Classification with Distributional Signatures 6
Finding and Visualizing Weaknesses of Deep Reinforcement Learning Agents 4
Finite Depth and Width Corrections to the Neural Tangent Kernel 0
Fooling Detection Alone is Not Enough: Adversarial Attack against Multiple Object Tracking 4
Four Things Everyone Should Know to Improve Batch Normalization 5
FreeLB: Enhanced Adversarial Training for Natural Language Understanding 5
Frequency-based Search-control in Dyna 4
From Inference to Generation: End-to-end Fully Self-supervised Generation of Human Face from Speech 3
From Variational to Deterministic Autoencoders 4
Functional Regularisation for Continual Learning with Gaussian Processes 4
Functional vs. parametric equivalence of ReLU networks 0
GAT: Generative Adversarial Training for Adversarial Example Detection and Robust Classification 6
GENESIS: Generative Scene Inference and Sampling with Object-Centric Latent Representations 4
GLAD: Learning Sparse Graph Recovery 6
Gap-Aware Mitigation of Gradient Staleness 5
GenDICE: Generalized Offline Estimation of Stationary Values 4
Generalization bounds for deep convolutional neural networks 2
Generalization of Two-layer Neural Networks: An Asymptotic Viewpoint 1
Generalization through Memorization: Nearest Neighbor Language Models 4
Generalized Convolutional Forest Networks for Domain Generalization and Visual Recognition 4
Generative Models for Effective ML on Private, Decentralized Datasets 5
Generative Ratio Matching Networks 4
Geom-GCN: Geometric Graph Convolutional Networks 3
Geometric Analysis of Nonconvex Optimization Landscapes for Overcomplete Learning 2
Geometric Insights into the Convergence of Nonlinear TD Learning 0
Global Relational Models of Source Code 5
Gradient $\ell_1$ Regularization for Quantization Robustness 3
Gradient Descent Maximizes the Margin of Homogeneous Neural Networks 3
Gradient-Based Neural DAG Learning 4
Gradientless Descent: High-Dimensional Zeroth-Order Optimization 3
Gradients as Features for Deep Representation Learning 3
Graph Constrained Reinforcement Learning for Natural Language Action Spaces 3
Graph Convolutional Reinforcement Learning 3
Graph Neural Networks Exponentially Lose Expressive Power for Node Classification 5
Graph inference learning for semi-supervised classification 3
GraphAF: a Flow-based Autoregressive Model for Molecular Graph Generation 5
GraphSAINT: Graph Sampling Based Inductive Learning Method 7
GraphZoom: A Multi-level Spectral Approach for Accurate and Scalable Graph Embedding 5
Guiding Program Synthesis by Learning to Generate Examples 3
HOPPITY: LEARNING GRAPH TRANSFORMATIONS TO DETECT AND FIX BUGS IN PROGRAMS 4
Hamiltonian Generative Networks 2
Harnessing Structures for Value-Based Planning and Reinforcement Learning 3
Harnessing the Power of Infinitely Wide Deep Nets on Small-data Tasks 3
HiLLoC: lossless image compression with hierarchical latent variable models 5
Hierarchical Foresight: Self-Supervised Learning of Long-Horizon Tasks via Visual Subgoal Generation 4
High Fidelity Speech Synthesis with Adversarial Networks 4
Higher-Order Function Networks for Learning Composable 3D Object Representations 5
How much Position Information Do Convolutional Neural Networks Encode? 2
How to 0wn the NAS in Your Spare Time 5
Hyper-SAGNN: a self-attention based graph neural network for hypergraphs 3
Hypermodels for Exploration 1
I Am Going MAD: Maximum Discrepancy Competition for Comparing Classifiers Adaptively 5
IMPACT: Importance Weighted Asynchronous Architectures with Clipped Target Networks 3
Identifying through Flows for Recovering Latent Representations 1
Identity Crisis: Memorization and Generalization Under Extreme Overparameterization 2
Image-guided Neural Object Rendering 3
Imitation Learning via Off-Policy Distribution Matching 4
Implementation Matters in Deep RL: A Case Study on PPO and TRPO 4
Implementing Inductive bias for different navigation tasks through diverse RNN attrractors 1
Implicit Bias of Gradient Descent based Adversarial Training on Separable Data 3
Improved Sample Complexities for Deep Neural Networks and Robust Classification via an All-Layer Margin 4
Improved memory in recurrent neural networks with sequential non-normal dynamics 4
Improving Adversarial Robustness Requires Revisiting Misclassified Examples 3
Improving Generalization in Meta Reinforcement Learning using Learned Objectives 5
Improving Neural Language Generation with Spectrum Control 4
In Search for a SAT-friendly Binarized Neural Network Architecture 2
Incorporating BERT into Neural Machine Translation 5
Inductive Matrix Completion Based on Graph Neural Networks 5
Inductive and Unsupervised Representation Learning on Graph Structured Objects 2
Inductive representation learning on temporal graphs 4
Infinite-Horizon Differentiable Model Predictive Control 3
Infinite-horizon Off-Policy Policy Evaluation with Multiple Behavior Policies 2
Influence-Based Multi-Agent Exploration 2
InfoGraph: Unsupervised and Semi-supervised Graph-Level Representation Learning via Mutual Information Maximization 3
Information Geometry of Orthogonal Initializations and Training 3
Input Complexity and Out-of-distribution Detection with Likelihood-based Generative Models 5
Intensity-Free Learning of Temporal Point Processes 4
Interpretable Complex-Valued Neural Networks for Privacy Protection 2
Intriguing Properties of Adversarial Training at Scale 2
Intrinsic Motivation for Encouraging Synergistic Behavior 2
Intrinsically Motivated Discovery of Diverse Patterns in Self-Organizing Systems 4
Is a Good Representation Sufficient for Sample Efficient Reinforcement Learning? 0
Iterative energy-based projection on a normal data manifold for anomaly localization 2
Jacobian Adversarially Regularized Networks for Robustness 4
Jelly Bean World: A Testbed for Never-Ending Learning 4
Kaleidoscope: An Efficient, Learnable Representation For All Structured Linear Maps 6
Keep Doing What Worked: Behavior Modelling Priors for Offline Reinforcement Learning 3
Kernel of CycleGAN as a principal homogeneous space 2
Kernelized Wasserstein Natural Gradient 3
Knowledge Consistency between Neural Networks and Beyond 2
LAMOL: LAnguage MOdeling for Lifelong Language Learning 3
LEARNED STEP SIZE QUANTIZATION 4
LEARNING EXECUTION THROUGH NEURAL CODE FUSION 3
Lagrangian Fluid Simulation with Continuous Convolutions 5
LambdaNet: Probabilistic Type Inference using Graph Neural Networks 4
Language GANs Falling Short 3
Large Batch Optimization for Deep Learning: Training BERT in 76 minutes 6
Latent Normalizing Flows for Many-to-Many Cross-Domain Mappings 3
Lazy-CFR: fast and near-optimal regret minimization for extensive games with imperfect information 3
Learn to Explain Efficiently via Neural Logic Inductive Learning 4
Learning Compositional Koopman Operators for Model-Based Control 2
Learning Disentangled Representations for CounterFactual Regression 1
Learning Efficient Parameter Server Synchronization Policies for Distributed SGD 3
Learning Expensive Coordination: An Event-Based Deep RL Approach 3
Learning Heuristics for Quantified Boolean Formulas through Reinforcement Learning 5
Learning Hierarchical Discrete Linguistic Units from Visually-Grounded Speech 3
Learning Nearly Decomposable Value Functions Via Communication Minimization 3
Learning Robust Representations via Multi-View Information Bottleneck 6
Learning Self-Correctable Policies and Value Functions from Demonstrations with Negative Sampling 3
Learning Space Partitions for Nearest Neighbor Search 3
Learning The Difference That Makes A Difference With Counterfactually-Augmented Data 4
Learning To Explore Using Active Neural SLAM 4
Learning deep graph matching with channel-independent embedding and Hungarian attention 2
Learning from Explanations with Neural Execution Tree 4
Learning from Rules Generalizing Labeled Exemplars 5
Learning from Unlabelled Videos Using Contrastive Predictive Neural 3D Mapping 5
Learning representations for binary-classification without backpropagation 4
Learning the Arrow of Time for Problems in Reinforcement Learning 4
Learning to Balance: Bayesian Meta-Learning for Imbalanced and Out-of-distribution Tasks 4
Learning to Control PDEs with Differentiable Physics 4
Learning to Coordinate Manipulation Skills via Skill Behavior Diversification 4
Learning to Group: A Bottom-Up Framework for 3D Part Discovery in Unseen Categories 4
Learning to Guide Random Search 5
Learning to Learn by Zeroth-Order Oracle 3
Learning to Link 2
Learning to Move with Affordance Maps 1
Learning to Plan in High Dimensions via Neural Exploration-Exploitation Trees 2
Learning to Represent Programs with Property Signatures 3
Learning to Retrieve Reasoning Paths over Wikipedia Graph for Question Answering 4
Learning to solve the credit assignment problem 3
Learning transport cost from subset correspondence 4
Learning-Augmented Data Stream Algorithms 3
Linear Symmetric Quantization of Neural Networks for Low-precision Integer Hardware 4
Lipschitz constant estimation of Neural Networks via sparse polynomial optimization 4
Lite Transformer with Long-Short Range Attention 5
Locality and Compositionality in Zero-Shot Learning 3
Logic and the 2-Simplicial Transformer 4
Lookahead: A Far-sighted Alternative of Magnitude-based Pruning 5
Low-Resource Knowledge-Grounded Dialogue Generation 3
Low-dimensional statistical manifold embedding of directed graphs 5
MACER: Attack-free and Scalable Robust Training via Maximizing Certified Radius 5
MEMO: A Deep Network for Flexible Combination of Episodic Memories 3
MMA Training: Direct Input Space Margin Maximization through Adversarial Training 5
Making Efficient Use of Demonstrations to Solve Hard Exploration Problems 3
Making Sense of Reinforcement Learning and Probabilistic Inference 3
Masked Based Unsupervised Content Transfer 3
Massively Multilingual Sparse Word Representations 6
Mathematical Reasoning in Latent Space 3
Maximum Likelihood Constraint Inference for Inverse Reinforcement Learning 2
Maxmin Q-learning: Controlling the Estimation Bias of Q-learning 4
Measuring Compositional Generalization: A Comprehensive Method on Realistic Data 4
Measuring and Improving the Use of Graph Information in Graph Neural Networks 3
Measuring the Reliability of Reinforcement Learning Algorithms 3
Memory-Based Graph Networks 4
Meta Dropout: Learning to Perturb Latent Features for Generalization 4
Meta Reinforcement Learning with Autonomous Inference of Subtask Dependencies 3
Meta-Dataset: A Dataset of Datasets for Learning to Learn from Few Examples 4
Meta-Learning Acquisition Functions for Transfer Learning in Bayesian Optimization 3
Meta-Learning Deep Energy-Based Memory Models 4
Meta-Learning with Warped Gradient Descent 5
Meta-Learning without Memorization 5
Meta-Q-Learning 4
Meta-learning curiosity algorithms 4
MetaPix: Few-Shot Video Retargeting 4
Minimizing FLOPs to Learn Efficient Sparse Representations 5
Mirror-Generative Neural Machine Translation 5
Mixed Precision DNNs: All you need is a good parametrization 5
Mixed-curvature Variational Autoencoders 4
Mixout: Effective Regularization to Finetune Large-scale Pretrained Language Models 3
Mixup Inference: Better Exploiting Mixup to Defend Adversarial Attacks 5
Model Based Reinforcement Learning for Atari 5
Model-Augmented Actor-Critic: Backpropagating through Paths 4
Model-based reinforcement learning for biological sequence design 4
Mogrifier LSTM 4
Monotonic Multihead Attention 5
Multi-Agent Interactions Modeling with Correlated Policies 4
Multi-Scale Representation Learning for Spatial Feature Distributions using Grid Cells 4
Multi-agent Reinforcement Learning for Networked System Control 4
Multilingual Alignment of Contextual Word Representations 3
Multiplicative Interactions and Where to Find Them 3
Mutual Information Gradient Estimation for Representation Learning 4
Mutual Mean-Teaching: Pseudo Label Refinery for Unsupervised Domain Adaptation on Person Re-identification 5
N-BEATS: Neural basis expansion analysis for interpretable time series forecasting 3
NAS evaluation is frustratingly hard 6
NAS-Bench-1Shot1: Benchmarking and Dissecting One-shot Neural Architecture Search 6
NAS-Bench-201: Extending the Scope of Reproducible Neural Architecture Search 5
Nesterov Accelerated Gradient and Scale Invariance for Adversarial Attacks 4
Network Deconvolution 4
Network Randomization: A Simple Technique for Generalization in Deep Reinforcement Learning 5
NeurQuRI: Neural Question Requirement Inspector for Answerability Prediction in Machine Reading Comprehension 3
Neural Arithmetic Units 6
Neural Epitome Search for Architecture-Agnostic Network Compression 3
Neural Execution of Graph Algorithms 2
Neural Machine Translation with Universal Visual Representation 7
Neural Module Networks for Reasoning over Text 4
Neural Network Branching for Neural Network Verification 5
Neural Oblivious Decision Ensembles for Deep Learning on Tabular Data 6
Neural Outlier Rejection for Self-Supervised Keypoint Learning 4
Neural Policy Gradient Methods: Global Optimality and Rates of Convergence 1
Neural Stored-program Memory 3
Neural Symbolic Reader: Scalable Integration of Distributed and Symbolic Representations for Reading Comprehension 4
Neural Tangents: Fast and Easy Infinite Neural Networks in Python 3
Neural Text Generation With Unlikelihood Training 3
Neural tangent kernels, transportation mappings, and universal approximation 0
Never Give Up: Learning Directed Exploration Strategies 3
Non-Autoregressive Dialog State Tracking 4
Novelty Detection Via Blurring 3
Oblique Decision Trees from Derivatives of ReLU Networks 4
Observational Overfitting in Reinforcement Learning 2
On Bonus Based Exploration Methods In The Arcade Learning Environment 2
On Computation and Generalization of Generative Adversarial Imitation Learning 1
On Generalization Error Bounds of Noisy Gradient Methods for Non-Convex Learning 4
On Identifiability in Transformers 4
On Mutual Information Maximization for Representation Learning 3
On Robustness of Neural Ordinary Differential Equations 2
On Solving Minimax Optimization Locally: A Follow-the-Ridge Approach 4
On Universal Equivariant Set Networks 5
On the "steerability" of generative adversarial networks 3
On the Convergence of FedAvg on Non-IID Data 2
On the Equivalence between Positional Node Embeddings and Structural Graph Representations 7
On the Global Convergence of Training Deep Linear ResNets 2
On the Need for Topology-Aware Generative Models for Manifold-Based Defenses 1
On the Relationship between Self-Attention and Convolutional Layers 3
On the Variance of the Adaptive Learning Rate and Beyond 5
On the Weaknesses of Reinforcement Learning for Neural Machine Translation 3
On the interaction between supervision and self-play in emergent communication 4
Once-for-All: Train One Network and Specialize it for Efficient Deployment 6
One-Shot Pruning of Recurrent Neural Networks by Jacobian Spectrum Evaluation 5
Online and stochastic optimization beyond Lipschitz continuity: A Riemannian approach 1
Optimal Strategies Against Generative Attacks 4
Optimistic Exploration even with a Pessimistic Initialisation 4
Option Discovery using Deep Skill Chaining 5
Order Learning and Its Application to Age Estimation 4
Overlearning Reveals Sensitive Attributes 3
PAC Confidence Sets for Deep Neural Networks via Calibrated Prediction 5
PC-DARTS: Partial Channel Connections for Memory-Efficient Architecture Search 5
PCMC-Net: Feature-based Pairwise Choice Markov Chains 4
PROGRESSIVE LEARNING AND DISENTANGLEMENT OF HIERARCHICAL REPRESENTATIONS 3
Padé Activation Units: End-to-end Learning of Flexible Activation Functions in Deep Networks 5
PairNorm: Tackling Oversmoothing in GNNs 5
Pay Attention to Features, Transfer Learn Faster CNNs 3
Permutation Equivariant Models for Compositional Generalization in Language 4
Phase Transitions for the Information Bottleneck in Representation Learning 3
Physics-as-Inverse-Graphics: Unsupervised Physical Parameter Estimation from Video 2
Physics-aware Difference Graph Networks for Sparsely-Observed Dynamics 5
Picking Winning Tickets Before Training by Preserving Gradient Flow 4
Piecewise linear activations substantially shape the loss surfaces of neural networks 0
Pitfalls of In-Domain Uncertainty Estimation and Ensembling in Deep Learning 5
Playing the lottery with rewards and multiple languages: lottery tickets in RL and NLP 2
Plug and Play Language Models: A Simple Approach to Controlled Text Generation 4
Poly-encoders: Architectures and Pre-training Strategies for Fast and Accurate Multi-sentence Scoring 6
Polylogarithmic width suffices for gradient descent to achieve arbitrarily small test error with shallow ReLU networks 0
Population-Guided Parallel Policy Search for Reinforcement Learning 4
Posterior sampling for multi-agent reinforcement learning: solving extensive games with imperfect information 3
Pre-training Tasks for Embedding-based Large-scale Retrieval 4
Precision Gating: Improving Neural Network Efficiency with Dynamic Dual-Precision Activations 3
Prediction Poisoning: Towards Defenses Against DNN Model Stealing Attacks 4
Prediction, Consistency, Curvature: Representation Learning for Locally-Linear Control 2
Pretrained Encyclopedia: Weakly Supervised Knowledge-Pretrained Language Model 4
Principled Weight Initialization for Hypernetworks 2
Probabilistic Connection Importance Inference and Lossless Compression of Deep Neural Networks 4
Probability Calibration for Knowledge Graph Embedding Models 6
Program Guided Agent 4
Progressive Memory Banks for Incremental Domain Adaptation 5
Projection-Based Constrained Policy Optimization 4
Provable Benefit of Orthogonal Initialization in Optimizing Deep Linear Networks 1
Provable Filter Pruning for Efficient Neural Networks 6
Provable robustness against all adversarial $l_p$-perturbations for $p\geq 1$ 3
ProxSGD: Training Structured Neural Networks under Regularization and Constraints 4
Pruned Graph Scattering Transforms 3
Pseudo-LiDAR++: Accurate Depth for 3D Object Detection in Autonomous Driving 5
Pure and Spurious Critical Points: a Geometric Study of Linear Networks 1
Q-learning with UCB Exploration is Sample Efficient for Infinite-Horizon MDP 1
Quantifying Point-Prediction Uncertainty in Neural Networks via Residual Estimation with an I/O Kernel 6
Quantifying the Cost of Reliable Photo Authentication via High-Performance Learned Lossy Representations 5
Quantum Algorithms for Deep Convolutional Neural Networks 3
Query-efficient Meta Attack to Deep Neural Networks 4
Query2box: Reasoning over Knowledge Graphs in Vector Space Using Box Embeddings 4
RGBD-GAN: Unsupervised 3D Representation Learning From Natural Image Datasets via RGBD Image Synthesis 3
RIDE: Rewarding Impact-Driven Exploration for Procedurally-Generated Environments 2
RNA Secondary Structure Prediction By Learning Unrolled Algorithms 6
RNNs Incrementally Evolving on an Equilibrium Manifold: A Panacea for Vanishing and Exploding Gradients? 6
RTFM: Generalising to New Environment Dynamics via Reading 2
RaCT: Toward Amortized Ranking-Critical Training For Collaborative Filtering 5
RaPP: Novelty Detection with Reconstruction along Projection Pathway 4
Ranking Policy Gradient 4
Rapid Learning or Feature Reuse? Towards Understanding the Effectiveness of MAML 5
ReClor: A Reading Comprehension Dataset Requiring Logical Reasoning 3
ReMixMatch: Semi-Supervised Learning with Distribution Matching and Augmentation Anchoring 5
Real or Not Real, that is the Question 3
Reanalysis of Variance Reduced Temporal Difference Learning 3
Reconstructing continuous distributions of 3D protein structure from cryo-EM images 4
Recurrent neural circuits for contour detection 5
Reducing Transformer Depth on Demand with Structured Dropout 5
Reformer: The Efficient Transformer 4
Regularizing activations in neural networks via distribution matching with the Wasserstein metric 5
Reinforced Genetic Algorithm Learning for Optimizing Computation Graphs 4
Reinforced active learning for image segmentation 3
Reinforcement Learning Based Graph-to-Sequence Model for Natural Question Generation 4
Reinforcement Learning with Competitive Ensembles of Information-Constrained Primitives 3
Relational State-Space Model for Stochastic Multi-Object Systems 4
Residual Energy-Based Models for Text Generation 5
Restricting the Flow: Information Bottlenecks for Attribution 5
Rethinking Softmax Cross-Entropy Loss for Adversarial Robustness 5
Rethinking the Hyperparameters for Fine-tuning 4
Revisiting Self-Training for Neural Sequence Generation 5
Ridge Regression: Structure, Cross-Validation, and Sketching 4
Robust And Interpretable Blind Image Denoising Via Bias-Free Convolutional Neural Networks 2
Robust Local Features for Improving the Generalization of Adversarial Training 4
Robust Reinforcement Learning for Continuous Control with Model Misspecification 4
Robust Subspace Recovery Layer for Unsupervised Anomaly Detection 5
Robust anomaly detection and backdoor attack detection via differential privacy 4
Robust training with ensemble consensus 3
Robustness Verification for Transformers 4
Rotation-invariant clustering of neuronal responses in primary visual cortex 2
Rényi Fair Inference 4
SAdam: A Variant of Adam for Strongly Convex Functions 3
SCALOR: Generative World Models with Scalable Object Representations 3
SEED RL: Scalable and Efficient Deep-RL with Accelerated Central Inference 5
SELF: Learning to Filter Noisy Labels with Self-Ensembling 4
SNODE: Spectral Discretization of Neural ODEs for System Identification 3
SNOW: Subscribing to Knowledge via Channel Pooling for Transfer & Lifelong Learning of Convolutional Neural Networks 5
SPACE: Unsupervised Object-Oriented Scene Representation via Spatial Attention and Decomposition 5
SQIL: Imitation Learning via Reinforcement Learning with Sparse Rewards 3
SUMO: Unbiased Estimation of Log Marginal Probability for Latent Variable Models 4
SVQN: Sequential Variational Soft Q-Learning Networks 2
Sample Efficient Policy Gradient Methods with Recursive Variance Reduction 3
Sampling-Free Learning of Bayesian Quantized Neural Networks 2
Scalable Model Compression by Entropy Penalized Reparameterization 3
Scalable Neural Methods for Reasoning With a Symbolic Knowledge Base 5
Scalable and Order-robust Continual Learning with Additive Parameter Decomposition 4
Scale-Equivariant Steerable Networks 5
Scaling Autoregressive Video Models 4
Selection via Proxy: Efficient Data Selection for Deep Learning 6
Self-Adversarial Learning with Comparative Discrimination for Text Generation 3
Self-Supervised Learning of Appliance Usage 4
Self-labelling via simultaneous clustering and representation learning 4
Semantically-Guided Representation Learning for Self-Supervised Monocular Depth 4
Semi-Supervised Generative Modeling for Controllable Speech Synthesis 4
Sequential Latent Knowledge Selection for Knowledge-Grounded Dialogue 4
Sharing Knowledge in Multi-Task Deep Reinforcement Learning 4
Shifted and Squeezed 8-bit Floating Point format for Low-Precision Training of Deep Neural Networks 3
Short and Sparse Deconvolution --- A Geometric Approach 4
Sign Bits Are All You Need for Black-Box Attacks 6
Sign-OPT: A Query-Efficient Hard-label Adversarial Attack 4
Simple and Effective Regularization Methods for Training on Noisily Labeled Data with Generalization Guarantee 2
Simplified Action Decoder for Deep Multi-Agent Reinforcement Learning 4
Single Episode Policy Transfer in Reinforcement Learning 5
Skip Connections Matter: On the Transferability of Adversarial Examples Generated with ResNets 3
Sliced Cramer Synaptic Consolidation for Preserving Deeply Learned Representations 3
SlowMo: Improving Communication-Efficient Distributed SGD with Slow Momentum 5
Smooth markets: A basic mechanism for organizing gradient-based learners 0
Smoothness and Stability in GANs 2
Span Recovery for Deep Neural Networks with Applications to Input Obfuscation 3
Sparse Coding with Gated Learned ISTA 4
Spectral Embedding of Regularized Block Models 3
Spike-based causal inference for weight alignment 2
SpikeGrad: An ANN-equivalent Computation Model for Implementing Backpropagation with Spikes 5
Stable Rank Normalization for Improved Generalization in Neural Networks and GANs 4
State Alignment-based Imitation Learning 3
State-only Imitation with Transition Dynamics Mismatch 4
Stochastic AUC Maximization with Deep Neural Networks 4
Stochastic Conditional Generative Networks with Basis Decomposition 3
Stochastic Weight Averaging in Parallel: Large-Batch Training That Generalizes Well 4
Strategies for Pre-training Graph Neural Networks 5
StructBERT: Incorporating Language Structures into Pre-training for Deep Language Understanding 4
StructPool: Structured Graph Pooling via Conditional Random Fields 5
Structured Object-Aware Physics Prediction for Video Modeling and Planning 3
Sub-policy Adaptation for Hierarchical Reinforcement Learning 3
Symplectic ODE-Net: Learning Hamiltonian Dynamics with Control 3
Symplectic Recurrent Neural Networks 2
Synthesizing Programmatic Policies that Inductively Generalize 2
TabFact: A Large-scale Dataset for Table-based Fact Verification 6
Target-Embedding Autoencoders for Supervised Representation Learning 4
Tensor Decompositions for Temporal Knowledge Base Completion 5
The Break-Even Point on Optimization Trajectories of Deep Neural Networks 3
The Curious Case of Neural Text Degeneration 3
The Early Phase of Neural Network Training 2
The Gambler's Problem and Beyond 0
The Implicit Bias of Depth: How Incremental Learning Drives Generalization 2
The Ingredients of Real World Robotic Reinforcement Learning 2
The Local Elasticity of Neural Networks 3
The Logical Expressiveness of Graph Neural Networks 4
The Shape of Data: Intrinsic Distance for Data Distributions 5
The Variational Bandwidth Bottleneck: Stochastic Evaluation on an Information Budget 3
The asymptotic spectrum of the Hessian of DNN throughout training 2
The intriguing role of module criticality in the generalization of deep networks 2
Theory and Evaluation Metrics for Learning Disentangled Representations 3
Thieves on Sesame Street! Model Extraction of BERT-based APIs 2
Thinking While Moving: Deep Reinforcement Learning with Concurrent Control 4
To Relieve Your Headache of Training an MRF, Take AdVIL 5
Toward Evaluating Robustness of Deep Reinforcement Learning with Continuous Control 3
Towards Better Understanding of Adaptive Gradient Algorithms in Generative Adversarial Nets 4
Towards Fast Adaptation of Neural Architectures with Meta Learning 6
Towards Hierarchical Importance Attribution: Explaining Compositional Semantics for Neural Sequence Models 2
Towards Stabilizing Batch Statistics in Backward Propagation of Batch Normalization 4
Towards Stable and Efficient Training of Verifiably Robust Neural Networks 4
Towards Verified Robustness under Text Deletion Interventions 3
Towards a Deep Network Architecture for Structured Smoothness 5
Towards neural networks that provably know when they don't know 4
Training Generative Adversarial Networks from Incomplete Observations using Factorised Discriminators 4
Training Recurrent Neural Networks Online by Learning Explicit State Variables 3
Training binary neural networks with real-to-binary convolutions 4
Training individually fair ML models with sensitive subspace robustness 5
Tranquil Clouds: Neural Networks for Learning Temporally Coherent Features in Point Clouds 1
Transferable Perturbations of Deep Feature Distributions 3
Transferring Optimality Across Data Distributions via Homotopy Methods 3
Transformer-XH: Multi-Evidence Reasoning with eXtra Hop Attention 4
Tree-Structured Attention with Hierarchical Accumulation 5
Triple Wins: Boosting Accuracy, Robustness and Efficiency Together by Enabling Input-Adaptive Inference 4
Truth or backpropaganda? An empirical investigation of deep learning theory 2
U-GAT-IT: Unsupervised Generative Attentional Networks with Adaptive Layer-Instance Normalization for Image-to-Image Translation 3
Unbiased Contrastive Divergence Algorithm for Training Energy-Based Latent Variable Models 5
Uncertainty-guided Continual Learning with Bayesian Neural Networks 5
Understanding Architectures Learnt by Cell-based Neural Architecture Search 3
Understanding Generalization in Recurrent Neural Networks 0
Understanding Knowledge Distillation in Non-autoregressive Machine Translation 4
Understanding Why Neural Networks Generalize Well Through GSNR of Parameters 2
Understanding and Improving Information Transfer in Multi-Task Learning 5
Understanding and Robustifying Differentiable Architecture Search 5
Understanding l4-based Dictionary Learning: Interpretation, Stability, and Robustness 3
Understanding the Limitations of Conditional Generative Models 2
Understanding the Limitations of Variational Mutual Information Estimators 2
Universal Approximation with Certified Networks 0
Unpaired Point Cloud Completion on Real Scans using Adversarial Training 2
Unrestricted Adversarial Examples via Semantic Manipulation 2
Unsupervised Clustering using Pseudo-semi-supervised Learning 4
Unsupervised Model Selection for Variational Disentangled Representation Learning 3
V-MPO: On-Policy Maximum a Posteriori Policy Optimization for Discrete and Continuous Control 4
V4D: 4D Convolutional Neural Networks for Video-level Representation Learning 4
VL-BERT: Pre-training of Generic Visual-Linguistic Representations 5
VariBAD: A Very Good Method for Bayes-Adaptive Deep RL via Meta-Learning 4
Variance Reduction With Sparse Gradients 3
Variational Autoencoders for Highly Multivariate Spatial Point Processes Intensities 5
Variational Hetero-Encoder Randomized GANs for Joint Image-Text Modeling 6
Variational Recurrent Models for Solving Partially Observable Control Tasks 5
Variational Template Machine for Data-to-Text Generation 5
Vid2Game: Controllable Characters Extracted from Real-World Videos 2
VideoFlow: A Conditional Flow-Based Model for Stochastic Video Generation 4
Watch the Unobserved: A Simple Approach to Parallelizing Monte Carlo Tree Search 5
Watch, Try, Learn: Meta-Learning from Demonstrations and Rewards 5
Weakly Supervised Clustering by Exploiting Unique Class Count 4
Weakly Supervised Disentanglement with Guarantees 3
What Can Neural Networks Reason About? 3
What graph neural networks cannot learn: depth vs width 2
White Noise Analysis of Neural Networks 4
Why Gradient Clipping Accelerates Training: A Theoretical Justification for Adaptivity 4
Why Not to Use Zero Imputation? Correcting Sparsity Bias in Training Neural Networks 5
You CAN Teach an Old Dog New Tricks! On Training Knowledge Graph Embeddings 4
You Only Train Once: Loss-Conditional Training of Deep Networks 4
Your classifier is secretly an energy based model and you should treat it like one 3
vq-wav2vec: Self-Supervised Learning of Discrete Speech Representations 4
word2ket: Space-efficient Word Embeddings inspired by Quantum Entanglement 5