Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).

International Conference on Learning Representations (ICLR) - 2021

Documentation Rate of Empirical Papers by Reproducibility Variable

Distribution of Empirical Papers by Number of Documented Variables

Venue: ICLR
Year: 2021
Papers: 859
Reproducibility Score: 0.6 (based on Gundersen et al. (2025); see Methods for details)
Documentation Score: 4.04 (the average score over the seven reproducibility variables for empirical research papers; see Methods for details)
% Empirical: 98.02% (percentage of papers that are empirical research rather than theoretical research)
% Industry: 51.43% (percentage of empirical research papers with at least one author from industry)
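
The Documentation Score and the distribution by number of documented variables can be illustrated with a small sketch. It assumes that the number after each title in the per-paper listing is the count of the seven reproducibility variables that the paper documents, and that the Documentation Score is the mean of those counts over empirical papers. Both are readings of the page, not the authors' published method (see Methods for that). The function names and the five-paper sample are illustrative only, and the listing does not mark which papers are empirical, so the sample mean is not expected to match the reported 4.04.

```python
from collections import Counter
from statistics import mean

# Assumption: seven reproducibility variables, so counts range from 0 to 7.
NUM_VARIABLES = 7

# A few (title, documented-variable count) pairs copied from the per-paper listing.
SAMPLE = [
    ("$i$-Mix: A Domain-Agnostic Strategy for Contrastive Representation Learning", 5),
    ("A Better Alternative to Error Feedback for Communication-Efficient Distributed Learning", 4),
    ("A Block Minifloat Representation for Training Deep Neural Networks", 4),
    ("A Critique of Self-Expressive Deep Subspace Clustering", 2),
    ("A Design Space Study for LISTA and Beyond", 5),
]


def documentation_score(rows):
    """Mean number of documented variables per paper (assumed definition)."""
    return mean(count for _, count in rows)


def distribution(rows):
    """Number of papers at each count from 0 to NUM_VARIABLES."""
    counts = Counter(count for _, count in rows)
    return {k: counts.get(k, 0) for k in range(NUM_VARIABLES + 1)}


if __name__ == "__main__":
    print("Documentation score (sample):", documentation_score(SAMPLE))
    for k, n in distribution(SAMPLE).items():
        print(f"{k} documented variables: {n} paper(s)")
```

Reporting the mean alongside the full distribution keeps papers that document none of the variables visible, which a single averaged score would hide.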
Reproducibility variables: Pseudocode, Open Source Code, Open Datasets, Dataset Splits, Hardware Specification, Software Dependencies, Experiment Setup
$i$-Mix: A Domain-Agnostic Strategy for Contrastive Representation Learning 5
A Better Alternative to Error Feedback for Communication-Efficient Distributed Learning 4
A Block Minifloat Representation for Training Deep Neural Networks 4
A Critique of Self-Expressive Deep Subspace Clustering 2
A Design Space Study for LISTA and Beyond 5
A Diffusion Theory For Deep Learning Dynamics: Stochastic Gradient Descent Exponentially Favors Flat Minima 2
A Discriminative Gaussian Mixture Model with Sparsity 3
A Distributional Approach to Controlled Text Generation 6
A Geometric Analysis of Deep Generative Image Models and Its Applications 3
A Good Image Generator Is What You Need for High-Resolution Video Synthesis 5
A Gradient Flow Framework For Analyzing Network Pruning 2
A Hypergradient Approach to Robust Regression without Correspondence 4
A Learning Theoretic Perspective on Local Explainability 2
A Mathematical Exploration of Why Language Models Help Solve Downstream Tasks 4
A PAC-Bayesian Approach to Generalization Bounds for Graph Neural Networks 1
A Panda? No, It's a Sloth: Slowdown Attacks on Adaptive Multi-Exit Neural Network Inference 4
A Temporal Kernel Approach for Deep Learning with Continuous-time Information 6
A Trainable Optimal Transport Embedding for Feature Aggregation and its Relationship to Attention 5
A Unified Approach to Interpreting and Boosting Adversarial Transferability 6
A Universal Representation Transformer Layer for Few-Shot Image Classification 6
A Wigner-Eckart Theorem for Group Equivariant Convolution Kernels 0
A statistical theory of cold posteriors in deep neural networks 2
A teacher-student framework to distill future trajectories 3
A unifying view on implicit bias in training linear neural networks 1
ALFWorld: Aligning Text and Embodied Environments for Interactive Learning 4
ANOCE: Analysis of Causal Effects with Multiple Mediators via Constrained Structural Learning 5
ARMOURED: Adversarially Robust MOdels using Unlabeled data by REgularizing Diversity 4
AUXILIARY TASK UPDATE DECOMPOSITION: THE GOOD, THE BAD AND THE NEUTRAL 5
Accelerating Convergence of Replica Exchange Stochastic Gradient MCMC via Variance Reduction 4
Accurate Learning of Graph Representations with Graph Multiset Pooling 4
Achieving Linear Speedup with Partial Worker Participation in Non-IID Federated Learning 4
Acting in Delayed Environments with Non-Stationary Markov Policies 4
Activation-level uncertainty in deep neural networks 5
Active Contrastive Learning of Audio-Visual Video Representations 6
AdaFuse: Adaptive Temporal Fusion Network for Efficient Action Recognition 5
AdaGCN: Adaboosting Graph Convolutional Networks into Deep Models 5
AdaSpeech: Adaptive Text to Speech for Custom Voice 4
AdamP: Slowing Down the Slowdown for Momentum Optimizers on Scale-invariant Weights 6
Adapting to Reward Progressivity via Spectral Reinforcement Learning 4
Adaptive Extra-Gradient Methods for Min-Max Optimization and Games 2
Adaptive Federated Optimization 5
Adaptive Procedural Task Generation for Hard-Exploration Problems 5
Adaptive Universal Generalized PageRank Graph Neural Network 5
Adaptive and Generative Zero-Shot Learning 3
Adversarial score matching and improved sampling for image generation 5
Adversarially Guided Actor-Critic 3
Adversarially-Trained Deep Nets Transfer Better: Illustration on Image Classification 3
Aligning AI With Shared Human Values 3
An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale 5
An Unsupervised Deep Learning Approach for Real-World Image Denoising 4
Analyzing the Expressive Power of Graph Neural Networks in a Spectral Perspective 4
Anatomy of Catastrophic Forgetting: Hidden Representations and Task Semantics 2
Anchor & Transform: Learning Sparse Embeddings for Large Vocabularies 6
Answering Complex Open-Domain Questions with Multi-Hop Dense Retrieval 5
Anytime Sampling for Autoregressive Models via Ordered Autoencoding 4
Approximate Nearest Neighbor Negative Contrastive Learning for Dense Text Retrieval 5
Are Neural Nets Modular? Inspecting Functional Modularity Through Differentiable Weight Masks 4
Are Neural Rankers still Outperformed by Gradient Boosted Decision Trees? 3
Are wider nets better given the same number of parameters? 4
Ask Your Humans: Using Human Instructions to Improve Generalization in Reinforcement Learning 4
Async-RED: A Provably Convergent Asynchronous Block Parallel Stochastic Method using Deep Denoising Priors 3
Attentional Constellation Nets for Few-Shot Learning 4
Auction Learning as a Two-Player Game 2
Augmenting Physical Models with Deep Networks for Complex Dynamics Forecasting 3
Auto Seg-Loss: Searching Metric Surrogates for Semantic Segmentation 6
AutoLRS: Automatic Learning-Rate Schedule by Bayesian Optimization on the Fly 6
Autoregressive Dynamics Models for Offline Policy Evaluation and Optimization 4
Autoregressive Entity Retrieval 4
Auxiliary Learning by Implicit Differentiation 6
Average-case Acceleration for Bilinear Games and Normal Matrices 1
BERTology Meets Biology: Interpreting Attention in Protein Language Models 5
BOIL: Towards Representation Change for Few-shot Learning 3
BRECQ: Pushing the Limit of Post-Training Quantization by Block Reconstruction 5
BREEDS: Benchmarks for Subpopulation Shift 5
BSQ: Exploring Bit-Level Sparsity for Mixed-Precision Neural Network Quantization 4
BUSTLE: Bottom-Up Program Synthesis Through Learning-Guided Exploration 4
Bag of Tricks for Adversarial Training 4
Balancing Constraints and Rewards with Meta-Gradient D4PG 4
Batch Reinforcement Learning Through Continuation Method 3
Bayesian Context Aggregation for Neural Processes 3
Bayesian Few-Shot Classification with One-vs-Each Pólya-Gamma Augmented Gaussian Processes 5
Behavioral Cloning from Noisy Demonstrations 3
Benchmarks for Deep Off-Policy Evaluation 3
Benefit of deep learning with non-convex noisy gradient descent: Provable excess risk bound and superiority to kernel methods 0
Better Fine-Tuning by Reducing Representational Collapse 3
Beyond Categorical Label Representations for Image Classification 5
Beyond Fully-Connected Layers with Quaternions: Parameterization of Hypercomplex Multiplications with $1/n$ Parameters 3
BiPointNet: Binary Neural Network for Point Clouds 5
Bidirectional Variational Inference for Non-Autoregressive Text-to-Speech 7
Blending MPC & Value Function Approximation for Efficient Reinforcement Learning 2
Boost then Convolve: Gradient Boosting Meets Graph Neural Networks 5
Bowtie Networks: Generative Modeling for Joint Few-Shot Recognition and Novel-View Synthesis 4
Bypassing the Ambient Dimension: Private SGD with Gradient Subspace Identification 4
Byzantine-Resilient Non-Convex Stochastic Gradient Descent 3
C-Learning: Horizon-Aware Cumulative Accessibility Estimation 4
C-Learning: Learning to Achieve Goals via Recursive Classification 5
CO2: Consistent Contrast for Unsupervised Visual Representation Learning 3
CPR: Classifier-Projection Regularization for Continual Learning 2
CPT: Efficient Deep Neural Network Training via Cyclic Precision 5
CT-Net: Channel Tensorization Network for Video Classification 3
CaPC Learning: Confidential and Private Collaborative Learning 3
Calibration of Neural Networks using Splines 4
Calibration tests beyond classification 5
Can a Fruit Fly Learn Word Embeddings? 4
Capturing Label Characteristics in VAEs 3
Categorical Normalizing Flows via Continuous Transformations 7
CausalWorld: A Robotic Manipulation Benchmark for Causal Structure and Transfer Learning 3
CcGAN: Continuous Conditional Generative Adversarial Networks for Image Generation 4
Certify or Predict: Boosting Certified Robustness with Compositional Architectures 5
Chaos of Learning Beyond Zero-sum and Coordination via Game Decompositions 1
Characterizing signal propagation to close the performance gap in unnormalized ResNets 5
ChipNet: Budget-Aware Pruning with Heaviside Continuous Approximations 6
Clairvoyance: A Pipeline Toolkit for Medical Time Series 6
Class Normalization for (Continual)? Generalized Zero-Shot Learning 5
Clustering-friendly Representation Learning via Instance Discrimination and Feature Decorrelation 3
Co-Mixup: Saliency Guided Joint Mixup with Supermodular Diversity 5
CoCo: Controllable Counterfactuals for Evaluating Dialogue State Trackers 4
CoCon: A Self-Supervised Approach for Controlled Text Generation 5
CoDA: Contrast-enhanced and Diversity-promoting Data Augmentation for Natural Language Understanding 3
Collective Robustness Certificates: Exploiting Interdependence in Graph Neural Networks 5
Colorization Transformer 5
Combining Ensembles and Data Augmentation Can Harm Your Calibration 5
Combining Label Propagation and Simple Models out-performs Graph Neural Networks 3
Combining Physics and Machine Learning for Network Flow Estimation 6
Communication in Multi-Agent Reinforcement Learning: Intention Sharing 2
CompOFA – Compound Once-For-All Networks for Faster Multi-Platform Deployment 5
Complex Query Answering with Neural Link Predictors 4
Computational Separation Between Convolutional and Fully-Connected Networks 2
Concept Learners for Few-Shot Learning 4
Conditional Generative Modeling via Learning the Latent Space 4
Conditional Negative Sampling for Contrastive Learning of Visual Representations 5
Conditionally Adaptive Multi-Task Learning: Improving Transfer Learning in NLP Using Fewer Parameters & Less Data 6
Conformation-Guided Molecular Representation with Hamiltonian Neural Networks 4
Conservative Safety Critics for Exploration 2
Contemplating Real-World Object Classification 2
Contextual Dropout: An Efficient Sample-Dependent Dropout Module 5
Contextual Transformation Networks for Online Continual Learning 7
Continual learning in recurrent neural networks 2
Continuous Wasserstein-2 Barycenter Estimation without Minimax Optimization 5
Contrastive Learning with Adversarial Perturbations for Conditional Text Generation 3
Contrastive Behavioral Similarity Embeddings for Generalization in Reinforcement Learning 4
Contrastive Divergence Learning is a Time Reversal Adversarial Game 3
Contrastive Explanations for Reinforcement Learning via Embedded Self Predictions 4
Contrastive Learning with Hard Negative Samples 5
Contrastive Syn-to-Real Generalization 4
Control-Aware Representations for Model-based Reinforcement Learning 3
Convex Potential Flows: Universal Probability Distributions with Optimal Transport and Convex Optimization 3
Convex Regularization behind Neural Reconstruction 3
Coping with Label Shift via Distributionally Robust Optimisation 4
CopulaGNN: Towards Integrating Representational and Correlational Roles of Graphs in Graph Neural Networks 5
Correcting experience replay for multi-agent communication 3
Counterfactual Generative Networks 4
Coupled Oscillatory Recurrent Neural Network (coRNN): An accurate and (gradient) stable architecture for learning long time dependencies 4
Creative Sketch Generation 5
Cross-Attentional Audio-Visual Fusion for Weakly-Supervised Action Localization 3
Cut out the annotator, keep the cutout: better segmentation with weak supervision 3
DARTS-: Robustly Stepping out of Performance Collapse Without Indicators 6
DC3: A learning method for optimization with hard constraints 6
DDPNOpt: Differential Dynamic Programming Neural Optimizer 4
DEBERTA: DECODING-ENHANCED BERT WITH DISENTANGLED ATTENTION 6
DICE: Diversity in Deep Ensembles via Conditional Redundancy Adversarial Estimation 4
DINO: A Conditional Energy-Based GAN for Domain Translation 4
DOP: Off-Policy Multi-Agent Decomposed Policy Gradients 4
Data-Efficient Reinforcement Learning with Self-Predictive Representations 5
Dataset Condensation with Gradient Matching 6
Dataset Inference: Ownership Resolution in Machine Learning 4
Dataset Meta-Learning from Kernel Ridge-Regression 5
DeLighT: Deep and Light-weight Transformer 6
Debiasing Concept-based Explanations with Causal Analysis 4
Decentralized Attribution of Generative Models 5
Deciphering and Optimizing Multi-Task Learning: a Random Matrix Approach 4
Deconstructing the Regularization of BatchNorm 3
Decoupling Global and Local Representations via Invertible Generative Flows 4
Deep Encoder, Shallow Decoder: Reevaluating Non-autoregressive Machine Translation 6
Deep Equals Shallow for ReLU Networks in Kernel Regimes 4
Deep Learning meets Projective Clustering 5
Deep Networks and the Multiple Manifold Problem 0
Deep Neural Network Fingerprinting by Conferrable Adversarial Examples 4
Deep Neural Tangent Kernel and Laplace Kernel Have the Same RKHS 0
Deep Partition Aggregation: Provable Defenses against General Poisoning Attacks 4
Deep Repulsive Clustering of Ordered Data Based on Order-Identity Decomposition 4
Deep symbolic regression: Recovering mathematical expressions from data via risk-seeking policy gradients 5
DeepAveragers: Offline Reinforcement Learning By Solving Derived Non-Parametric MDPs 4
Deformable DETR: Deformable Transformers for End-to-End Object Detection 5
Degree-Quant: Quantization-Aware Training for Graph Neural Networks 7
Denoising Diffusion Implicit Models 3
Deployment-Efficient Reinforcement Learning via Model-Based Offline Optimization 6
DialoGraph: Incorporating Interpretable Strategy-Graph Networks into Negotiation Dialogues 5
DiffWave: A Versatile Diffusion Model for Audio Synthesis 4
Differentiable Segmentation of Sequences 4
Differentiable Trust Region Layers for Deep Reinforcement Learning 5
Differentially Private Learning Needs Better Features (or Much More Data) 6
Directed Acyclic Graph Neural Networks 4
Direction Matters: On the Implicit Bias of Stochastic Gradient Descent with Moderate Learning Rate 2
Disambiguating Symbolic Expressions in Informal Documents 4
Discovering Diverse Multi-Agent Strategic Behavior via Reward Randomization 4
Discovering Non-monotonic Autoregressive Orderings with Variational Inference 6
Discovering a set of policies for the worst case reward 3
Discrete Graph Structure Learning for Forecasting Multiple Time Series 5
Disentangled Recurrent Wasserstein Autoencoder 4
Disentangling 3D Prototypical Networks for Few-Shot Concept Learning 4
Distance-Based Regularisation of Deep Networks for Fine-Tuning 4
Distilling Knowledge from Reader to Retriever for Question Answering 4
Distributed Momentum for Byzantine-resilient Stochastic Gradient Descent 5
Distributional Sliced-Wasserstein and Applications to Generative Modeling 3
Diverse Video Generation using a Gaussian Process Trigger 2
Do 2D GANs Know 3D Shape? Unsupervised 3D Shape Reconstruction from 2D Image GANs 3
Do Wide and Deep Networks Learn the Same Things? Uncovering How Neural Network Representations Vary with Width and Depth 2
Do not Let Privacy Overbill Utility: Gradient Embedding Perturbation for Private Learning 6
Does enhanced shape bias improve neural network robustness to common corruptions? 2
Domain Generalization with MixStyle 5
Domain-Robust Visual Imitation Learning with Mutual Information Constraints 4
DrNAS: Dirichlet Neural Architecture Search 5
Drop-Bottleneck: Learning Discrete Compressed Representation for Noise-Robust Exploration 2
Dual-mode ASR: Unify and Improve Streaming ASR with Full-context Modeling 3
DynaTune: Dynamic Tensor Program Optimization in Deep Neural Network Compilation 4
Dynamic Tensor Rematerialization 5
ECONOMIC HYPERPARAMETER OPTIMIZATION WITH BLENDED SEARCH STRATEGY 7
EEC: Learning to Encode and Regenerate Images for Continual Learning 5
EVALUATION OF NEURAL ARCHITECTURES TRAINED WITH SQUARE LOSS VS CROSS-ENTROPY IN CLASSIFICATION TASKS 4
Early Stopping in Deep Networks: Double Descent and How to Eliminate it 3
Effective Abstract Reasoning with Dual-Contrast Network 5
Effective Distributed Learning with Random Features: Improved Bounds and Algorithms 4
Effective and Efficient Vote Attack on Capsule Networks 4
Efficient Certified Defenses Against Patch Attacks on Image Classifiers 4
Efficient Conformal Prediction via Cascaded Inference with Expanded Admission 5
Efficient Continual Learning with Modular Networks and Task-Driven Priors 6
Efficient Empowerment Estimation for Unsupervised Stabilization 4
Efficient Generalized Spherical CNNs 2
Efficient Inference of Flexible Interaction in Spiking-neuron Networks 4
Efficient Reinforcement Learning in Factored MDPs with Application to Constrained RL 1
Efficient Transformers in Reinforcement Learning using Actor-Learner Distillation 3
Efficient Wasserstein Natural Gradients for Reinforcement Learning 4
EigenGame: PCA as a Nash Equilibrium 4
Emergent Road Rules In Multi-Agent Driving Environments 3
Emergent Symbols through Binding in External Memory 3
Empirical Analysis of Unlabeled Entity Problem in Named Entity Recognition 4
Empirical or Invariant Risk Minimization? A Sample Complexity Perspective 4
End-to-End Egospheric Spatial Memory 7
End-to-end Adversarial Text-to-Speech 5
Enforcing robust control guarantees within neural network policies 5
Enjoy Your Editing: Controllable GANs for Image Editing via Latent Space Navigation 3
Entropic gradient descent algorithms and wide flat minima 5
Estimating Lipschitz constants of monotone deep equilibrium models 3
Estimating and Evaluating Regression Predictive Uncertainty in Deep Object Detectors 5
Estimating informativeness of samples with Smooth Unique Information 4
Evaluating the Disentanglement of Deep Generative Models through Manifold Topology 4
Evaluation of Similarity-based Explanations 4
Evaluations and Methods for Explanation through Robustness Analysis 5
Evolving Reinforcement Learning Algorithms 4
Exemplary Natural Images Explain CNN Activations Better than State-of-the-Art Feature Visualization 6
Explainable Deep One-Class Classification 6
Explainable Subgraph Reasoning for Forecasting on Temporal Knowledge Graphs 5
Explaining by Imitating: Understanding Decisions by Interpretable Policy Learning 3
Explaining the Efficacy of Counterfactually Augmented Data 3
Exploring Balanced Feature Spaces for Representation Learning 4
Exploring the Uncertainty Properties of Neural Networks’ Implicit Priors in the Infinite-Width Limit 5
Expressive Power of Invariant and Equivariant Graph Neural Networks 4
Extracting Strong Policies for Robotics Tasks from Zero-Order Trajectory Optimizers 4
Extreme Memorization via Scale of Initialization 4
FOCAL: Efficient Fully-Offline Meta-Reinforcement Learning via Distance Metric Learning and Behavior Regularization 5
Factorizing Declarative and Procedural Knowledge in Structured, Dynamical Environments 4
Fair Mixup: Fairness via Interpolation 4
FairBatch: Batch Selection for Model Fairness 5
FairFil: Contrastive Neural Debiasing Method for Pretrained Text Encoders 4
Fantastic Four: Differentiable and Efficient Bounds on Singular Values of Convolution Layers 5
Fast And Slow Learning Of Recurrent Independent Mechanisms 2
Fast Geometric Projections for Local Robustness Certification 5
Fast and Complete: Enabling Complete Neural Network Verification with Rapid and Massively Parallel Incomplete Verifiers 5
Fast convergence of stochastic subgradient method under interpolation 3
FastSpeech 2: Fast and High-Quality End-to-End Text to Speech 4
Faster Binary Embeddings for Preserving Euclidean Distances 4
FedBE: Making Bayesian Model Ensemble Applicable to Federated Learning 5
FedBN: Federated Learning on Non-IID Features via Local Batch Normalization 4
FedMix: Approximation of Mixup under Mean Augmented Federated Learning 4
Federated Learning Based on Dynamic Regularization 3
Federated Learning via Posterior Averaging: A New Perspective and Practical Algorithms 4
Federated Semi-Supervised Learning with Inter-Client Consistency & Disjoint Learning 5
Few-Shot Bayesian Optimization with Deep Kernel Surrogates 4
Few-Shot Learning via Learning the Representation, Provably 0
Fidelity-based Deep Adiabatic Scheduling 1
Filtered Inner Product Projection for Crosslingual Embedding Alignment 5
Flowtron: an Autoregressive Flow-based Generative Network for Text-to-Speech Synthesis 6
Fooling a Complete Neural Network Verifier 5
For self-supervised learning, Rationality implies generalization, provably 3
Fourier Neural Operator for Parametric Partial Differential Equations 2
Free Lunch for Few-shot Learning: Distribution Calibration 5
Fully Unsupervised Diversity Denoising with Convolutional Variational Autoencoders 4
Fuzzy Tiling Activations: A Simple Approach to Learning Sparse Representations Online 5
GAN "Steerability" without optimization 1
GAN2GAN: Generative Noise Learning for Blind Denoising with Single Noisy Images 5
GANs Can Play Lottery Tickets Too 3
GShard: Scaling Giant Models with Conditional Computation and Automatic Sharding 4
Gauge Equivariant Mesh CNNs: Anisotropic convolutions on geometric graphs 3
Generalization bounds via distillation 2
Generalization in data-driven models of primary visual cortex 4
Generalized Energy Based Models 5
Generalized Multimodal ELBO 4
Generalized Variational Continual Learning 4
Generating Adversarial Computer Programs using Optimized Obfuscations 3
Generating Furry Cars: Disentangling Object Shape and Appearance across Multiple Domains 3
Generative Language-Grounded Policy in Vision-and-Language Navigation with Bayes' Rule 5
Generative Scene Graph Networks 4
Generative Time-series Modeling with Fourier Flows 2
Genetic Soft Updates for Policy Evolution in Deep Reinforcement Learning 3
Geometry-Aware Gradient Algorithms for Neural Architecture Search 6
Geometry-aware Instance-reweighted Adversarial Training 4
Getting a CLUE: A Method for Explaining Uncertainty Estimates 5
Global Convergence of Three-layer Neural Networks in the Mean Field Regime 0
Global optimality of softmax policy gradient with single hidden layer neural networks in the mean-field regime 1
Go with the flow: Adaptive control for Neural ODEs 5
GraPPa: Grammar-Augmented Pre-Training for Table Semantic Parsing 5
Gradient Descent on Neural Networks Typically Occurs at the Edge of Stability 2
Gradient Origin Networks 5
Gradient Projection Memory for Continual Learning 6
Gradient Vaccine: Investigating and Improving Multi-task Optimization in Massively Multilingual Models 4
Graph Coarsening with Neural Networks 5
Graph Convolution with Low-rank Learnable Local Filters 5
Graph Edit Networks 6
Graph Information Bottleneck for Subgraph Recognition 4
Graph Traversal with Tensor Functionals: A Meta-Algorithm for Scalable Learning 6
Graph-Based Continual Learning 2
GraphCodeBERT: Pre-training Code Representations with Data Flow 5
Greedy-GQ with Variance Reduction: Finite-time Analysis and Improved Complexity 2
Grounded Language Learning Fast and Slow 2
Grounding Language to Autonomously-Acquired Skills via Goal Generation 3
Grounding Physical Concepts of Objects and Events Through Dynamic Visual Reasoning 3
Group Equivariant Conditional Neural Processes 4
Group Equivariant Generative Adversarial Networks 4
Group Equivariant Stand-Alone Self-Attention For Vision 4
Growing Efficient Deep Networks by Structured Continuous Sparsification 4
HW-NAS-Bench: Hardware-Aware Neural Architecture Search Benchmark 4
HalentNet: Multimodal Trajectory Forecasting with Hallucinative Intents 6
Heating up decision boundaries: isocapacitory saturation, adversarial scenarios and generalization bounds 7
HeteroFL: Computation and Communication Efficient Federated Learning for Heterogeneous Clients 3
Heteroskedastic and Imbalanced Deep Learning with Adaptive Regularization 6
Hierarchical Autoregressive Modeling for Neural Video Compression 4
Hierarchical Reinforcement Learning by Discovering Intrinsic Options 4
High-Capacity Expert Binary Networks 4
Hopfield Networks is All You Need 5
Hopper: Multi-hop Transformer for Spatiotemporal Reasoning 3
How Benign is Benign Overfitting ? 2
How Does Mixup Help With Robustness and Generalization? 3
How Much Over-parameterization Is Sufficient to Learn Deep ReLU Networks? 2
How Neural Networks Extrapolate: From Feedforward to Graph Neural Networks 2
How to Find Your Friendly Neighborhood: Graph Attention Design with Self-Supervision 5
Human-Level Performance in No-Press Diplomacy via Equilibrium Search 2
HyperDynamics: Meta-Learning Object and Agent Dynamics with Hypernetworks 2
HyperGrid Transformers: Towards A Single Model for Multiple Tasks 4
Hyperbolic Neural Networks++ 3
IDF++: Analyzing and Improving Integer Discrete Flows for Lossless Compression 4
IEPT: Instance-Level and Episode-Level Pretext Tasks for Few-Shot Learning 4
INT: An Inequality Benchmark for Evaluating Generalization in Theorem Proving 6
IOT: Instance-wise Layer Reordering for Transformer Structures 5
Identifying Physical Law of Hamiltonian Systems via Meta-Learning 4
Identifying nonlinear dynamical systems with multiple time scales and long-range dependencies 3
Image Augmentation Is All You Need: Regularizing Deep Reinforcement Learning from Pixels 6
Image GANs meet Differentiable Rendering for Inverse Graphics and Interpretable 3D Neural Rendering 3
Impact of Representation Learning in Linear Bandits 3
Implicit Convex Regularizers of CNN Architectures: Convex Optimization of Two- and Three-Layer Networks in Polynomial Time 2
Implicit Gradient Regularization 2
Implicit Normalizing Flows 6
Implicit Under-Parameterization Inhibits Data-Efficient Deep Reinforcement Learning 4
Improve Object Detection with Feature-based Knowledge Distillation: Towards Accurate and Efficient Detectors 4
Improved Autoregressive Modeling with Distribution Smoothing 2
Improved Estimation of Concentration Under $\ell_p$-Norm Distance Metrics Using Half Spaces 4
Improving Adversarial Robustness via Channel-wise Activation Suppressing 4
Improving Relational Regularized Autoencoders with Spherical Sliced Fused Gromov Wasserstein 4
Improving Transformation Invariance in Contrastive Representation Learning 5
Improving VAEs' Robustness to Adversarial Attack 2
Improving Zero-Shot Voice Style Transfer via Disentangled Representation Learning 4
In Defense of Pseudo-Labeling: An Uncertainty-Aware Pseudo-label Selection Framework for Semi-Supervised Learning 4
In Search of Lost Domain Generalization 5
In-N-Out: Pre-Training and Self-Training using Auxiliary Information for Out-of-Distribution Robustness 5
Incorporating Symmetry into Deep Dynamics Models for Improved Generalization 5
Incremental few-shot learning via vector quantization in deep embedded space 4
Individually Fair Gradient Boosting 5
Individually Fair Rankings 4
Inductive Representation Learning in Temporal Networks via Causal Anonymous Walks 6
Influence Estimation for Generative Adversarial Networks 4
Influence Functions in Deep Learning Are Fragile 2
InfoBERT: Improving Robustness of Language Models from An Information Theoretic Perspective 6
Information Laundering for Model Privacy 4
Initialization and Regularization of Factorized Neural Layers 3
Integrating Categorical Semantics into Unsupervised Domain Translation 4
Interactive Weak Supervision: Learning Useful Heuristics for Data Labeling 5
Interpretable Models for Granger Causality Using Self-explaining Neural Networks 6
Interpretable Neural Architecture Search via Bayesian Optimisation with Weisfeiler-Lehman Kernels 6
Interpreting Graph Neural Networks for NLP With Differentiable Edge Masking 5
Interpreting Knowledge Graph Relation Representation from Word Embeddings 3
Interpreting and Boosting Dropout from a Game-Theoretic View 3
Into the Wild with AudioScope: Unsupervised Audio-Visual Separation of On-Screen Sounds 5
Intraclass clustering: an implicit learning ability that regularizes DNNs 2
Intrinsic-Extrinsic Convolution and Pooling for Learning on 3D Protein Structures 4
Is Attention Better Than Matrix Decomposition? 6
Is Label Smoothing Truly Incompatible with Knowledge Distillation: An Empirical Study 4
IsarStep: a Benchmark for High-level Mathematical Reasoning 5
Isometric Propagation Network for Generalized Zero-shot Learning 3
Isometric Transformation Invariant and Equivariant Graph Convolutional Networks 6
Isotropy in the Contextual Embedding Space: Clusters and Manifolds 2
Iterated learning for emergent systematicity in VQA 5
Iterative Empirical Game Solving via Single Policy Best Response 3
Kanerva++: Extending the Kanerva Machine With Differentiable, Locally Block Allocated Latent Memory 3
Knowledge Distillation as Semiparametric Inference 4
Knowledge distillation via softmax regression representation learning 4
LEAF: A Learnable Frontend for Audio Classification 4
LambdaNetworks: Modeling long-range Interactions without Attention 5
Language-Agnostic Representation Learning of Source Code from Structure and Context 4
Large Associative Memory Problem in Neurobiology and Machine Learning 0
Large Batch Simulation for Deep Reinforcement Learning 5
Large Scale Image Completion via Co-Modulated Generative Adversarial Networks 5
Large-width functional asymptotics for deep Gaussian neural networks 0
Latent Convergent Cross Mapping 4
Latent Skill Planning for Exploration and Transfer 3
Layer-adaptive Sparsity for the Magnitude-based Pruning 4
Learnable Embedding sizes for Recommender Systems 6
Learning "What-if" Explanations for Sequential Decision-Making 5
Learning A Minimax Optimizer: A Pilot Study 5
Learning Accurate Entropy Model with Global Reference for Image Compression 2
Learning Associative Inference Using Fast Weight Memory 6
Learning Better Structured Representations Using Low-rank Adaptive Label Smoothing 4
Learning Cross-Domain Correspondence for Control with Dynamics Cycle-Consistency 3
Learning Deep Features in Instrumental Variable Regression 6
Learning Energy-Based Generative Models via Coarse-to-Fine Expanding and Sampling 4
Learning Energy-Based Models by Diffusion Recovery Likelihood 4
Learning Generalizable Visual Representations via Interactive Gameplay 5
Learning Hyperbolic Representations of Topological Features 5
Learning Incompressible Fluid Dynamics from Scratch - Towards Fast, Differentiable Fluid Models that Generalize 3
Learning Invariant Representations for Reinforcement Learning without Reconstruction 4
Learning Long-term Visual Dynamics with Region Proposal Interaction Networks 5
Learning Manifold Patch-Based Representations of Man-Made Shapes 4
Learning Mesh-Based Simulation with Graph Networks 4
Learning N:M Fine-grained Structured Sparse Neural Networks From Scratch 6
Learning Neural Event Functions for Ordinary Differential Equations 6
Learning Neural Generative Dynamics for Molecular Conformation Generation 6
Learning Parametrised Graph Shift Operators 3
Learning Reasoning Paths over Semantic Graphs for Video-grounded Dialogues 5
Learning Robust State Abstractions for Hidden-Parameter Block MDPs 4
Learning Safe Multi-agent Control with Decentralized Neural Barrier Certificates 2
Learning Structural Edits via Incremental Tree Transformations 5
Learning Subgoal Representations with Slow Dynamics 4
Learning Task Decomposition with Ordered Memory Policy Network 3
Learning Task-General Representations with Generative Neuro-Symbolic Modeling 3
Learning Value Functions in Deep Policy Gradients using Residual Variance 3
Learning What To Do by Simulating the Past 3
Learning a Latent Search Space for Routing Problems using Variational Autoencoders 5
Learning a Latent Simplex in Input Sparsity Time 4
Learning advanced mathematical computations from examples 4
Learning and Evaluating Representations for Deep One-Class Classification 4
Learning continuous-time PDEs from sparse data with graph neural networks 2
Learning explanations that are hard to vary 5
Learning from Demonstration with Weakly Supervised Disentanglement 4
Learning from Protein Structure with Geometric Vector Perceptrons 7
Learning from others' mistakes: Avoiding dataset biases without modeling them 4
Learning perturbation sets for robust machine learning 6
Learning the Pareto Front with Hypernetworks 6
Learning to Deceive Knowledge Graph Augmented Models via Targeted Perturbation 4
Learning to Generate 3D Shapes with Generative Cellular Automata 4
Learning to Make Decisions via Submodular Regularization 5
Learning to Reach Goals via Iterated Supervised Learning 4
Learning to Recombine and Resample Data For Compositional Generalization 6
Learning to Represent Action Values as a Hypergraph on the Action Vertices 3
Learning to Sample with Local and Global Contexts in Experience Replay Buffer 4
Learning to Set Waypoints for Audio-Visual Navigation 3
Learning to live with Dale's principle: ANNs with separate excitatory and inhibitory units 5
Learning with AMIGo: Adversarially Motivated Intrinsic Goals 3
Learning with Feature-Dependent Label Noise: A Progressive Approach 5
Learning with Instance-Dependent Label Noise: A Sample Sieve Approach 5
Learning-based Support Estimation in Sublinear Time 3
Lifelong Learning of Compositional Structures 5
LiftPool: Bidirectional ConvNet Pooling 4
Linear Convergent Decentralized Optimization with Compression 3
Linear Last-iterate Convergence in Constrained Saddle-point Optimization 1
Linear Mode Connectivity in Multitask and Continual Learning 5
Lipschitz Recurrent Neural Networks 4
Local Convergence Analysis of Gradient Descent Ascent with Finite Timescale Separation 3
Local Search Algorithms for Rank-Constrained Convex Optimization 5
Locally Free Weight Sharing for Network Width Search 5
Long Live the Lottery: The Existence of Winning Tickets in Lifelong Learning 6
Long Range Arena: A Benchmark for Efficient Transformers 4
Long-tail learning via logit adjustment 2
Long-tailed Recognition by Routing Diverse Distribution-Aware Experts 5
Loss Function Discovery for Object Detection via Convergence-Simulation Driven Search 5
Lossless Compression of Structured Convolutional Models via Lifting 3
LowKey: Leveraging Adversarial Attacks to Protect Social Media Users from Facial Recognition 5
MALI: A memory efficient and reverse accurate integrator for Neural ODEs 5
MARS: Markov Molecular Sampling for Multi-objective Drug Discovery 5
MELR: Meta-Learning via Modeling Episode-Level Relationships for Few-Shot Learning 4
Mirostat: A Neural Text Decoding Algorithm that Directly Controls Perplexity 4
MODALS: Modality-agnostic Automated Data Augmentation in the Latent Space 4
MONGOOSE: A Learnable LSH Framework for Efficient Neural Network Training 4
Mapping the Timescale Organization of Neural Language Models 3
Mastering Atari with Discrete World Models 5
Mathematical Reasoning via Self-supervised Skip-tree Training 4
Measuring Massive Multitask Language Understanding 4
Memory Optimization for Deep Networks 6
Meta Back-Translation 4
Meta-GMVAE: Mixture of Gaussian VAE for Unsupervised Meta-Learning 5
Meta-Learning of Structured Task Distributions in Humans and Machines 2
Meta-Learning with Neural Tangent Kernels 4
Meta-learning Symmetries by Reparameterization 6
Meta-learning with negative learning rates 2
MetaNorm: Learning to Normalize Few-Shot Batches Across Domains 6
MiCE: Mixture of Contrastive Experts for Unsupervised Image Clustering 5
Mind the Gap when Conditioning Amortised Inference in Sequential Latent-Variable Models 3
Mind the Pad -- CNNs Can Develop Blind Spots 2
Minimum Width for Universal Approximation 0
MixKD: Towards Efficient Distillation of Large-scale Language Models 4
Mixed-Features Vectors and Subspace Splitting 1
MoPro: Webly Supervised Learning with Momentum Prototypes 5
MoVie: Revisiting Modulated Convolutions for Visual Counting and Beyond 3
Model Patching: Closing the Subgroup Performance Gap with Data Augmentation 5
Model-Based Offline Planning 5
Model-Based Visual Planning with Self-Supervised Functional Distances 3
Model-based micro-data reinforcement learning: what are the crucial model properties and which model to choose? 4
Modeling the Second Player in Distributionally Robust Optimization 4
Modelling Hierarchical Structure between Dialogue Policy and Natural Language Generator with Option Framework for Task-oriented Dialogue System 5
Molecule Optimization by Explainable Evolution 5
Monotonic Kronecker-Factored Lattice 6
Monte-Carlo Planning and Learning with Language Action Value Estimates 4
More or Less: When and How to Build Convolutional Neural Network Ensembles 3
Multi-Class Uncertainty Calibration via Mutual Information Maximization-based Binning 5
Multi-Level Local SGD: Distributed SGD for Heterogeneous Hierarchical Networks 5
Multi-Prize Lottery Ticket Hypothesis: Finding Accurate Binary Neural Networks by Pruning A Randomly Weighted Network 4
Multi-Time Attention Networks for Irregularly Sampled Time Series 5
Multi-resolution modeling of a discrete stochastic process identifies causes of cancer 4
Multi-timescale Representation Learning in LSTM Language Models 5
MultiModalQA: complex question answering over text, tables and images 3
Multiplicative Filter Networks 3
Multiscale Score Matching for Out-of-Distribution Detection 4
Multivariate Probabilistic Time Series Forecasting via Conditioned Normalizing Flows 6
Mutual Information State Intrinsic Control 4
My Body is a Cage: the Role of Morphology in Graph-Based Incompatible Control 3
NAS-Bench-ASR: Reproducible Neural Architecture Search for Speech Recognition 5
NBDT: Neural-Backed Decision Tree 4
NOVAS: Non-convex Optimization via Adaptive Stochastic Search for End-to-end Learning and Control 6
NeMo: Neural Mesh Models of Contrastive Features for Robust 3D Pose Estimation 5
Nearest Neighbor Machine Translation 3
Negative Data Augmentation 3
Net-DNF: Effective Deep Modeling of Tabular Data 6
Network Pruning That Matters: A Case Study on Retraining Variants 3
Neural Approximate Sufficient Statistics for Implicit Models 4
Neural Architecture Search on ImageNet in Four GPU Hours: A Theoretically Inspired Perspective 6
Neural Attention Distillation: Erasing Backdoor Triggers from Deep Neural Networks 4
Neural Delay Differential Equations 4
Neural Jump Ordinary Differential Equations: Consistent Continuous-Time Prediction and Filtering 5
Neural Learning of One-of-Many Solutions for Combinatorial Problems in Structured Output Spaces 6
Neural Mechanics: Symmetry and Broken Conservation Laws in Deep Learning Dynamics 3
Neural Networks for Learning Counterfactual G-Invariances from Single Environments 5
Neural ODE Processes 6
Neural Pruning via Growing Regularization 5
Neural Spatio-Temporal Point Processes 4
Neural Synthesis of Binaural Speech From Mono Audio 5
Neural Thompson Sampling 3
Neural Topic Model via Optimal Transport 4
Neural gradients are near-lognormal: improved quantized and sparse training 4
Neural networks with late-phase weights 6
Neural representation and generation for RNA secondary structures 4
Neurally Augmented ALISTA 4
New Bounds For Distributed Mean Estimation and Variance Reduction 1
No Cost Likelihood Manipulation at Test Time for Making Better Mistakes in Deep Networks 3
No MCMC for me: Amortized sampling for fast and stable training of energy-based models 5
Noise against noise: stochastic label noise helps combat inherent label noise 5
Noise or Signal: The Role of Image Backgrounds in Object Recognition 3
Non-asymptotic Confidence Intervals of Off-policy Evaluation: Primal and Dual Bounds 3
Nonseparable Symplectic Neural Networks 3
OPAL: Offline Primitive Discovery for Accelerating Offline Reinforcement Learning 3
Off-Dynamics Reinforcement Learning: Training for Transfer with Domain Classifiers 3
Offline Model-Based Optimization via Normalized Maximum Likelihood Estimation 4
On Data-Augmentation and Consistency-Based Semi-Supervised Learning 1
On Dyadic Fairness: Exploring and Mitigating Bias in Graph Connections 4
On Fast Adversarial Robustness Adaptation in Model-Agnostic Meta-Learning 5
On Graph Neural Networks versus Graph-Augmented MLPs 3
On InstaHide, Phase Retrieval, and Sparse Matrix Factorization 3
On Learning Universal Representations Across Languages 4
On Position Embeddings in BERT 3
On Self-Supervised Image Representations for GAN Evaluation 3
On Statistical Bias In Active Learning: How and When to Fix It 4
On the Bottleneck of Graph Neural Networks and its Practical Implications 4
On the Critical Role of Conventions in Adaptive Human-AI Collaboration 4
On the Curse of Memory in Recurrent Neural Networks: Approximation and Optimization Analysis 0
On the Dynamics of Training Attention Models 3
On the Impossibility of Global Convergence in Multi-Loss Optimization 2
On the Origin of Implicit Regularization in Stochastic Gradient Descent 2
On the Stability of Fine-tuning BERT: Misconceptions, Explanations, and Strong Baselines 4
On the Theory of Implicit Deep Learning: Global Convergence with Implicit Layers 3
On the Transfer of Disentangled Representations in Realistic Settings 3
On the Universality of Rotation Equivariant Point Cloud Networks 4
On the Universality of the Double Descent Peak in Ridgeless Regression 2
On the geometry of generalization and memorization in deep neural networks 2
On the mapping between Hopfield networks and Restricted Boltzmann Machines 2
On the role of planning in model-based deep reinforcement learning 4
One Network Fits All? Modular versus Monolithic Task Formulations in Neural Networks 3
Online Adversarial Purification based on Self-supervised Learning 5
Open Question Answering over Tables and Text 4
Optimal Conversion of Conventional Artificial Neural Networks to Spiking Neural Networks 4
Optimal Rates for Averaged Stochastic Gradient Descent under Neural Tangent Kernel Regime 2
Optimal Regularization can Mitigate Double Descent 2
Optimism in Reinforcement Learning with Generalized Linear Function Approximation 1
Optimizing Memory Placement using Evolutionary Graph Reinforcement Learning 4
Orthogonalizing Convolutional Layers with the Cayley Transform 6
Overfitting for Fun and Profit: Instance-Adaptive Data Compression 4
Overparameterisation and worst-case generalisation: friend or foe? 3
PAC Confidence Predictions for Deep Neural Network Classifiers 3
PC2WF: 3D Wireframe Reconstruction from Raw Point Clouds 2
PDE-Driven Spatiotemporal Disentanglement 5
PMI-Masking: Principled masking of correlated spans 3
PSTNet: Point Spatio-Temporal Convolution on Point Cloud Sequences 4
Parameter Efficient Multimodal Transformers for Video Representation Learning 3
Parameter-Based Value Functions 6
Parrot: Data-Driven Behavioral Priors for Reinforcement Learning 3
Partitioned Learned Bloom Filters 4
Perceptual Adversarial Robustness: Defense Against Unseen Threat Models 6
Personalized Federated Learning with First Order Model Optimization 3
Physics-aware, probabilistic model order reduction with guaranteed stability 1
Plan-Based Relaxed Reward Shaping for Goal-Directed Tasks 2
Planning from Pixels using Inverse Dynamics Models 4
PlasticineLab: A Soft-Body Manipulation Benchmark with Differentiable Physics 4
PolarNet: Learning to Optimize Polar Keypoints for Keypoint Based Object Detection 5
Policy-Driven Attack: Learning to Query for Hard-label Black-box Adversarial Examples 6
Practical Massively Parallel Monte-Carlo Tree Search Applied to Molecular Design 5
Practical Real Time Recurrent Learning with a Sparse Approximation 3
Pre-training Text-to-Text Transformers for Concept-centric Common Sense 5
Predicting Classification Accuracy When Adding New Unobserved Classes 6
Predicting Inductive Biases of Pre-Trained Models 3
Predicting Infectiousness for Proactive Contact Tracing 3
Prediction and generalisation over directed actions by grid cells 3
Primal Wasserstein Imitation Learning 5
Private Image Reconstruction from System Side Channels Using Generative Models 5
Private Post-GAN Boosting 4
Probabilistic Numeric Convolutional Neural Networks 3
Probing BERT in Hyperbolic Spaces 4
Progressive Skeletonization: Trimming more fat from a network at initialization 5
Projected Latent Markov Chain Monte Carlo: Conditional Sampling of Normalizing Flows 4
Property Controllable Variational Autoencoder via Invertible Mutual Dependence 4
Protecting DNNs from Theft using an Ensemble of Diverse Models 3
Prototypical Contrastive Learning of Unsupervised Representations 6
Prototypical Representation Learning for Relation Extraction 6
Provable Rich Observation Reinforcement Learning with Combinatorial Latent States 5
Provably robust classification of adversarial examples with detection 5
Proximal Gradient Descent-Ascent: Variable Convergence under KŁ Geometry 1
Pruning Neural Networks at Initialization: Why Are We Missing the Mark? 2
PseudoSeg: Designing Pseudo Labels for Semantic Segmentation 4
QPLEX: Duplex Dueling Multi-Agent Q-Learning 3
Quantifying Differences in Reward Functions 4
R-GAP: Recursive Gradient Attack on Privacy 5
RMSprop converges with proper hyper-parameter 4
RNNLogic: Learning Logic Rules for Reasoning on Knowledge Graphs 5
RODE: Learning Roles to Decompose Multi-Agent Tasks 3
Random Feature Attention 5
Randomized Automatic Differentiation 6
Randomized Ensembled Double Q-Learning: Learning Fast Without a Model 5
Rank the Episodes: A Simple Approach for Exploration in Procedurally-Generated Environments 4
Rao-Blackwellizing the Straight-Through Gumbel-Softmax Gradient Estimator 3
Rapid Neural Architecture Search by Learning to Generate Graphs from Datasets 5
Rapid Task-Solving in Novel Environments 2
Recurrent Independent Mechanisms 4
Reducing the Computational Cost of Deep Generative Models with Binary Neural Networks 2
Refining Deep Generative Models via Discriminator Gradient Flow 5
Regularization Matters in Policy Optimization - An Empirical Study on Continuous Control 4
Regularized Inverse Reinforcement Learning 4
Reinforcement Learning with Random Delays 3
Relating by Contrasting: A Data-efficient Framework for Multimodal Generative Models 2
Remembering for the Right Reasons: Explanations Reduce Catastrophic Forgetting 5
Removing Undesirable Feature Contributions Using Out-of-Distribution Data 4
Representation Balancing Offline Model-based Reinforcement Learning 6
Representation Learning for Sequence Data with Deep Autoencoding Predictive Components 4
Representation Learning via Invariant Causal Mechanisms 1
Representation learning for improved interpretability and classification accuracy of clinical factors from EEG 2
Representing Partial Programs with Blended Abstract Semantics 2
Repurposing Pretrained Models for Robust Out-of-domain Few-Shot Learning 5
ResNet After All: Neural ODEs and Their Numerical Solution 5
Reset-Free Lifelong Learning with Skill-Space Planning 4
Rethinking Architecture Selection in Differentiable NAS 4
Rethinking Attention with Performers 6
Rethinking Embedding Coupling in Pre-trained Language Models 5
Rethinking Positional Encoding in Language Pre-training 5
Rethinking Soft Labels for Knowledge Distillation: A Bias–Variance Tradeoff Perspective 4
Rethinking the Role of Gradient-based Attribution Methods for Model Interpretability 2
Retrieval-Augmented Generation for Code Summarization via Hybrid GNN 5
Return-Based Contrastive Representation Learning for Reinforcement Learning 3
Revisiting Dynamic Convolution via Matrix Decomposition 5
Revisiting Few-sample BERT Fine-tuning 5
Revisiting Hierarchical Approach for Persistent Long-Term Video Prediction 5
Revisiting Locally Supervised Learning: an Alternative to End-to-end Training 5
Reweighting Augmented Samples by Minimizing the Maximal Expected Loss 5
Ringing ReLUs: Harmonic Distortion Analysis of Nonlinear Feedforward Networks 5
Risk-Averse Offline Reinforcement Learning 5
Robust Curriculum Learning: from clean label detection to noisy label self-correction 3
Robust Learning of Fixed-Structure Bayesian Networks in Nearly-Linear Time 1
Robust Overfitting may be mitigated by properly learned smoothening 5
Robust Pruning at Initialization 4
Robust Reinforcement Learning on State Observations with Learned Optimal Adversary 4
Robust and Generalizable Visual Representation Learning via Random Convolutions 5
Robust early-learning: Hindering the memorization of noisy labels 7
SAFENet: A Secure, Accurate and Fast Neural Network Inference 4
SALD: Sign Agnostic Learning with Derivatives 5
SCoRe: Pre-Training for Context Representation in Conversational Semantic Parsing 5
SEDONA: Search for Decoupled Neural Networks toward Greedy Block-wise Learning 6
SEED: Self-supervised Distillation For Visual Representation 5
SMiRL: Surprise Minimizing Reinforcement Learning in Unstable Environments 2
SOLAR: Sparse Orthogonal Learned and Random Embeddings 5
SSD: A Unified Framework for Self-Supervised Outlier Detection 5
Saliency is a Possible Red Herring When Diagnosing Poor Generalization 4
SaliencyMix: A Saliency Guided Data Augmentation Strategy for Better Regularization 5
Sample-Efficient Automated Deep Reinforcement Learning 4
Scalable Bayesian Inverse Reinforcement Learning 5
Scalable Learning and MAP Inference for Nonsymmetric Determinantal Point Processes 5
Scalable Transfer Learning with Expert Models 3
Scaling Symbolic Methods using Gradients for Neural Model Explanation 4
Scaling the Convex Barrier with Active Sets 5
Score-Based Generative Modeling through Stochastic Differential Equations 4
Selective Classification Can Magnify Disparities Across Groups 5
Selectivity considered harmful: evaluating the causal impact of class selectivity in DNNs 3
Self-Supervised Learning of Compressed Video Representations 5
Self-Supervised Policy Adaptation during Deployment 3
Self-supervised Adversarial Robustness for the Low-label, High-data Regime 4
Self-supervised Learning from a Multi-view Perspective 4
Self-supervised Representation Learning with Relative Predictive Coding 4
Self-supervised Visual Reinforcement Learning with Object-centric Representations 2
Self-training For Few-shot Transfer Across Extreme Task Differences 4
Semantic Re-tuning with Contrastive Tension 4
Semi-supervised Keypoint Localization 3
SenSeI: Sensitive Set Invariance for Enforcing Individual Fairness 4
Separation and Concentration in Deep Networks 3
Seq2Tens: An Efficient Representation of Sequences by Low-Rank Tensor Projections 4
Sequential Density Ratio Estimation for Simultaneous Optimization of Speed and Accuracy 6
Set Prediction without Imposing Structure as Conditional Density Estimation 3
Shape or Texture: Understanding Discriminative Features in CNNs 4
Shape-Texture Debiased Neural Network Training 4
Shapley Explanation Networks 5
Shapley explainability on the data manifold 2
Share or Not? Learning to Schedule Language-Specific Capacity for Multilingual Translation 4
Sharper Generalization Bounds for Learning with Gradient-dominated Objective Functions 4
Sharpness-aware Minimization for Efficiently Improving Generalization 6
Signatory: differentiable computations of the signature and logsignature transforms, on both CPU and GPU 3
Simple Augmentation Goes a Long Way: ADRL for DNN Quantization 4
Simple Spectral Graph Convolution 4
Single-Photon Image Classification 2
Single-Timescale Actor-Critic Provably Finds Globally Optimal Policy 1
SkipW: Resource Adaptable RNN with Strict Upper Computational Limit 4
Sliced Kernelized Stein Discrepancy 3
Solving Compositional Reinforcement Learning Problems via Task Reduction 3
Sparse Quantized Spectral Clustering 2
Sparse encoding for more-interpretable feature-selecting representations in probabilistic matrix factorization 3
Spatial Dependency Networks: Neural Layers for Improved Generative Image Modeling 5
Spatially Structured Recurrent Modules 5
Spatio-Temporal Graph Scattering Transform 3
Stabilized Medical Image Attacks 4
Statistical inference for individual fairness 3
Stochastic Security: Adversarial Defense Using Long-Run Dynamics of Energy-Based Models 5
Structured Prediction as Translation between Augmented Natural Languages 5
Supervised Contrastive Learning for Pre-trained Language Model Fine-tuning 3
Support-set bottlenecks for video-text representation learning 4
Symmetry-Aware Actor-Critic for 3D Molecular Design 5
Systematic generalisation with group invariant predictions 4
Taking Notes on the Fly Helps Language Pre-Training 5
Taming GANs with Lookahead-Minmax 4
Targeted Attack against Deep Neural Networks via Flipping Limited Weight Bits 5
Task-Agnostic Morphology Evolution 3
Teaching Temporal Logics to Neural Networks 3
Teaching with Commentaries 5
Temporally-Extended ε-Greedy Exploration 4
Tent: Fully Test-Time Adaptation by Entropy Minimization 4
Text Generation by Learning from Demonstrations 6
The Deep Bootstrap Framework: Good Online Learners are Good Offline Generalizers 4
The Importance of Pessimism in Fixed-Dataset Policy Optimization 4
The Intrinsic Dimension of Images and Its Impact on Learning 2
The Recurrent Neural Tangent Kernel 3
The Risks of Invariant Risk Minimization 1
The Role of Momentum Parameters in the Optimal Convergence of Adaptive Polyak's Heavy-ball Methods 4
The Traveling Observer Model: Multi-task Learning Through Spatial Variable Embeddings 4
The Unreasonable Effectiveness of Patches in Deep Convolutional Kernels Methods 3
The geometry of integration in text classification RNNs 3
The inductive bias of ReLU networks on orthogonally separable data 2
The role of Disentanglement in Generalisation 3
Theoretical Analysis of Self-Training with Deep Networks on Unlabeled Data 2
Theoretical bounds on estimation error for meta-learning 2
Tilted Empirical Risk Minimization 5
Tomographic Auto-Encoder: Unsupervised Bayesian Recovery of Corrupted Data 4
Topology-Aware Segmentation Using Discrete Morse Theory 3
Towards Faster and Stabilized GAN Training for High-fidelity Few-shot Image Synthesis 4
Towards Impartial Multi-task Learning 5
Towards Nonlinear Disentanglement in Natural Data with Temporal Sparse Coding 3
Towards Resolving the Implicit Bias of Gradient Descent for Matrix Factorization: Greedy Low-Rank Learning 2
Towards Robust Neural Networks via Close-loop Control 4
Towards Robustness Against Natural Language Word Substitutions 5
Tradeoffs in Data Augmentation: An Empirical Study 4
Training BatchNorm and Only BatchNorm: On the Expressive Power of Random Features in CNNs 3
Training GANs with Stronger Augmentations via Contrastive Discriminator 5
Training independent subnetworks for robust prediction 5
Training with Quantization Noise for Extreme Model Compression 4
Trajectory Prediction using Equivariant Continuous Convolution 4
Transformer protein language models are unsupervised structure learners 5
Transient Non-stationarity and Generalisation in Deep Reinforcement Learning 3
TropEx: An Algorithm for Extracting Linear Terms in Deep Neural Networks 4
Trusted Multi-View Classification 4
UMEC: Unified model and embedding compression for efficient recommendation systems 6
UPDeT: Universal Multi-agent RL via Policy Decoupling with Transformers 3
Unbiased Teacher for Semi-Supervised Object Detection 4
Uncertainty Estimation and Calibration with Finite-State Probabilistic RNNs 5
Uncertainty Estimation in Autoregressive Structured Prediction 4
Uncertainty Sets for Image Classifiers using Conformal Prediction 5
Uncertainty in Gradient Boosting via Ensembles 4
Uncertainty-aware Active Learning for Optimal Bayesian Classifier 4
Understanding Over-parameterization in Generative Adversarial Networks 2
Understanding and Improving Encoder Layer Fusion in Sequence-to-Sequence Learning 4
Understanding and Improving Lexical Choice in Non-Autoregressive Translation 4
Understanding the effects of data parallelism and sparsity on neural network training 5
Understanding the failure modes of out-of-distribution generalization 3
Understanding the role of importance weighting for deep learning 1
Undistillable: Making A Nasty Teacher That CANNOT teach students 3
Universal Weakly Supervised Segmentation by Pixel-to-Segment Contrastive Learning 5
Universal approximation power of deep residual neural networks via nonlinear control theory 0
Unlearnable Examples: Making Personal Data Unexploitable 4
Unsupervised Audiovisual Synthesis via Exemplar Autoencoders 3
Unsupervised Discovery of 3D Physical Objects from Video 2
Unsupervised Meta-Learning through Latent-Space Interpolation in Generative Models 5
Unsupervised Object Keypoint Learning using Local Spatial Predictability 4
Unsupervised Representation Learning for Time Series with Temporal Neighborhood Coding 4
Usable Information and Evolution of Optimal Representations During Training 3
Using latent space regression to analyze and leverage compositionality in GANs 3
VA-RED$^2$: Video Adaptive Redundancy Reduction 6
VAEBM: A Symbiosis between Variational Autoencoders and Energy-based Models 4
VCNet and Functional Targeted Regularization For Learning Causal Effects of Continuous Treatments 4
VTNet: Visual Transformer Network for Object Goal Navigation 3
Variational Information Bottleneck for Effective Low-Resource Fine-Tuning 5
Variational Intrinsic Control Revisited 2
Variational State-Space Models for Localisation and Dense 3D Mapping in 6 DoF 5
Vector-output ReLU Neural Network Problems are Copositive Programs: Convex Analysis of Two Layer Networks and Polynomial-time Algorithms 5
Very Deep VAEs Generalize Autoregressive Models and Can Outperform Them on Images 4
Viewmaker Networks: Learning Views for Unsupervised Representation Learning 5
Vulnerability-Aware Poisoning Mechanism for Online RL with Unknown Dynamics 4
WaNet - Imperceptible Warping-based Backdoor Attack 5
Wandering within a world: Online contextualized few-shot learning 4
Wasserstein Embedding for Graph Learning 6
Wasserstein-2 Generative Networks 5
Watch-And-Help: A Challenge for Social Perception and Human-AI Collaboration 3
WaveGrad: Estimating Gradients for Waveform Generation 5
What Can You Learn From Your Muscles? Learning Visual Representation from Human Interactions 3
What Makes Instance Discrimination Good for Transfer Learning? 3
What Matters for On-Policy Deep Actor-Critic Methods? A Large-Scale Study 4
What Should Not Be Contrastive in Contrastive Learning 3
What are the Statistical Limits of Offline RL with Linear Function Approximation? 1
What they do when in doubt: a study of inductive biases in seq2seq learners 3
When Do Curricula Work? 6
When Optimizing $f$-Divergence is Robust with Label Noise 4
When does preconditioning help or hurt generalization? 2
Why Are Convolutional Nets More Sample-Efficient than Fully-Connected Nets? 3
Why resampling outperforms reweighting for correcting sampling bias with stochastic gradients 2
Winning the L2RPN Challenge: Power Grid Management via Semi-Markov Afterstate Actor-Critic 4
Witches' Brew: Industrial Scale Data Poisoning via Gradient Matching 6
WrapNet: Neural Net Inference with Ultra-Low-Precision Arithmetic 4
X2T: Training an X-to-Text Typing Interface with Online Learning from User Feedback 3
You Only Need Adversarial Supervision for Semantic Image Synthesis 5
Zero-Cost Proxies for Lightweight NAS 5
Zero-shot Synthesis with Group-Supervised Learning 3
gradSim: Differentiable simulation for system identification and visuomotor control 5
not-MIWAE: Deep Generative Modelling with Missing not at Random Data 3