International Conference on Learning Representations (ICLR) - 2021

Conference Proceedings:

Key: PC - Pseudocode, OSC - Open Source Code, OSD - Open Datasets, DS - Dataset Splits, HS - Hardware Specification, SD - Software Dependencies, ES - Experiment Setup

$i$-Mix: A Domain-Agnostic Strategy for Contrastive Representation Learning 5
A Better Alternative to Error Feedback for Communication-Efficient Distributed Learning 4
A Block Minifloat Representation for Training Deep Neural Networks 4
A Critique of Self-Expressive Deep Subspace Clustering 2
A Design Space Study for LISTA and Beyond 5
A Diffusion Theory For Deep Learning Dynamics: Stochastic Gradient Descent Exponentially Favors Flat Minima 2
A Discriminative Gaussian Mixture Model with Sparsity 3
A Distributional Approach to Controlled Text Generation 6
A Geometric Analysis of Deep Generative Image Models and Its Applications 3
A Good Image Generator Is What You Need for High-Resolution Video Synthesis 5
A Gradient Flow Framework For Analyzing Network Pruning 2
A Hypergradient Approach to Robust Regression without Correspondence 4
A Learning Theoretic Perspective on Local Explainability 2
A Mathematical Exploration of Why Language Models Help Solve Downstream Tasks 4
A PAC-Bayesian Approach to Generalization Bounds for Graph Neural Networks 1
A Panda? No, It's a Sloth: Slowdown Attacks on Adaptive Multi-Exit Neural Network Inference 4
A Temporal Kernel Approach for Deep Learning with Continuous-time Information 6
A Trainable Optimal Transport Embedding for Feature Aggregation and its Relationship to Attention 5
A Unified Approach to Interpreting and Boosting Adversarial Transferability 6
A Universal Representation Transformer Layer for Few-Shot Image Classification 6
A Wigner-Eckart Theorem for Group Equivariant Convolution Kernels 0
A statistical theory of cold posteriors in deep neural networks 2
A teacher-student framework to distill future trajectories 3
A unifying view on implicit bias in training linear neural networks 1
ALFWorld: Aligning Text and Embodied Environments for Interactive Learning 4
ANOCE: Analysis of Causal Effects with Multiple Mediators via Constrained Structural Learning 5
ARMOURED: Adversarially Robust MOdels using Unlabeled data by REgularizing Diversity 4
Auxiliary Task Update Decomposition: The Good, the Bad and the Neutral 5
Accelerating Convergence of Replica Exchange Stochastic Gradient MCMC via Variance Reduction 4
Accurate Learning of Graph Representations with Graph Multiset Pooling 4
Achieving Linear Speedup with Partial Worker Participation in Non-IID Federated Learning 4
Acting in Delayed Environments with Non-Stationary Markov Policies 4
Activation-level uncertainty in deep neural networks 5
Active Contrastive Learning of Audio-Visual Video Representations 6
AdaFuse: Adaptive Temporal Fusion Network for Efficient Action Recognition 5
AdaGCN: Adaboosting Graph Convolutional Networks into Deep Models 5
AdaSpeech: Adaptive Text to Speech for Custom Voice 4
AdamP: Slowing Down the Slowdown for Momentum Optimizers on Scale-invariant Weights 6
Adapting to Reward Progressivity via Spectral Reinforcement Learning 4
Adaptive Extra-Gradient Methods for Min-Max Optimization and Games 2
Adaptive Federated Optimization 5
Adaptive Procedural Task Generation for Hard-Exploration Problems 5
Adaptive Universal Generalized PageRank Graph Neural Network 5
Adaptive and Generative Zero-Shot Learning 3
Adversarial score matching and improved sampling for image generation 5
Adversarially Guided Actor-Critic 3
Adversarially-Trained Deep Nets Transfer Better: Illustration on Image Classification 3
Aligning AI With Shared Human Values 3
An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale 5
An Unsupervised Deep Learning Approach for Real-World Image Denoising 4
Analyzing the Expressive Power of Graph Neural Networks in a Spectral Perspective 4
Anatomy of Catastrophic Forgetting: Hidden Representations and Task Semantics 2
Anchor & Transform: Learning Sparse Embeddings for Large Vocabularies 6
Answering Complex Open-Domain Questions with Multi-Hop Dense Retrieval 5
Anytime Sampling for Autoregressive Models via Ordered Autoencoding 4
Approximate Nearest Neighbor Negative Contrastive Learning for Dense Text Retrieval 5
Are Neural Nets Modular? Inspecting Functional Modularity Through Differentiable Weight Masks 4
Are Neural Rankers still Outperformed by Gradient Boosted Decision Trees? 3
Are wider nets better given the same number of parameters? 4
Ask Your Humans: Using Human Instructions to Improve Generalization in Reinforcement Learning 4
Async-RED: A Provably Convergent Asynchronous Block Parallel Stochastic Method using Deep Denoising Priors 3
Attentional Constellation Nets for Few-Shot Learning 4
Auction Learning as a Two-Player Game 2
Augmenting Physical Models with Deep Networks for Complex Dynamics Forecasting 3
Auto Seg-Loss: Searching Metric Surrogates for Semantic Segmentation 6
AutoLRS: Automatic Learning-Rate Schedule by Bayesian Optimization on the Fly 6
Autoregressive Dynamics Models for Offline Policy Evaluation and Optimization 4
Autoregressive Entity Retrieval 4
Auxiliary Learning by Implicit Differentiation 6
Average-case Acceleration for Bilinear Games and Normal Matrices 1
BERTology Meets Biology: Interpreting Attention in Protein Language Models 5
BOIL: Towards Representation Change for Few-shot Learning 3
BRECQ: Pushing the Limit of Post-Training Quantization by Block Reconstruction 5
BREEDS: Benchmarks for Subpopulation Shift 5
BSQ: Exploring Bit-Level Sparsity for Mixed-Precision Neural Network Quantization 4
BUSTLE: Bottom-Up Program Synthesis Through Learning-Guided Exploration 4
Bag of Tricks for Adversarial Training 4
Balancing Constraints and Rewards with Meta-Gradient D4PG 4
Batch Reinforcement Learning Through Continuation Method 3
Bayesian Context Aggregation for Neural Processes 3
Bayesian Few-Shot Classification with One-vs-Each Pólya-Gamma Augmented Gaussian Processes 5
Behavioral Cloning from Noisy Demonstrations 3
Benchmarks for Deep Off-Policy Evaluation 3
Benefit of deep learning with non-convex noisy gradient descent: Provable excess risk bound and superiority to kernel methods 0
Better Fine-Tuning by Reducing Representational Collapse 3
Beyond Categorical Label Representations for Image Classification 5
Beyond Fully-Connected Layers with Quaternions: Parameterization of Hypercomplex Multiplications with $1/n$ Parameters 3
BiPointNet: Binary Neural Network for Point Clouds 5
Bidirectional Variational Inference for Non-Autoregressive Text-to-Speech 7
Blending MPC & Value Function Approximation for Efficient Reinforcement Learning 2
Boost then Convolve: Gradient Boosting Meets Graph Neural Networks 5
Bowtie Networks: Generative Modeling for Joint Few-Shot Recognition and Novel-View Synthesis 4
Bypassing the Ambient Dimension: Private SGD with Gradient Subspace Identification 4
Byzantine-Resilient Non-Convex Stochastic Gradient Descent 3
C-Learning: Horizon-Aware Cumulative Accessibility Estimation 4
C-Learning: Learning to Achieve Goals via Recursive Classification 5
CO2: Consistent Contrast for Unsupervised Visual Representation Learning 3
CPR: Classifier-Projection Regularization for Continual Learning 2
CPT: Efficient Deep Neural Network Training via Cyclic Precision 5
CT-Net: Channel Tensorization Network for Video Classification 3
CaPC Learning: Confidential and Private Collaborative Learning 3
Calibration of Neural Networks using Splines 4
Calibration tests beyond classification 5
Can a Fruit Fly Learn Word Embeddings? 4
Capturing Label Characteristics in VAEs 3
Categorical Normalizing Flows via Continuous Transformations 7
CausalWorld: A Robotic Manipulation Benchmark for Causal Structure and Transfer Learning 3
CcGAN: Continuous Conditional Generative Adversarial Networks for Image Generation 4
Certify or Predict: Boosting Certified Robustness with Compositional Architectures 5
Chaos of Learning Beyond Zero-sum and Coordination via Game Decompositions 1
Characterizing signal propagation to close the performance gap in unnormalized ResNets 5
ChipNet: Budget-Aware Pruning with Heaviside Continuous Approximations 6
Clairvoyance: A Pipeline Toolkit for Medical Time Series 6
Class Normalization for (Continual)? Generalized Zero-Shot Learning 5
Clustering-friendly Representation Learning via Instance Discrimination and Feature Decorrelation 3
Co-Mixup: Saliency Guided Joint Mixup with Supermodular Diversity 5
CoCo: Controllable Counterfactuals for Evaluating Dialogue State Trackers 4
CoCon: A Self-Supervised Approach for Controlled Text Generation 5
CoDA: Contrast-enhanced and Diversity-promoting Data Augmentation for Natural Language Understanding 3
Collective Robustness Certificates: Exploiting Interdependence in Graph Neural Networks 5
Colorization Transformer 5
Combining Ensembles and Data Augmentation Can Harm Your Calibration 5
Combining Label Propagation and Simple Models out-performs Graph Neural Networks 3
Combining Physics and Machine Learning for Network Flow Estimation 6
Communication in Multi-Agent Reinforcement Learning: Intention Sharing 2
CompOFA – Compound Once-For-All Networks for Faster Multi-Platform Deployment 5
Complex Query Answering with Neural Link Predictors 4
Computational Separation Between Convolutional and Fully-Connected Networks 2
Concept Learners for Few-Shot Learning 4
Conditional Generative Modeling via Learning the Latent Space 4
Conditional Negative Sampling for Contrastive Learning of Visual Representations 5
Conditionally Adaptive Multi-Task Learning: Improving Transfer Learning in NLP Using Fewer Parameters & Less Data 6
Conformation-Guided Molecular Representation with Hamiltonian Neural Networks 4
Conservative Safety Critics for Exploration 2
Contemplating Real-World Object Classification 2
Contextual Dropout: An Efficient Sample-Dependent Dropout Module 5
Contextual Transformation Networks for Online Continual Learning 7
Continual learning in recurrent neural networks 2
Continuous Wasserstein-2 Barycenter Estimation without Minimax Optimization 5
Contrastive Learning with Adversarial Perturbations for Conditional Text Generation 3
Contrastive Behavioral Similarity Embeddings for Generalization in Reinforcement Learning 4
Contrastive Divergence Learning is a Time Reversal Adversarial Game 3
Contrastive Explanations for Reinforcement Learning via Embedded Self Predictions 4
Contrastive Learning with Hard Negative Samples 5
Contrastive Syn-to-Real Generalization 4
Control-Aware Representations for Model-based Reinforcement Learning 3
Convex Potential Flows: Universal Probability Distributions with Optimal Transport and Convex Optimization 3
Convex Regularization behind Neural Reconstruction 3
Coping with Label Shift via Distributionally Robust Optimisation 4
CopulaGNN: Towards Integrating Representational and Correlational Roles of Graphs in Graph Neural Networks 5
Correcting experience replay for multi-agent communication 3
Counterfactual Generative Networks 4
Coupled Oscillatory Recurrent Neural Network (coRNN): An accurate and (gradient) stable architecture for learning long time dependencies 4
Creative Sketch Generation 5
Cross-Attentional Audio-Visual Fusion for Weakly-Supervised Action Localization 3
Cut out the annotator, keep the cutout: better segmentation with weak supervision 3
DARTS-: Robustly Stepping out of Performance Collapse Without Indicators 6
DC3: A learning method for optimization with hard constraints 6
DDPNOpt: Differential Dynamic Programming Neural Optimizer 4
DeBERTa: Decoding-enhanced BERT with Disentangled Attention 6
DICE: Diversity in Deep Ensembles via Conditional Redundancy Adversarial Estimation 4
DINO: A Conditional Energy-Based GAN for Domain Translation 4
DOP: Off-Policy Multi-Agent Decomposed Policy Gradients 4
Data-Efficient Reinforcement Learning with Self-Predictive Representations 5
Dataset Condensation with Gradient Matching 6
Dataset Inference: Ownership Resolution in Machine Learning 4
Dataset Meta-Learning from Kernel Ridge-Regression 5
DeLighT: Deep and Light-weight Transformer 6
Debiasing Concept-based Explanations with Causal Analysis 4
Decentralized Attribution of Generative Models 5
Deciphering and Optimizing Multi-Task Learning: a Random Matrix Approach 4
Deconstructing the Regularization of BatchNorm 3
Decoupling Global and Local Representations via Invertible Generative Flows 4
Deep Encoder, Shallow Decoder: Reevaluating Non-autoregressive Machine Translation 6
Deep Equals Shallow for ReLU Networks in Kernel Regimes 4
Deep Learning meets Projective Clustering 5
Deep Networks and the Multiple Manifold Problem 0
Deep Neural Network Fingerprinting by Conferrable Adversarial Examples 4
Deep Neural Tangent Kernel and Laplace Kernel Have the Same RKHS 0
Deep Partition Aggregation: Provable Defenses against General Poisoning Attacks 4
Deep Repulsive Clustering of Ordered Data Based on Order-Identity Decomposition 4
Deep symbolic regression: Recovering mathematical expressions from data via risk-seeking policy gradients 5
DeepAveragers: Offline Reinforcement Learning By Solving Derived Non-Parametric MDPs 4
Deformable DETR: Deformable Transformers for End-to-End Object Detection 5
Degree-Quant: Quantization-Aware Training for Graph Neural Networks 7
Denoising Diffusion Implicit Models 3
Deployment-Efficient Reinforcement Learning via Model-Based Offline Optimization 6
DialoGraph: Incorporating Interpretable Strategy-Graph Networks into Negotiation Dialogues 5
DiffWave: A Versatile Diffusion Model for Audio Synthesis 4
Differentiable Segmentation of Sequences 4
Differentiable Trust Region Layers for Deep Reinforcement Learning 5
Differentially Private Learning Needs Better Features (or Much More Data) 6
Directed Acyclic Graph Neural Networks 4
Direction Matters: On the Implicit Bias of Stochastic Gradient Descent with Moderate Learning Rate 2
Disambiguating Symbolic Expressions in Informal Documents 4
Discovering Diverse Multi-Agent Strategic Behavior via Reward Randomization 4
Discovering Non-monotonic Autoregressive Orderings with Variational Inference 6
Discovering a set of policies for the worst case reward 3
Discrete Graph Structure Learning for Forecasting Multiple Time Series 5
Disentangled Recurrent Wasserstein Autoencoder 4
Disentangling 3D Prototypical Networks for Few-Shot Concept Learning 4
Distance-Based Regularisation of Deep Networks for Fine-Tuning 4
Distilling Knowledge from Reader to Retriever for Question Answering 4
Distributed Momentum for Byzantine-resilient Stochastic Gradient Descent 5
Distributional Sliced-Wasserstein and Applications to Generative Modeling 3
Diverse Video Generation using a Gaussian Process Trigger 2
Do 2D GANs Know 3D Shape? Unsupervised 3D Shape Reconstruction from 2D Image GANs 3
Do Wide and Deep Networks Learn the Same Things? Uncovering How Neural Network Representations Vary with Width and Depth 2
Do not Let Privacy Overbill Utility: Gradient Embedding Perturbation for Private Learning 6
Does enhanced shape bias improve neural network robustness to common corruptions? 2
Domain Generalization with MixStyle 5
Domain-Robust Visual Imitation Learning with Mutual Information Constraints 4
DrNAS: Dirichlet Neural Architecture Search 5
Drop-Bottleneck: Learning Discrete Compressed Representation for Noise-Robust Exploration 2
Dual-mode ASR: Unify and Improve Streaming ASR with Full-context Modeling 3
DynaTune: Dynamic Tensor Program Optimization in Deep Neural Network Compilation 4
Dynamic Tensor Rematerialization 5
Economic Hyperparameter Optimization with Blended Search Strategy 7
EEC: Learning to Encode and Regenerate Images for Continual Learning 5
Evaluation of Neural Architectures Trained with Square Loss vs Cross-Entropy in Classification Tasks 4
Early Stopping in Deep Networks: Double Descent and How to Eliminate it 3
Effective Abstract Reasoning with Dual-Contrast Network 5
Effective Distributed Learning with Random Features: Improved Bounds and Algorithms 4
Effective and Efficient Vote Attack on Capsule Networks 4
Efficient Certified Defenses Against Patch Attacks on Image Classifiers 4
Efficient Conformal Prediction via Cascaded Inference with Expanded Admission 5
Efficient Continual Learning with Modular Networks and Task-Driven Priors 6
Efficient Empowerment Estimation for Unsupervised Stabilization 4
Efficient Generalized Spherical CNNs 2
Efficient Inference of Flexible Interaction in Spiking-neuron Networks 4
Efficient Reinforcement Learning in Factored MDPs with Application to Constrained RL 1
Efficient Transformers in Reinforcement Learning using Actor-Learner Distillation 3
Efficient Wasserstein Natural Gradients for Reinforcement Learning 4
EigenGame: PCA as a Nash Equilibrium 4
Emergent Road Rules In Multi-Agent Driving Environments 3
Emergent Symbols through Binding in External Memory 3
Empirical Analysis of Unlabeled Entity Problem in Named Entity Recognition 4
Empirical or Invariant Risk Minimization? A Sample Complexity Perspective 4
End-to-End Egospheric Spatial Memory 7
End-to-end Adversarial Text-to-Speech 5
Enforcing robust control guarantees within neural network policies 5
Enjoy Your Editing: Controllable GANs for Image Editing via Latent Space Navigation 3
Entropic gradient descent algorithms and wide flat minima 5
Estimating Lipschitz constants of monotone deep equilibrium models 3
Estimating and Evaluating Regression Predictive Uncertainty in Deep Object Detectors 5
Estimating informativeness of samples with Smooth Unique Information 4
Evaluating the Disentanglement of Deep Generative Models through Manifold Topology 4
Evaluation of Similarity-based Explanations 4
Evaluations and Methods for Explanation through Robustness Analysis 5
Evolving Reinforcement Learning Algorithms 4
Exemplary Natural Images Explain CNN Activations Better than State-of-the-Art Feature Visualization 6
Explainable Deep One-Class Classification 6
Explainable Subgraph Reasoning for Forecasting on Temporal Knowledge Graphs 5
Explaining by Imitating: Understanding Decisions by Interpretable Policy Learning 3
Explaining the Efficacy of Counterfactually Augmented Data 3
Exploring Balanced Feature Spaces for Representation Learning 4
Exploring the Uncertainty Properties of Neural Networks’ Implicit Priors in the Infinite-Width Limit 5
Expressive Power of Invariant and Equivariant Graph Neural Networks 4
Extracting Strong Policies for Robotics Tasks from Zero-Order Trajectory Optimizers 4
Extreme Memorization via Scale of Initialization 4
FOCAL: Efficient Fully-Offline Meta-Reinforcement Learning via Distance Metric Learning and Behavior Regularization 5
Factorizing Declarative and Procedural Knowledge in Structured, Dynamical Environments 4
Fair Mixup: Fairness via Interpolation 4
FairBatch: Batch Selection for Model Fairness 5
FairFil: Contrastive Neural Debiasing Method for Pretrained Text Encoders 4
Fantastic Four: Differentiable and Efficient Bounds on Singular Values of Convolution Layers 5
Fast And Slow Learning Of Recurrent Independent Mechanisms 2
Fast Geometric Projections for Local Robustness Certification 5
Fast and Complete: Enabling Complete Neural Network Verification with Rapid and Massively Parallel Incomplete Verifiers 5
Fast convergence of stochastic subgradient method under interpolation 3
FastSpeech 2: Fast and High-Quality End-to-End Text to Speech 4
Faster Binary Embeddings for Preserving Euclidean Distances 4
FedBE: Making Bayesian Model Ensemble Applicable to Federated Learning 5
FedBN: Federated Learning on Non-IID Features via Local Batch Normalization 4
FedMix: Approximation of Mixup under Mean Augmented Federated Learning 4
Federated Learning Based on Dynamic Regularization 3
Federated Learning via Posterior Averaging: A New Perspective and Practical Algorithms 4
Federated Semi-Supervised Learning with Inter-Client Consistency & Disjoint Learning 5
Few-Shot Bayesian Optimization with Deep Kernel Surrogates 4
Few-Shot Learning via Learning the Representation, Provably 0
Fidelity-based Deep Adiabatic Scheduling 1
Filtered Inner Product Projection for Crosslingual Embedding Alignment 5
Flowtron: an Autoregressive Flow-based Generative Network for Text-to-Speech Synthesis 6
Fooling a Complete Neural Network Verifier 5
For self-supervised learning, Rationality implies generalization, provably 3
Fourier Neural Operator for Parametric Partial Differential Equations 2
Free Lunch for Few-shot Learning: Distribution Calibration 5
Fully Unsupervised Diversity Denoising with Convolutional Variational Autoencoders 4
Fuzzy Tiling Activations: A Simple Approach to Learning Sparse Representations Online 5
GAN "Steerability" without optimization 1
GAN2GAN: Generative Noise Learning for Blind Denoising with Single Noisy Images 5
GANs Can Play Lottery Tickets Too 3
GShard: Scaling Giant Models with Conditional Computation and Automatic Sharding 4
Gauge Equivariant Mesh CNNs: Anisotropic convolutions on geometric graphs 3
Generalization bounds via distillation 2
Generalization in data-driven models of primary visual cortex 4
Generalized Energy Based Models 5
Generalized Multimodal ELBO 4
Generalized Variational Continual Learning 4
Generating Adversarial Computer Programs using Optimized Obfuscations 3
Generating Furry Cars: Disentangling Object Shape and Appearance across Multiple Domains 3
Generative Language-Grounded Policy in Vision-and-Language Navigation with Bayes' Rule 5
Generative Scene Graph Networks 4
Generative Time-series Modeling with Fourier Flows 2
Genetic Soft Updates for Policy Evolution in Deep Reinforcement Learning 3
Geometry-Aware Gradient Algorithms for Neural Architecture Search 6
Geometry-aware Instance-reweighted Adversarial Training 4
Getting a CLUE: A Method for Explaining Uncertainty Estimates 5
Global Convergence of Three-layer Neural Networks in the Mean Field Regime 0
Global optimality of softmax policy gradient with single hidden layer neural networks in the mean-field regime 1
Go with the flow: Adaptive control for Neural ODEs 5
GraPPa: Grammar-Augmented Pre-Training for Table Semantic Parsing 5
Gradient Descent on Neural Networks Typically Occurs at the Edge of Stability 2
Gradient Origin Networks 5
Gradient Projection Memory for Continual Learning 6
Gradient Vaccine: Investigating and Improving Multi-task Optimization in Massively Multilingual Models 4
Graph Coarsening with Neural Networks 5
Graph Convolution with Low-rank Learnable Local Filters 5
Graph Edit Networks 6
Graph Information Bottleneck for Subgraph Recognition 4
Graph Traversal with Tensor Functionals: A Meta-Algorithm for Scalable Learning 6
Graph-Based Continual Learning 2
GraphCodeBERT: Pre-training Code Representations with Data Flow 5
Greedy-GQ with Variance Reduction: Finite-time Analysis and Improved Complexity 2
Grounded Language Learning Fast and Slow 2
Grounding Language to Autonomously-Acquired Skills via Goal Generation 3
Grounding Physical Concepts of Objects and Events Through Dynamic Visual Reasoning 3
Group Equivariant Conditional Neural Processes 4
Group Equivariant Generative Adversarial Networks 4
Group Equivariant Stand-Alone Self-Attention For Vision 4
Growing Efficient Deep Networks by Structured Continuous Sparsification 4
HW-NAS-Bench: Hardware-Aware Neural Architecture Search Benchmark 4
HalentNet: Multimodal Trajectory Forecasting with Hallucinative Intents 6
Heating up decision boundaries: isocapacitory saturation, adversarial scenarios and generalization bounds 7
HeteroFL: Computation and Communication Efficient Federated Learning for Heterogeneous Clients 3
Heteroskedastic and Imbalanced Deep Learning with Adaptive Regularization 6
Hierarchical Autoregressive Modeling for Neural Video Compression 4
Hierarchical Reinforcement Learning by Discovering Intrinsic Options 4
High-Capacity Expert Binary Networks 4
Hopfield Networks is All You Need 5
Hopper: Multi-hop Transformer for Spatiotemporal Reasoning 3
How Benign is Benign Overfitting? 2
How Does Mixup Help With Robustness and Generalization? 3
How Much Over-parameterization Is Sufficient to Learn Deep ReLU Networks? 2
How Neural Networks Extrapolate: From Feedforward to Graph Neural Networks 2
How to Find Your Friendly Neighborhood: Graph Attention Design with Self-Supervision 5
Human-Level Performance in No-Press Diplomacy via Equilibrium Search 2
HyperDynamics: Meta-Learning Object and Agent Dynamics with Hypernetworks 2
HyperGrid Transformers: Towards A Single Model for Multiple Tasks 4
Hyperbolic Neural Networks++ 3
IDF++: Analyzing and Improving Integer Discrete Flows for Lossless Compression 4
IEPT: Instance-Level and Episode-Level Pretext Tasks for Few-Shot Learning 4
INT: An Inequality Benchmark for Evaluating Generalization in Theorem Proving 6
IOT: Instance-wise Layer Reordering for Transformer Structures 5
Identifying Physical Law of Hamiltonian Systems via Meta-Learning 4
Identifying nonlinear dynamical systems with multiple time scales and long-range dependencies 3
Image Augmentation Is All You Need: Regularizing Deep Reinforcement Learning from Pixels 6
Image GANs meet Differentiable Rendering for Inverse Graphics and Interpretable 3D Neural Rendering 3
Impact of Representation Learning in Linear Bandits 3
Implicit Convex Regularizers of CNN Architectures: Convex Optimization of Two- and Three-Layer Networks in Polynomial Time 2
Implicit Gradient Regularization 2
Implicit Normalizing Flows 6
Implicit Under-Parameterization Inhibits Data-Efficient Deep Reinforcement Learning 4
Improve Object Detection with Feature-based Knowledge Distillation: Towards Accurate and Efficient Detectors 4
Improved Autoregressive Modeling with Distribution Smoothing 2
Improved Estimation of Concentration Under $\ell_p$-Norm Distance Metrics Using Half Spaces 4
Improving Adversarial Robustness via Channel-wise Activation Suppressing 4
Improving Relational Regularized Autoencoders with Spherical Sliced Fused Gromov Wasserstein 4
Improving Transformation Invariance in Contrastive Representation Learning 5
Improving VAEs' Robustness to Adversarial Attack 2
Improving Zero-Shot Voice Style Transfer via Disentangled Representation Learning 4
In Defense of Pseudo-Labeling: An Uncertainty-Aware Pseudo-label Selection Framework for Semi-Supervised Learning 4
In Search of Lost Domain Generalization 5
In-N-Out: Pre-Training and Self-Training using Auxiliary Information for Out-of-Distribution Robustness 5
Incorporating Symmetry into Deep Dynamics Models for Improved Generalization 5
Incremental few-shot learning via vector quantization in deep embedded space 4
Individually Fair Gradient Boosting 5
Individually Fair Rankings 4
Inductive Representation Learning in Temporal Networks via Causal Anonymous Walks 6
Influence Estimation for Generative Adversarial Networks 4
Influence Functions in Deep Learning Are Fragile 2
InfoBERT: Improving Robustness of Language Models from An Information Theoretic Perspective 6
Information Laundering for Model Privacy 4
Initialization and Regularization of Factorized Neural Layers 3
Integrating Categorical Semantics into Unsupervised Domain Translation 4
Interactive Weak Supervision: Learning Useful Heuristics for Data Labeling 5
Interpretable Models for Granger Causality Using Self-explaining Neural Networks 6
Interpretable Neural Architecture Search via Bayesian Optimisation with Weisfeiler-Lehman Kernels 6
Interpreting Graph Neural Networks for NLP With Differentiable Edge Masking 5
Interpreting Knowledge Graph Relation Representation from Word Embeddings 3
Interpreting and Boosting Dropout from a Game-Theoretic View 3
Into the Wild with AudioScope: Unsupervised Audio-Visual Separation of On-Screen Sounds 5
Intraclass clustering: an implicit learning ability that regularizes DNNs 2
Intrinsic-Extrinsic Convolution and Pooling for Learning on 3D Protein Structures 4
Is Attention Better Than Matrix Decomposition? 6
Is Label Smoothing Truly Incompatible with Knowledge Distillation: An Empirical Study 4
IsarStep: a Benchmark for High-level Mathematical Reasoning 5
Isometric Propagation Network for Generalized Zero-shot Learning 3
Isometric Transformation Invariant and Equivariant Graph Convolutional Networks 6
Isotropy in the Contextual Embedding Space: Clusters and Manifolds 2
Iterated learning for emergent systematicity in VQA 5
Iterative Empirical Game Solving via Single Policy Best Response 3
Kanerva++: Extending the Kanerva Machine With Differentiable, Locally Block Allocated Latent Memory 3
Knowledge Distillation as Semiparametric Inference 4
Knowledge distillation via softmax regression representation learning 4
LEAF: A Learnable Frontend for Audio Classification 4
LambdaNetworks: Modeling long-range Interactions without Attention 5
Language-Agnostic Representation Learning of Source Code from Structure and Context 4
Large Associative Memory Problem in Neurobiology and Machine Learning 0
Large Batch Simulation for Deep Reinforcement Learning 5
Large Scale Image Completion via Co-Modulated Generative Adversarial Networks 5
Large-width functional asymptotics for deep Gaussian neural networks 0
Latent Convergent Cross Mapping 4
Latent Skill Planning for Exploration and Transfer 3
Layer-adaptive Sparsity for the Magnitude-based Pruning 4
Learnable Embedding sizes for Recommender Systems 6
Learning "What-if" Explanations for Sequential Decision-Making 5
Learning A Minimax Optimizer: A Pilot Study 5
Learning Accurate Entropy Model with Global Reference for Image Compression 2
Learning Associative Inference Using Fast Weight Memory 6
Learning Better Structured Representations Using Low-rank Adaptive Label Smoothing 4
Learning Cross-Domain Correspondence for Control with Dynamics Cycle-Consistency 3
Learning Deep Features in Instrumental Variable Regression 6
Learning Energy-Based Generative Models via Coarse-to-Fine Expanding and Sampling 4
Learning Energy-Based Models by Diffusion Recovery Likelihood 4
Learning Generalizable Visual Representations via Interactive Gameplay 5
Learning Hyperbolic Representations of Topological Features 5
Learning Incompressible Fluid Dynamics from Scratch - Towards Fast, Differentiable Fluid Models that Generalize 3
Learning Invariant Representations for Reinforcement Learning without Reconstruction 4
Learning Long-term Visual Dynamics with Region Proposal Interaction Networks 5
Learning Manifold Patch-Based Representations of Man-Made Shapes 4
Learning Mesh-Based Simulation with Graph Networks 4
Learning N:M Fine-grained Structured Sparse Neural Networks From Scratch 6
Learning Neural Event Functions for Ordinary Differential Equations 6
Learning Neural Generative Dynamics for Molecular Conformation Generation 6
Learning Parametrised Graph Shift Operators 3
Learning Reasoning Paths over Semantic Graphs for Video-grounded Dialogues 5
Learning Robust State Abstractions for Hidden-Parameter Block MDPs 4
Learning Safe Multi-agent Control with Decentralized Neural Barrier Certificates 2
Learning Structural Edits via Incremental Tree Transformations 5
Learning Subgoal Representations with Slow Dynamics 4
Learning Task Decomposition with Ordered Memory Policy Network 3
Learning Task-General Representations with Generative Neuro-Symbolic Modeling 3
Learning Value Functions in Deep Policy Gradients using Residual Variance 3
Learning What To Do by Simulating the Past 3
Learning a Latent Search Space for Routing Problems using Variational Autoencoders 5
Learning a Latent Simplex in Input Sparsity Time 4
Learning advanced mathematical computations from examples 4
Learning and Evaluating Representations for Deep One-Class Classification 4
Learning continuous-time PDEs from sparse data with graph neural networks 2
Learning explanations that are hard to vary 5
Learning from Demonstration with Weakly Supervised Disentanglement 4
Learning from Protein Structure with Geometric Vector Perceptrons 7
Learning from others' mistakes: Avoiding dataset biases without modeling them 4
Learning perturbation sets for robust machine learning 6
Learning the Pareto Front with Hypernetworks 6
Learning to Deceive Knowledge Graph Augmented Models via Targeted Perturbation 4
Learning to Generate 3D Shapes with Generative Cellular Automata 4
Learning to Make Decisions via Submodular Regularization 5
Learning to Reach Goals via Iterated Supervised Learning 4
Learning to Recombine and Resample Data For Compositional Generalization 6
Learning to Represent Action Values as a Hypergraph on the Action Vertices 3
Learning to Sample with Local and Global Contexts in Experience Replay Buffer 4
Learning to Set Waypoints for Audio-Visual Navigation 3
Learning to live with Dale's principle: ANNs with separate excitatory and inhibitory units 5
Learning with AMIGo: Adversarially Motivated Intrinsic Goals 3
Learning with Feature-Dependent Label Noise: A Progressive Approach 5
Learning with Instance-Dependent Label Noise: A Sample Sieve Approach 5
Learning-based Support Estimation in Sublinear Time 3
Lifelong Learning of Compositional Structures 5
LiftPool: Bidirectional ConvNet Pooling 4
Linear Convergent Decentralized Optimization with Compression 3
Linear Last-iterate Convergence in Constrained Saddle-point Optimization 1
Linear Mode Connectivity in Multitask and Continual Learning 5
Lipschitz Recurrent Neural Networks 4
Local Convergence Analysis of Gradient Descent Ascent with Finite Timescale Separation 3
Local Search Algorithms for Rank-Constrained Convex Optimization 5
Locally Free Weight Sharing for Network Width Search 5
Long Live the Lottery: The Existence of Winning Tickets in Lifelong Learning 6
Long Range Arena: A Benchmark for Efficient Transformers 4
Long-tail learning via logit adjustment 2
Long-tailed Recognition by Routing Diverse Distribution-Aware Experts 5
Loss Function Discovery for Object Detection via Convergence-Simulation Driven Search 5
Lossless Compression of Structured Convolutional Models via Lifting 3
LowKey: Leveraging Adversarial Attacks to Protect Social Media Users from Facial Recognition 5
MALI: A memory efficient and reverse accurate integrator for Neural ODEs 5
MARS: Markov Molecular Sampling for Multi-objective Drug Discovery 5
MELR: Meta-Learning via Modeling Episode-Level Relationships for Few-Shot Learning 4
Mirostat: A Neural Text Decoding Algorithm that Directly Controls Perplexity 4
MODALS: Modality-agnostic Automated Data Augmentation in the Latent Space 4
MONGOOSE: A Learnable LSH Framework for Efficient Neural Network Training 4
Mapping the Timescale Organization of Neural Language Models 3
Mastering Atari with Discrete World Models 5
Mathematical Reasoning via Self-supervised Skip-tree Training 4
Measuring Massive Multitask Language Understanding 4
Memory Optimization for Deep Networks 6
Meta Back-Translation 4
Meta-GMVAE: Mixture of Gaussian VAE for Unsupervised Meta-Learning 5
Meta-Learning of Structured Task Distributions in Humans and Machines 2
Meta-Learning with Neural Tangent Kernels 4
Meta-learning Symmetries by Reparameterization 6
Meta-learning with negative learning rates 2
MetaNorm: Learning to Normalize Few-Shot Batches Across Domains 6
MiCE: Mixture of Contrastive Experts for Unsupervised Image Clustering 5
Mind the Gap when Conditioning Amortised Inference in Sequential Latent-Variable Models 3
Mind the Pad -- CNNs Can Develop Blind Spots 2
Minimum Width for Universal Approximation 0
MixKD: Towards Efficient Distillation of Large-scale Language Models 4
Mixed-Features Vectors and Subspace Splitting 1
MoPro: Webly Supervised Learning with Momentum Prototypes 5
MoVie: Revisiting Modulated Convolutions for Visual Counting and Beyond 3
Model Patching: Closing the Subgroup Performance Gap with Data Augmentation 5
Model-Based Offline Planning 5
Model-Based Visual Planning with Self-Supervised Functional Distances 3
Model-based micro-data reinforcement learning: what are the crucial model properties and which model to choose? 4
Modeling the Second Player in Distributionally Robust Optimization 4
Modelling Hierarchical Structure between Dialogue Policy and Natural Language Generator with Option Framework for Task-oriented Dialogue System 5
Molecule Optimization by Explainable Evolution 5
Monotonic Kronecker-Factored Lattice 6
Monte-Carlo Planning and Learning with Language Action Value Estimates 4
More or Less: When and How to Build Convolutional Neural Network Ensembles 3
Multi-Class Uncertainty Calibration via Mutual Information Maximization-based Binning 5
Multi-Level Local SGD: Distributed SGD for Heterogeneous Hierarchical Networks 5
Multi-Prize Lottery Ticket Hypothesis: Finding Accurate Binary Neural Networks by Pruning A Randomly Weighted Network 4
Multi-Time Attention Networks for Irregularly Sampled Time Series 5
Multi-resolution modeling of a discrete stochastic process identifies causes of cancer 4
Multi-timescale Representation Learning in LSTM Language Models 5
MultiModalQA: complex question answering over text, tables and images 3
Multiplicative Filter Networks 3
Multiscale Score Matching for Out-of-Distribution Detection 4
Multivariate Probabilistic Time Series Forecasting via Conditioned Normalizing Flows 6
Mutual Information State Intrinsic Control 4
My Body is a Cage: the Role of Morphology in Graph-Based Incompatible Control 3
NAS-Bench-ASR: Reproducible Neural Architecture Search for Speech Recognition 5
NBDT: Neural-Backed Decision Tree 4
NOVAS: Non-convex Optimization via Adaptive Stochastic Search for End-to-end Learning and Control 6
NeMo: Neural Mesh Models of Contrastive Features for Robust 3D Pose Estimation 5
Nearest Neighbor Machine Translation 3
Negative Data Augmentation 3
Net-DNF: Effective Deep Modeling of Tabular Data 6
Network Pruning That Matters: A Case Study on Retraining Variants 3
Neural Approximate Sufficient Statistics for Implicit Models 4
Neural Architecture Search on ImageNet in Four GPU Hours: A Theoretically Inspired Perspective 6
Neural Attention Distillation: Erasing Backdoor Triggers from Deep Neural Networks 4
Neural Delay Differential Equations 4
Neural Jump Ordinary Differential Equations: Consistent Continuous-Time Prediction and Filtering 5
Neural Learning of One-of-Many Solutions for Combinatorial Problems in Structured Output Spaces 6
Neural Mechanics: Symmetry and Broken Conservation Laws in Deep Learning Dynamics 3
Neural Networks for Learning Counterfactual G-Invariances from Single Environments 5
Neural ODE Processes 6
Neural Pruning via Growing Regularization 5
Neural Spatio-Temporal Point Processes 4
Neural Synthesis of Binaural Speech From Mono Audio 5
Neural Thompson Sampling 3
Neural Topic Model via Optimal Transport 4
Neural gradients are near-lognormal: improved quantized and sparse training 4
Neural networks with late-phase weights 6
Neural representation and generation for RNA secondary structures 4
Neurally Augmented ALISTA 4
New Bounds For Distributed Mean Estimation and Variance Reduction 1
No Cost Likelihood Manipulation at Test Time for Making Better Mistakes in Deep Networks 3
No MCMC for me: Amortized sampling for fast and stable training of energy-based models 5
Noise against noise: stochastic label noise helps combat inherent label noise 5
Noise or Signal: The Role of Image Backgrounds in Object Recognition 3
Non-asymptotic Confidence Intervals of Off-policy Evaluation: Primal and Dual Bounds 3
Nonseparable Symplectic Neural Networks 3
OPAL: Offline Primitive Discovery for Accelerating Offline Reinforcement Learning 3
Off-Dynamics Reinforcement Learning: Training for Transfer with Domain Classifiers 3
Offline Model-Based Optimization via Normalized Maximum Likelihood Estimation 4
On Data-Augmentation and Consistency-Based Semi-Supervised Learning 1
On Dyadic Fairness: Exploring and Mitigating Bias in Graph Connections 4
On Fast Adversarial Robustness Adaptation in Model-Agnostic Meta-Learning 5
On Graph Neural Networks versus Graph-Augmented MLPs 3
On InstaHide, Phase Retrieval, and Sparse Matrix Factorization 3
On Learning Universal Representations Across Languages 4
On Position Embeddings in BERT 3
On Self-Supervised Image Representations for GAN Evaluation 3
On Statistical Bias In Active Learning: How and When to Fix It 4
On the Bottleneck of Graph Neural Networks and its Practical Implications 4
On the Critical Role of Conventions in Adaptive Human-AI Collaboration 4
On the Curse of Memory in Recurrent Neural Networks: Approximation and Optimization Analysis 0
On the Dynamics of Training Attention Models 3
On the Impossibility of Global Convergence in Multi-Loss Optimization 2
On the Origin of Implicit Regularization in Stochastic Gradient Descent 2
On the Stability of Fine-tuning BERT: Misconceptions, Explanations, and Strong Baselines 4
On the Theory of Implicit Deep Learning: Global Convergence with Implicit Layers 3
On the Transfer of Disentangled Representations in Realistic Settings 3
On the Universality of Rotation Equivariant Point Cloud Networks 4
On the Universality of the Double Descent Peak in Ridgeless Regression 2
On the geometry of generalization and memorization in deep neural networks 2
On the mapping between Hopfield networks and Restricted Boltzmann Machines 2
On the role of planning in model-based deep reinforcement learning 4
One Network Fits All? Modular versus Monolithic Task Formulations in Neural Networks 3
Online Adversarial Purification based on Self-supervised Learning 5
Open Question Answering over Tables and Text 4
Optimal Conversion of Conventional Artificial Neural Networks to Spiking Neural Networks 4
Optimal Rates for Averaged Stochastic Gradient Descent under Neural Tangent Kernel Regime 2
Optimal Regularization can Mitigate Double Descent 2
Optimism in Reinforcement Learning with Generalized Linear Function Approximation 1
Optimizing Memory Placement using Evolutionary Graph Reinforcement Learning 4
Orthogonalizing Convolutional Layers with the Cayley Transform 6
Overfitting for Fun and Profit: Instance-Adaptive Data Compression 4
Overparameterisation and worst-case generalisation: friend or foe? 3
PAC Confidence Predictions for Deep Neural Network Classifiers 3
PC2WF: 3D Wireframe Reconstruction from Raw Point Clouds 2
PDE-Driven Spatiotemporal Disentanglement 5
PMI-Masking: Principled masking of correlated spans 3
PSTNet: Point Spatio-Temporal Convolution on Point Cloud Sequences 4
Parameter Efficient Multimodal Transformers for Video Representation Learning 3
Parameter-Based Value Functions 6
Parrot: Data-Driven Behavioral Priors for Reinforcement Learning 3
Partitioned Learned Bloom Filters 4
Perceptual Adversarial Robustness: Defense Against Unseen Threat Models 6
Personalized Federated Learning with First Order Model Optimization 3
Physics-aware, probabilistic model order reduction with guaranteed stability 1
Plan-Based Relaxed Reward Shaping for Goal-Directed Tasks 2
Planning from Pixels using Inverse Dynamics Models 4
PlasticineLab: A Soft-Body Manipulation Benchmark with Differentiable Physics 4
PolarNet: Learning to Optimize Polar Keypoints for Keypoint Based Object Detection 5
Policy-Driven Attack: Learning to Query for Hard-label Black-box Adversarial Examples 6
Practical Massively Parallel Monte-Carlo Tree Search Applied to Molecular Design 5
Practical Real Time Recurrent Learning with a Sparse Approximation 3
Pre-training Text-to-Text Transformers for Concept-centric Common Sense 5
Predicting Classification Accuracy When Adding New Unobserved Classes 6
Predicting Inductive Biases of Pre-Trained Models 3
Predicting Infectiousness for Proactive Contact Tracing 3
Prediction and generalisation over directed actions by grid cells 3
Primal Wasserstein Imitation Learning 5
Private Image Reconstruction from System Side Channels Using Generative Models 5
Private Post-GAN Boosting 4
Probabilistic Numeric Convolutional Neural Networks 3
Probing BERT in Hyperbolic Spaces 4
Progressive Skeletonization: Trimming more fat from a network at initialization 5
Projected Latent Markov Chain Monte Carlo: Conditional Sampling of Normalizing Flows 4
Property Controllable Variational Autoencoder via Invertible Mutual Dependence 4
Protecting DNNs from Theft using an Ensemble of Diverse Models 3
Prototypical Contrastive Learning of Unsupervised Representations 6
Prototypical Representation Learning for Relation Extraction 6
Provable Rich Observation Reinforcement Learning with Combinatorial Latent States 5
Provably robust classification of adversarial examples with detection 5
Proximal Gradient Descent-Ascent: Variable Convergence under KŁ Geometry 1
Pruning Neural Networks at Initialization: Why Are We Missing the Mark? 2
PseudoSeg: Designing Pseudo Labels for Semantic Segmentation 4
QPLEX: Duplex Dueling Multi-Agent Q-Learning 3
Quantifying Differences in Reward Functions 4
R-GAP: Recursive Gradient Attack on Privacy 5
RMSprop converges with proper hyper-parameter 4
RNNLogic: Learning Logic Rules for Reasoning on Knowledge Graphs 5
RODE: Learning Roles to Decompose Multi-Agent Tasks 3
Random Feature Attention 5
Randomized Automatic Differentiation 6
Randomized Ensembled Double Q-Learning: Learning Fast Without a Model 5
Rank the Episodes: A Simple Approach for Exploration in Procedurally-Generated Environments 4
Rao-Blackwellizing the Straight-Through Gumbel-Softmax Gradient Estimator 3
Rapid Neural Architecture Search by Learning to Generate Graphs from Datasets 5
Rapid Task-Solving in Novel Environments 2
Recurrent Independent Mechanisms 4
Reducing the Computational Cost of Deep Generative Models with Binary Neural Networks 2
Refining Deep Generative Models via Discriminator Gradient Flow 5
Regularization Matters in Policy Optimization - An Empirical Study on Continuous Control 4
Regularized Inverse Reinforcement Learning 4
Reinforcement Learning with Random Delays 3
Relating by Contrasting: A Data-efficient Framework for Multimodal Generative Models 2
Remembering for the Right Reasons: Explanations Reduce Catastrophic Forgetting 5
Removing Undesirable Feature Contributions Using Out-of-Distribution Data 4
Representation Balancing Offline Model-based Reinforcement Learning 6
Representation Learning for Sequence Data with Deep Autoencoding Predictive Components 4
Representation Learning via Invariant Causal Mechanisms 1
Representation learning for improved interpretability and classification accuracy of clinical factors from EEG 2
Representing Partial Programs with Blended Abstract Semantics 2
Repurposing Pretrained Models for Robust Out-of-domain Few-Shot Learning 5
ResNet After All: Neural ODEs and Their Numerical Solution 5
Reset-Free Lifelong Learning with Skill-Space Planning 4
Rethinking Architecture Selection in Differentiable NAS 4
Rethinking Attention with Performers 6
Rethinking Embedding Coupling in Pre-trained Language Models 5
Rethinking Positional Encoding in Language Pre-training 5
Rethinking Soft Labels for Knowledge Distillation: A Bias–Variance Tradeoff Perspective 4
Rethinking the Role of Gradient-based Attribution Methods for Model Interpretability 2
Retrieval-Augmented Generation for Code Summarization via Hybrid GNN 5
Return-Based Contrastive Representation Learning for Reinforcement Learning 3
Revisiting Dynamic Convolution via Matrix Decomposition 5
Revisiting Few-sample BERT Fine-tuning 5
Revisiting Hierarchical Approach for Persistent Long-Term Video Prediction 5
Revisiting Locally Supervised Learning: an Alternative to End-to-end Training 5
Reweighting Augmented Samples by Minimizing the Maximal Expected Loss 5
Ringing ReLUs: Harmonic Distortion Analysis of Nonlinear Feedforward Networks 5
Risk-Averse Offline Reinforcement Learning 5
Robust Curriculum Learning: from clean label detection to noisy label self-correction 3
Robust Learning of Fixed-Structure Bayesian Networks in Nearly-Linear Time 1
Robust Overfitting may be mitigated by properly learned smoothening 5
Robust Pruning at Initialization 4
Robust Reinforcement Learning on State Observations with Learned Optimal Adversary 4
Robust and Generalizable Visual Representation Learning via Random Convolutions 5
Robust early-learning: Hindering the memorization of noisy labels 7
SAFENet: A Secure, Accurate and Fast Neural Network Inference 4
SALD: Sign Agnostic Learning with Derivatives 5
SCoRe: Pre-Training for Context Representation in Conversational Semantic Parsing 5
SEDONA: Search for Decoupled Neural Networks toward Greedy Block-wise Learning 6
SEED: Self-supervised Distillation For Visual Representation 5
SMiRL: Surprise Minimizing Reinforcement Learning in Unstable Environments 2
SOLAR: Sparse Orthogonal Learned and Random Embeddings 5
SSD: A Unified Framework for Self-Supervised Outlier Detection 5
Saliency is a Possible Red Herring When Diagnosing Poor Generalization 4
SaliencyMix: A Saliency Guided Data Augmentation Strategy for Better Regularization 5
Sample-Efficient Automated Deep Reinforcement Learning 4
Scalable Bayesian Inverse Reinforcement Learning 5
Scalable Learning and MAP Inference for Nonsymmetric Determinantal Point Processes 5
Scalable Transfer Learning with Expert Models 3
Scaling Symbolic Methods using Gradients for Neural Model Explanation 4
Scaling the Convex Barrier with Active Sets 5
Score-Based Generative Modeling through Stochastic Differential Equations 4
Selective Classification Can Magnify Disparities Across Groups 5
Selectivity considered harmful: evaluating the causal impact of class selectivity in DNNs 3
Self-Supervised Learning of Compressed Video Representations 5
Self-Supervised Policy Adaptation during Deployment 3
Self-supervised Adversarial Robustness for the Low-label, High-data Regime 4
Self-supervised Learning from a Multi-view Perspective 4
Self-supervised Representation Learning with Relative Predictive Coding 4
Self-supervised Visual Reinforcement Learning with Object-centric Representations 2
Self-training For Few-shot Transfer Across Extreme Task Differences 4
Semantic Re-tuning with Contrastive Tension 4
Semi-supervised Keypoint Localization 3
SenSeI: Sensitive Set Invariance for Enforcing Individual Fairness 4
Separation and Concentration in Deep Networks 3
Seq2Tens: An Efficient Representation of Sequences by Low-Rank Tensor Projections 4
Sequential Density Ratio Estimation for Simultaneous Optimization of Speed and Accuracy 6
Set Prediction without Imposing Structure as Conditional Density Estimation 3
Shape or Texture: Understanding Discriminative Features in CNNs 4
Shape-Texture Debiased Neural Network Training 4
Shapley Explanation Networks 5
Shapley explainability on the data manifold 2
Share or Not? Learning to Schedule Language-Specific Capacity for Multilingual Translation 4
Sharper Generalization Bounds for Learning with Gradient-dominated Objective Functions 4
Sharpness-aware Minimization for Efficiently Improving Generalization 6
Signatory: differentiable computations of the signature and logsignature transforms, on both CPU and GPU 3
Simple Augmentation Goes a Long Way: ADRL for DNN Quantization 4
Simple Spectral Graph Convolution 4
Single-Photon Image Classification 2
Single-Timescale Actor-Critic Provably Finds Globally Optimal Policy 1
SkipW: Resource Adaptable RNN with Strict Upper Computational Limit 4
Sliced Kernelized Stein Discrepancy 3
Solving Compositional Reinforcement Learning Problems via Task Reduction 3
Sparse Quantized Spectral Clustering 2
Sparse encoding for more-interpretable feature-selecting representations in probabilistic matrix factorization 3
Spatial Dependency Networks: Neural Layers for Improved Generative Image Modeling 5
Spatially Structured Recurrent Modules 5
Spatio-Temporal Graph Scattering Transform 3
Stabilized Medical Image Attacks 4
Statistical inference for individual fairness 3
Stochastic Security: Adversarial Defense Using Long-Run Dynamics of Energy-Based Models 5
Structured Prediction as Translation between Augmented Natural Languages 5
Supervised Contrastive Learning for Pre-trained Language Model Fine-tuning 3
Support-set bottlenecks for video-text representation learning 4
Symmetry-Aware Actor-Critic for 3D Molecular Design 5
Systematic generalisation with group invariant predictions 4
Taking Notes on the Fly Helps Language Pre-Training 5
Taming GANs with Lookahead-Minmax 4
Targeted Attack against Deep Neural Networks via Flipping Limited Weight Bits 5
Task-Agnostic Morphology Evolution 3
Teaching Temporal Logics to Neural Networks 3
Teaching with Commentaries 5
Temporally-Extended ε-Greedy Exploration 4
Tent: Fully Test-Time Adaptation by Entropy Minimization 4
Text Generation by Learning from Demonstrations 6
The Deep Bootstrap Framework: Good Online Learners are Good Offline Generalizers 4
The Importance of Pessimism in Fixed-Dataset Policy Optimization 4
The Intrinsic Dimension of Images and Its Impact on Learning 2
The Recurrent Neural Tangent Kernel 3
The Risks of Invariant Risk Minimization 1
The Role of Momentum Parameters in the Optimal Convergence of Adaptive Polyak's Heavy-ball Methods 4
The Traveling Observer Model: Multi-task Learning Through Spatial Variable Embeddings 4
The Unreasonable Effectiveness of Patches in Deep Convolutional Kernels Methods 3
The geometry of integration in text classification RNNs 3
The inductive bias of ReLU networks on orthogonally separable data 2
The role of Disentanglement in Generalisation 3
Theoretical Analysis of Self-Training with Deep Networks on Unlabeled Data 2
Theoretical bounds on estimation error for meta-learning 2
Tilted Empirical Risk Minimization 5
Tomographic Auto-Encoder: Unsupervised Bayesian Recovery of Corrupted Data 4
Topology-Aware Segmentation Using Discrete Morse Theory 3
Towards Faster and Stabilized GAN Training for High-fidelity Few-shot Image Synthesis 4
Towards Impartial Multi-task Learning 5
Towards Nonlinear Disentanglement in Natural Data with Temporal Sparse Coding 3
Towards Resolving the Implicit Bias of Gradient Descent for Matrix Factorization: Greedy Low-Rank Learning 2
Towards Robust Neural Networks via Close-loop Control 4
Towards Robustness Against Natural Language Word Substitutions 5
Tradeoffs in Data Augmentation: An Empirical Study 4
Training BatchNorm and Only BatchNorm: On the Expressive Power of Random Features in CNNs 3
Training GANs with Stronger Augmentations via Contrastive Discriminator 5
Training independent subnetworks for robust prediction 5
Training with Quantization Noise for Extreme Model Compression 4
Trajectory Prediction using Equivariant Continuous Convolution 4
Transformer protein language models are unsupervised structure learners 5
Transient Non-stationarity and Generalisation in Deep Reinforcement Learning 3
TropEx: An Algorithm for Extracting Linear Terms in Deep Neural Networks 4
Trusted Multi-View Classification 4
UMEC: Unified model and embedding compression for efficient recommendation systems 6
UPDeT: Universal Multi-agent RL via Policy Decoupling with Transformers 3
Unbiased Teacher for Semi-Supervised Object Detection 4
Uncertainty Estimation and Calibration with Finite-State Probabilistic RNNs 5
Uncertainty Estimation in Autoregressive Structured Prediction 4
Uncertainty Sets for Image Classifiers using Conformal Prediction 5
Uncertainty in Gradient Boosting via Ensembles 4
Uncertainty-aware Active Learning for Optimal Bayesian Classifier 4
Understanding Over-parameterization in Generative Adversarial Networks 2
Understanding and Improving Encoder Layer Fusion in Sequence-to-Sequence Learning 4
Understanding and Improving Lexical Choice in Non-Autoregressive Translation 4
Understanding the effects of data parallelism and sparsity on neural network training 5
Understanding the failure modes of out-of-distribution generalization 3
Understanding the role of importance weighting for deep learning 1
Undistillable: Making A Nasty Teacher That CANNOT teach students 3
Universal Weakly Supervised Segmentation by Pixel-to-Segment Contrastive Learning 5
Universal approximation power of deep residual neural networks via nonlinear control theory 0
Unlearnable Examples: Making Personal Data Unexploitable 4
Unsupervised Audiovisual Synthesis via Exemplar Autoencoders 3
Unsupervised Discovery of 3D Physical Objects from Video 2
Unsupervised Meta-Learning through Latent-Space Interpolation in Generative Models 5
Unsupervised Object Keypoint Learning using Local Spatial Predictability 4
Unsupervised Representation Learning for Time Series with Temporal Neighborhood Coding 4
Usable Information and Evolution of Optimal Representations During Training 3
Using latent space regression to analyze and leverage compositionality in GANs 3
VA-RED$^2$: Video Adaptive Redundancy Reduction 6
VAEBM: A Symbiosis between Variational Autoencoders and Energy-based Models 4
VCNet and Functional Targeted Regularization For Learning Causal Effects of Continuous Treatments 4
VTNet: Visual Transformer Network for Object Goal Navigation 3
Variational Information Bottleneck for Effective Low-Resource Fine-Tuning 5
Variational Intrinsic Control Revisited 2
Variational State-Space Models for Localisation and Dense 3D Mapping in 6 DoF 5
Vector-output ReLU Neural Network Problems are Copositive Programs: Convex Analysis of Two Layer Networks and Polynomial-time Algorithms 5
Very Deep VAEs Generalize Autoregressive Models and Can Outperform Them on Images 4
Viewmaker Networks: Learning Views for Unsupervised Representation Learning 5
Vulnerability-Aware Poisoning Mechanism for Online RL with Unknown Dynamics 4
WaNet - Imperceptible Warping-based Backdoor Attack 5
Wandering within a world: Online contextualized few-shot learning 4
Wasserstein Embedding for Graph Learning 6
Wasserstein-2 Generative Networks 5
Watch-And-Help: A Challenge for Social Perception and Human-AI Collaboration 3
WaveGrad: Estimating Gradients for Waveform Generation 5
What Can You Learn From Your Muscles? Learning Visual Representation from Human Interactions 3
What Makes Instance Discrimination Good for Transfer Learning? 3
What Matters for On-Policy Deep Actor-Critic Methods? A Large-Scale Study 4
What Should Not Be Contrastive in Contrastive Learning 3
What are the Statistical Limits of Offline RL with Linear Function Approximation? 1
What they do when in doubt: a study of inductive biases in seq2seq learners 3
When Do Curricula Work? 6
When Optimizing $f$-Divergence is Robust with Label Noise 4
When does preconditioning help or hurt generalization? 2
Why Are Convolutional Nets More Sample-Efficient than Fully-Connected Nets? 3
Why resampling outperforms reweighting for correcting sampling bias with stochastic gradients 2
Winning the L2RPN Challenge: Power Grid Management via Semi-Markov Afterstate Actor-Critic 4
Witches' Brew: Industrial Scale Data Poisoning via Gradient Matching 6
WrapNet: Neural Net Inference with Ultra-Low-Precision Arithmetic 4
X2T: Training an X-to-Text Typing Interface with Online Learning from User Feedback 3
You Only Need Adversarial Supervision for Semantic Image Synthesis 5
Zero-Cost Proxies for Lightweight NAS 5
Zero-shot Synthesis with Group-Supervised Learning 3
gradSim: Differentiable simulation for system identification and visuomotor control 5
not-MIWAE: Deep Generative Modelling with Missing not at Random Data 3