Improving Compositionality of Neural Networks by Decoding Representations to Inputs

Authors: Mike Wu, Noah Goodman, Stefano Ermon

NeurIPS 2021

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | In our experiments, we demonstrate applications of this uncertainty to out-of-distribution detection, adversarial example detection, and calibration while matching standard neural networks in accuracy. We further explore this compositionality by combining DecNN with pretrained models, where we show promising results that neural networks can be regularized from using protected features. |
| Researcher Affiliation | Academia | Mike Wu, Noah Goodman, Stefano Ermon; Department of Computer Science, Stanford University, Stanford, CA 94303; {wumike,ngoodman,ermon}@stanford.edu |
| Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide concrete access to source code for the methodology described, nor does it explicitly state that code will be released. |
| Open Datasets | Yes | "Method MNIST Fashion CelebA" (Table 1a header); "We utilize the AudioMNIST [3] and Fluent Speech Commands [27] datasets" |
| Dataset Splits | No | Table 1a shows accuracies over a held-out test set, averaged over three runs. |
| Hardware Specification | Yes | Table 10: Cost (seconds) of 1 epoch on a Titan X GPU (averaged over 10 epochs). |
| Software Dependencies | Yes | Blitz Bayesian neural networks [7] and PyTorch Lightning [8] for all of our models. [7] Piero Esposito. Blitz: Bayesian layers in Torch zoo (a Bayesian deep learning library for Torch). https://github.com/piEsposito/blitz-bayesian-deep-learning/, 2020. [8] Falcon, WA, et al. PyTorch Lightning. GitHub: https://github.com/PyTorchLightning/pytorch-lightning, 3, 2019. |
| Experiment Setup | Yes | The hyperparameter β > 0 is used to scale the auxiliary loss. We introduce a new hyperparameter α ∈ [0, 1] that geometrically downweights later depths. For the experiments above, we chose the recursive depth to be equal to the number of layers in the MLP (D = L = 8). |
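
As context for the dependency stack noted in the Software Dependencies row, here is a minimal sketch of how a Blitz Bayesian layer can sit inside a PyTorch Lightning module. The architecture, dimensions, and optimizer settings are illustrative assumptions, not the paper's configuration.

```python
import torch
import torch.nn.functional as F
import pytorch_lightning as pl
from blitz.modules import BayesianLinear


class BayesianMLP(pl.LightningModule):
    """Toy Bayesian MLP: blitz weight-distribution layers inside a LightningModule.

    Sizes and hyperparameters below are illustrative, not taken from the paper.
    """

    def __init__(self, in_dim=784, hidden=256, n_classes=10):
        super().__init__()
        # BayesianLinear places distributions over weights instead of point estimates.
        self.fc1 = BayesianLinear(in_dim, hidden)
        self.fc2 = BayesianLinear(hidden, n_classes)

    def forward(self, x):
        return self.fc2(F.relu(self.fc1(x.flatten(1))))

    def training_step(self, batch, batch_idx):
        x, y = batch
        loss = F.cross_entropy(self(x), y)
        self.log("train_loss", loss)
        return loss

    def configure_optimizers(self):
        return torch.optim.Adam(self.parameters(), lr=1e-3)


# Trained with the usual Lightning loop, e.g.:
#   pl.Trainer(max_epochs=1).fit(BayesianMLP(), train_dataloader)
```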
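The Experiment Setup row describes two hyperparameters: β scales the auxiliary loss, and α ∈ [0, 1] geometrically downweights terms from later recursive depths. A minimal sketch of that weighting scheme, assuming the per-depth auxiliary losses are already computed (the function name and default values are hypothetical):

```python
import torch


def total_loss(task_loss, aux_losses, beta=1.0, alpha=0.5):
    # Downweight the auxiliary loss at depth d by alpha**d, then scale the
    # summed auxiliary term by beta, per the quoted experiment setup.
    aux = sum((alpha ** d) * loss_d for d, loss_d in enumerate(aux_losses))
    return task_loss + beta * aux


# Illustrative usage with D = L = 8 depths, as in the quoted configuration.
aux_losses = [torch.rand(()) for _ in range(8)]
loss = total_loss(torch.tensor(0.7), aux_losses, beta=0.1, alpha=0.9)
```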