Improving Compositionality of Neural Networks by Decoding Representations to Inputs
Authors: Mike Wu, Noah Goodman, Stefano Ermon
NeurIPS 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In our experiments, we demonstrate applications of this uncertainty to out-of-distribution detection, adversarial example detection, and calibration while matching standard neural networks in accuracy. We further explore this compositionality by combining DecNN with pretrained models, where we show promising results that neural networks can be regularized from using protected features. |
| Researcher Affiliation | Academia | Mike Wu, Noah Goodman, Stefano Ermon; Department of Computer Science, Stanford University, Stanford, CA 94305; {wumike,ngoodman,ermon}@stanford.edu |
| Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide access to source code for the methodology described, nor does it state that code will be released. |
| Open Datasets | Yes | Method, MNIST, Fashion, CelebA (Table 1a header); We utilize the AudioMNIST [3] and Fluent Speech Commands [27] datasets |
| Dataset Splits | No | Table 1a shows accuracies over a held-out test set, averaged over three runs. |
| Hardware Specification | Yes | Table 10: Cost (seconds) of 1 epoch on a Titan X GPU (averaged over 10 epochs). |
| Software Dependencies | Yes | Blitz Bayesian neural networks [7] and PyTorch Lightning [8] for all of our models. [7] Piero Esposito. BLiTZ: Bayesian layers in Torch zoo (a Bayesian deep learning library for Torch). https://github.com/piEsposito/blitz-bayesian-deep-learning/, 2020. [8] W. A. Falcon et al. PyTorch Lightning. GitHub, https://github.com/PyTorchLightning/pytorch-lightning, 2019. (A minimal sketch of this stack follows the table.) |
| Experiment Setup | Yes | The hyperparameter β > 0 is used to scale the auxiliary loss. We introduce a new hyperparameter α ∈ [0, 1] that geometrically downweights later depths. For the experiments above, we chose the recursive depth to be equal to the number of layers in the MLP (D = L = 8). (A sketch of this loss weighting follows the table.) |
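
The Software Dependencies row quotes BLiTZ [7] and PyTorch Lightning [8]. Below is a minimal sketch of how these two libraries typically fit together: a stack of BLiTZ `BayesianLinear` layers trained inside a `LightningModule`. The architecture, dimensions, and hyperparameters are illustrative placeholders, not the paper's configuration.

```python
import torch
import torch.nn as nn
import pytorch_lightning as pl
from blitz.modules import BayesianLinear
from blitz.utils import variational_estimator


@variational_estimator  # adds sample_elbo() to the decorated module
class BayesianMLP(nn.Module):
    """Illustrative Bayesian MLP; sizes are placeholders."""

    def __init__(self, in_dim=784, hidden=256, out_dim=10):
        super().__init__()
        self.net = nn.Sequential(
            BayesianLinear(in_dim, hidden),
            nn.ReLU(),
            BayesianLinear(hidden, out_dim),
        )

    def forward(self, x):
        return self.net(x)


class LitBayesianMLP(pl.LightningModule):
    def __init__(self):
        super().__init__()
        self.model = BayesianMLP()

    def training_step(self, batch, batch_idx):
        x, y = batch
        # ELBO loss: expected NLL plus KL between weight posterior and
        # prior, estimated by resampling the weights sample_nbr times.
        loss = self.model.sample_elbo(
            inputs=x.view(x.size(0), -1),
            labels=y,
            criterion=nn.CrossEntropyLoss(),
            sample_nbr=3,
        )
        self.log("train_loss", loss)
        return loss

    def configure_optimizers(self):
        return torch.optim.Adam(self.parameters(), lr=1e-3)
```

Training would then run in the usual Lightning style, e.g. `pl.Trainer(max_epochs=10).fit(LitBayesianMLP(), train_loader)`.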
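The Experiment Setup row describes β scaling an auxiliary loss and α ∈ [0, 1] geometrically downweighting later depths, with recursive depth D = L = 8. The sketch below shows one plausible reading of that weighting scheme; the names `total_loss` and `aux_losses` and the choice to start depth indexing at 0 are assumptions, not the paper's code.

```python
import torch


def total_loss(task_loss, aux_losses, beta=1.0, alpha=0.5):
    """Combine a task loss with geometrically downweighted auxiliary losses.

    task_loss:  scalar tensor for the main objective.
    aux_losses: one scalar tensor per recursive depth d = 0..D-1
                (the paper uses D = L = 8 for an 8-layer MLP).
    beta:       scales the whole auxiliary term (beta > 0).
    alpha:      geometric downweighting of later depths (alpha in [0, 1]).
    """
    aux = sum(alpha ** d * loss_d for d, loss_d in enumerate(aux_losses))
    return task_loss + beta * aux


# Example: a classification loss plus 8 per-depth auxiliary losses.
task = torch.tensor(0.7)
aux = [torch.tensor(0.1) for _ in range(8)]  # D = L = 8
loss = total_loss(task, aux, beta=0.5, alpha=0.9)
```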