Deep Neural Networks Tend To Extrapolate Predictably
Authors: Katie Kang, Amrith Setlur, Claire Tomlin, Sergey Levine
ICLR 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We present results showing this phenomenon across 8 datasets with different distributional shifts (including CIFAR10-C and ImageNet-R, S), different loss functions (cross entropy, MSE, and Gaussian NLL), and different architectures (CNNs and transformers). Our experiments show that the amount of distributional shift correlates strongly with the distance between model outputs and the OCS across 8 datasets, including both vision and NLP domains, 3 loss functions, and for both CNNs and transformers. |
| Researcher Affiliation | Academia | Katie Kang¹, Amrith Setlur², Claire Tomlin¹, Sergey Levine¹ (¹UC Berkeley, ²Carnegie Mellon University) |
| Pseudocode | No | The paper does not contain any explicit pseudocode or algorithm blocks. |
| Open Source Code | Yes | Code: https://github.com/katiekang1998/cautious_extrapolation |
| Open Datasets | Yes | The datasets with discrete labels include CIFAR10 (Krizhevsky et al., 2009), ImageNet (Deng et al., 2009) (subsampled to 200 classes to match ImageNet-R (Rendition) (Hendrycks et al., 2021)), DomainBed OfficeHome (Gulrajani & Lopez-Paz, 2020), BREEDS LIVING-17 and NON-LIVING-26 (Santurkar et al., 2020), and WILDS Amazon (Koh et al., 2021), and the datasets with continuous labels include Skin Lesion Pixels (Gustafsson et al., 2023) and UTKFace (Zhang et al., 2017). |
| Dataset Splits | No | The paper describes methods for evaluating OOD scores and comparing model predictions to OCS, but it does not specify explicit train/validation/test dataset splits with percentages, sample counts, or references to predefined splits for the main models used in experiments. |
| Hardware Specification | No | No specific hardware details (such as GPU or CPU models, or cloud computing instance types) are provided in the paper. |
| Software Dependencies | No | The paper mentions software components such as 'ResNet', 'VGG', and 'DistilBERT Tokenizer' but does not specify version numbers for these or other software libraries/frameworks used. |
| Experiment Setup | Yes (see the sketches below the table) | First, we will discuss the parameters we used to train our models. Task Network Architecture: MNIST (2 convolution layers followed by 2 fully connected layers with ReLU nonlinearities), CIFAR10 (ResNet20), ImageNet (ResNet50), OfficeHome (ResNet50), BREEDS (ResNet18), Amazon (DistilBERT), Skin Lesion Pixels (ResNet34), UTKFace (custom VGG-style architecture). Task Optimizer: MNIST (Adam, lr 0.001), CIFAR10 (SGD, lr 0.1), ImageNet (SGD, lr 0.1), OfficeHome (Adam, lr 0.00005), BREEDS (SGD, lr 0.2), Amazon (AdamW, lr 0.00001), Skin Lesion Pixels (Adam, lr 0.001), UTKFace (Adam, lr 0.001). Learning rate scheduler, weight decay, momentum, and data preprocessing settings are also listed in tables. |
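
The Experiment Setup row lists architectures and optimizer settings per dataset. As a concrete illustration, here is a minimal PyTorch sketch of the MNIST task network (2 convolution layers followed by 2 fully connected layers with ReLU nonlinearities) trained with Adam at lr 0.001. Channel counts, kernel sizes, pooling, and the hidden width are assumptions for illustration; the excerpt above does not specify them.

```python
# Hedged sketch of the MNIST task network from the Experiment Setup row.
# Layer widths and kernel sizes are illustrative assumptions, not values
# reported by the paper.
import torch
import torch.nn as nn

class MNISTNet(nn.Module):
    def __init__(self, num_classes: int = 10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 32, kernel_size=3, padding=1),   # assumed width
            nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(32, 64, kernel_size=3, padding=1),  # assumed width
            nn.ReLU(),
            nn.MaxPool2d(2),
        )
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.Linear(64 * 7 * 7, 128),  # 28x28 input pooled twice -> 7x7
            nn.ReLU(),
            nn.Linear(128, num_classes),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.classifier(self.features(x))

model = MNISTNet()
# Optimizer and learning rate as reported in the setup row.
optimizer = torch.optim.Adam(model.parameters(), lr=0.001)
```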
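
The Research Type row quotes the paper's central measurement: the distance between model outputs and the optimal constant solution (OCS), i.e. the constant prediction that minimizes the average training loss. For cross entropy, that constant is the empirical marginal distribution of the training labels. The sketch below shows how such a measurement could be computed; the function names and the choice of KL divergence as the distance are illustrative assumptions, not taken from the authors' repository.

```python
# Hedged sketch of a "distance to the OCS" measurement for cross entropy.
# KL divergence is an assumed distance metric; the data is synthetic.
import numpy as np

def ocs_cross_entropy(train_labels: np.ndarray, num_classes: int) -> np.ndarray:
    """OCS for cross entropy: the empirical marginal label distribution."""
    counts = np.bincount(train_labels, minlength=num_classes)
    return counts / counts.sum()

def mean_kl_to_ocs(pred_probs: np.ndarray, ocs: np.ndarray) -> float:
    """Average KL(prediction || OCS) over a batch of softmax outputs."""
    eps = 1e-12
    kl = np.sum(pred_probs * (np.log(pred_probs + eps) - np.log(ocs + eps)), axis=1)
    return float(np.mean(kl))

# Stand-in data: the paper reports this distance shrinking as inputs move
# further out of distribution, i.e. predictions drift toward the OCS.
labels = np.random.randint(0, 10, size=50_000)   # synthetic training labels
ocs = ocs_cross_entropy(labels, num_classes=10)
probs = np.full((128, 10), 0.1)                  # synthetic OOD predictions
print(mean_kl_to_ocs(probs, ocs))                # ~0 when predictions equal the OCS
```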