Deep Neural Networks Tend To Extrapolate Predictably
Authors: Katie Kang, Amrith Setlur, Claire Tomlin, Sergey Levine
ICLR 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We present results showing this phenomenon across 8 datasets with different distributional shifts (including CIFAR10-C and ImageNet-R, S), different loss functions (cross entropy, MSE, and Gaussian NLL), and different architectures (CNNs and transformers). Our experiments show that the amount of distributional shift correlates strongly with the distance between model outputs and the OCS across 8 datasets, including both vision and NLP domains, 3 loss functions, and for both CNNs and transformers. |
| Researcher Affiliation | Academia | Katie Kang¹, Amrith Setlur², Claire Tomlin¹, Sergey Levine¹ (¹UC Berkeley, ²Carnegie Mellon University) |
| Pseudocode | No | The paper does not contain any explicit pseudocode or algorithm blocks. |
| Open Source Code | Yes | Code: https://github.com/katiekang1998/cautious_extrapolation |
| Open Datasets | Yes | The datasets with discrete labels include CIFAR10 (Krizhevsky et al., 2009), ImageNet (Deng et al., 2009) (subsampled to 200 classes to match ImageNet-R (Rendition) (Hendrycks et al., 2021)), DomainBed OfficeHome (Gulrajani & Lopez-Paz, 2020), BREEDS LIVING-17 and NON-LIVING-26 (Santurkar et al., 2020), and WILDS Amazon (Koh et al., 2021), and the datasets with continuous labels include Skin Lesion Pixels (Gustafsson et al., 2023) and UTKFace (Zhang et al., 2017). |
| Dataset Splits | No | The paper describes methods for evaluating OOD scores and comparing model predictions to OCS, but it does not specify explicit train/validation/test dataset splits with percentages, sample counts, or references to predefined splits for the main models used in experiments. |
| Hardware Specification | No | No specific hardware details (such as GPU or CPU models, or cloud computing instance types) are provided in the paper. |
| Software Dependencies | No | The paper mentions software components such as 'ResNet', 'VGG', and 'DistilBERT Tokenizer' but does not specify version numbers for these or other software libraries/frameworks used. |
| Experiment Setup | Yes (see the sketches below the table) | First, we will discuss the parameters we used to train our models. Task Network Architecture: MNIST (2 convolution layers followed by 2 fully connected layers with ReLU nonlinearities), CIFAR10 (ResNet20), ImageNet (ResNet50), OfficeHome (ResNet50), BREEDS (ResNet18), Amazon (DistilBERT), Skin Lesion Pixels (ResNet34), UTKFace (custom VGG-style architecture). Task Optimizer: MNIST (Adam, lr 0.001), CIFAR10 (SGD, lr 0.1), ImageNet (SGD, lr 0.1), OfficeHome (Adam, lr 0.00005), BREEDS (SGD, lr 0.2), Amazon (AdamW, lr 0.00001), Skin Lesion Pixels (Adam, lr 0.001), UTKFace (Adam, lr 0.001). Learning rate scheduler, weight decay, momentum, and data preprocessing settings are also listed in tables. |
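
The Experiment Setup row lists architectures and optimizer settings per dataset. As a concrete illustration, here is a minimal PyTorch sketch of the MNIST task network (2 convolution layers followed by 2 fully connected layers with ReLU nonlinearities) trained with Adam at lr 0.001. Channel counts, kernel sizes, pooling, and the hidden width are assumptions for illustration; the excerpt above does not specify them.

```python
# Hedged sketch of the MNIST task network from the Experiment Setup row.
# Layer widths and kernel sizes are illustrative assumptions, not values
# reported by the paper.
import torch
import torch.nn as nn

class MNISTNet(nn.Module):
    def __init__(self, num_classes: int = 10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 32, kernel_size=3, padding=1),   # assumed width
            nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(32, 64, kernel_size=3, padding=1),  # assumed width
            nn.ReLU(),
            nn.MaxPool2d(2),
        )
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.Linear(64 * 7 * 7, 128),  # 28x28 input pooled twice -> 7x7
            nn.ReLU(),
            nn.Linear(128, num_classes),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.classifier(self.features(x))

model = MNISTNet()
# Optimizer and learning rate as reported in the setup row.
optimizer = torch.optim.Adam(model.parameters(), lr=0.001)
```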
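
The Research Type row quotes the paper's central measurement: the distance between model outputs and the optimal constant solution (OCS), i.e. the constant prediction that minimizes the average training loss. For cross entropy, that constant is the empirical marginal distribution of the training labels. The sketch below shows how such a measurement could be computed; the function names and the choice of KL divergence as the distance are illustrative assumptions, not taken from the authors' repository.

```python
# Hedged sketch of a "distance to the OCS" measurement for cross entropy.
# KL divergence is an assumed distance metric; the data is synthetic.
import numpy as np

def ocs_cross_entropy(train_labels: np.ndarray, num_classes: int) -> np.ndarray:
    """OCS for cross entropy: the empirical marginal label distribution."""
    counts = np.bincount(train_labels, minlength=num_classes)
    return counts / counts.sum()

def mean_kl_to_ocs(pred_probs: np.ndarray, ocs: np.ndarray) -> float:
    """Average KL(prediction || OCS) over a batch of softmax outputs."""
    eps = 1e-12
    kl = np.sum(pred_probs * (np.log(pred_probs + eps) - np.log(ocs + eps)), axis=1)
    return float(np.mean(kl))

# Stand-in data: the paper reports this distance shrinking as inputs move
# further out of distribution, i.e. predictions drift toward the OCS.
labels = np.random.randint(0, 10, size=50_000)   # synthetic training labels
ocs = ocs_cross_entropy(labels, num_classes=10)
probs = np.full((128, 10), 0.1)                  # synthetic OOD predictions
print(mean_kl_to_ocs(probs, ocs))                # ~0 when predictions equal the OCS
```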