Detecting Extrapolation with Local Ensembles

Authors: David Madras, James Atwood, Alexander D'Amour

ICLR 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experimentally, we show that our method is capable of detecting when a pretrained model is extrapolating on test data, with applications to out-of-distribution detection, detecting spurious correlates, and active learning.
Researcher Affiliation | Collaboration | David Madras (University of Toronto, Vector Institute, madras@cs.toronto.edu); James Atwood (Google Brain, atwoodj@google.com); Alexander D'Amour (Google Brain, alexdamour@google.com)
Pseudocode | Yes | B.1 LANCZOS ALGORITHM CODE SNIPPET. Figure 9: Example Python implementation of the Lanczos algorithm for tridiagonalizing an implicit matrix M. (A hedged sketch of such a routine appears after this table.)
Open Source Code | Yes | Code for running the local ensembles method can be found at https://github.com/dmadras/local-ensembles.
Open Datasets | Yes | Boston (Harrison Jr & Rubinfeld, 1978) and Diabetes (Efron et al., 2004), loaded from Scikit-Learn (Pedregosa et al., 2011). Abalone (Nash et al., 1994), downloaded from the UCI repository (Dua & Graff, 2017) at http://archive.ics.uci.edu/ml/datasets/Abalone. Wine Quality (Cortez et al., 2009), downloaded from the UCI repository (Dua & Graff, 2017) at http://archive.ics.uci.edu/ml/datasets/Wine+Quality. We use MNIST (LeCun et al., 2010) and Fashion-MNIST (Xiao et al., 2017) for our active learning experiments, and the CelebA dataset (Liu et al., 2015) of celebrity faces.
Dataset Splits | Yes | We sample the validation set randomly as 20% of the training set. (A loading-and-splitting sketch appears after this table.)
Hardware Specification | No | The paper does not provide specific details about the hardware used for running the experiments, such as GPU models, CPU types, or memory specifications.
Software Dependencies | No | The paper mentions software such as Python, NumPy, Scikit-learn, and TensorFlow Datasets, but it does not provide specific version numbers for these dependencies to ensure reproducibility.
Experiment Setup | Yes | We train a two-layer neural network with 3 hidden units in each layer and tanh units, for 400 optimization steps using minibatch size 32. We use batch size 64, patience 100, and a 100-step running-average window for estimating current performance. For the Lanczos iteration, we run up to 2000 iterations. We use batch size 32, patience 100 steps, and a 100-step running-average window for estimating current performance. We use two convolutional layers with 16 and 32 filters, stride size 5, and a dense layer on top with 64 units. We trained all models with mean squared error loss. (A hedged training sketch appears after this table.)
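For context on the pseudocode item above, here is a minimal NumPy sketch of a Lanczos iteration for tridiagonalizing a symmetric matrix that is available only through matrix-vector products, in the spirit of the paper's Figure 9. The function name `implicit_mvp`, the full reorthogonalization, and the stopping tolerance are illustrative assumptions, not the authors' released code.

```python
import numpy as np

def lanczos_tridiag(implicit_mvp, dim, num_iters, seed=0):
    """Lanczos iteration for a symmetric matrix M available only through
    matrix-vector products.

    implicit_mvp: callable mapping a length-`dim` vector v to M @ v.
    Returns (T, Q) with T (k x k) tridiagonal and Q (dim x k) having
    orthonormal columns, so that Q.T @ M @ Q is approximately T.
    """
    rng = np.random.default_rng(seed)
    q = rng.standard_normal(dim)
    q /= np.linalg.norm(q)

    alphas, betas, qs = [], [], [q]
    q_prev, beta = np.zeros(dim), 0.0
    for _ in range(num_iters):
        w = implicit_mvp(q)
        alpha = q @ w
        w = w - alpha * q - beta * q_prev
        # Full reorthogonalization against previous Lanczos vectors
        # (costlier but numerically safer than the textbook recurrence).
        for q_old in qs:
            w -= (q_old @ w) * q_old
        alphas.append(alpha)
        beta = np.linalg.norm(w)
        if beta < 1e-10:  # invariant subspace found; stop early
            break
        betas.append(beta)
        q_prev, q = q, w / beta
        qs.append(q)

    k = len(alphas)
    T = (np.diag(alphas)
         + np.diag(betas[:k - 1], 1)
         + np.diag(betas[:k - 1], -1))
    Q = np.stack(qs[:k], axis=1)
    return T, Q
```

In the paper's setting, the implicit matrix would be the training-loss Hessian accessed through Hessian-vector products, and the eigenpairs of the small tridiagonal matrix T approximate the extreme eigenpairs of M.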
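The dataset-splits item states only that the validation set is sampled as 20% of the training set. A minimal sketch of that split on one of the Scikit-Learn datasets follows; the use of `load_diabetes`, `train_test_split`, the test-set fraction, and the random seeds are assumptions for illustration.

```python
from sklearn.datasets import load_diabetes
from sklearn.model_selection import train_test_split

# Load one of the tabular regression datasets named above.
X, y = load_diabetes(return_X_y=True)

# Hold out a test set (the 20% test fraction here is an assumption),
# then sample the validation set randomly as 20% of the training set.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0)
X_train, X_val, y_train, y_val = train_test_split(
    X_train, y_train, test_size=0.2, random_state=0)
```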
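Finally, a minimal sketch of the small regression network from the experiment-setup item: two tanh layers with 3 hidden units each, mean squared error loss, and 400 optimization steps with minibatch size 32. The use of Keras, the Adam optimizer, and the synthetic stand-in data are assumptions not stated in the paper.

```python
import numpy as np
import tensorflow as tf

# Synthetic stand-in data; in the paper this would be one of the tabular
# regression datasets listed above.
rng = np.random.default_rng(0)
X_train = rng.standard_normal((500, 10)).astype("float32")
y_train = rng.standard_normal((500, 1)).astype("float32")

# Two-layer tanh network with 3 hidden units per layer and an MSE loss,
# matching the experiment-setup description; the optimizer is an assumption.
model = tf.keras.Sequential([
    tf.keras.layers.Dense(3, activation="tanh"),
    tf.keras.layers.Dense(3, activation="tanh"),
    tf.keras.layers.Dense(1),
])
model.compile(optimizer="adam", loss="mse")

# 400 optimization steps with minibatch size 32.
batch_size, num_steps = 32, 400
dataset = (tf.data.Dataset.from_tensor_slices((X_train, y_train))
           .shuffle(500).repeat().batch(batch_size))
model.fit(dataset, steps_per_epoch=num_steps, epochs=1)
```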