Detecting Extrapolation with Local Ensembles

Authors: David Madras, James Atwood, Alexander D'Amour

ICLR 2020

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Experimentally, we show that our method is capable of detecting when a pretrained model is extrapolating on test data, with applications to out-of-distribution detection, detecting spurious correlates, and active learning.
Researcher Affiliation	Collaboration	David Madras, University of Toronto, Vector Institute, madras@cs.toronto.edu; James Atwood, Google Brain, atwoodj@google.com; Alex D'Amour, Google Brain, alexdamour@google.com
Pseudocode	Yes	B.1 LANCZOS ALGORITHM CODE SNIPPET. Figure 9: Example Python implementation of Lanczos algorithm for tridiagonalizing an implicit matrix M.
Open Source Code	Yes	Code for running the local ensembles method can be found at https://github.com/dmadras/local-ensembles.
Open Datasets	Yes	Boston (Harrison Jr & Rubinfeld, 1978) and Diabetes (Efron et al., 2004). These datasets were loaded from Scikit-Learn (Pedregosa et al., 2011). Abalone (Nash et al., 1994). This dataset was downloaded from the UCI repository (Dua & Graff, 2017) at http://archive.ics.uci.edu/ml/datasets/Abalone. Wine Quality (Cortez et al., 2009). This dataset was downloaded from the UCI repository (Dua & Graff, 2017) at http://archive.ics.uci.edu/ml/datasets/Wine+Quality. We use MNIST (LeCun et al., 2010) and Fashion MNIST (Xiao et al., 2017) for our active learning experiments. We use the CelebA dataset (Liu et al., 2015) of celebrity faces
Dataset Splits	Yes	We sample the validation set randomly as 20% of the training set.
Hardware Specification	No	The paper does not provide specific details about the hardware used for running the experiments, such as GPU models, CPU types, or memory specifications.
Software Dependencies	No	The paper mentions software like Python, NumPy, Scikit-learn, and TensorFlow Datasets, but it does not provide specific version numbers for these dependencies to ensure reproducibility.
Experiment Setup	Yes	We train a two-layer neural network with 3 hidden units in each layer and tanh units. We train for 400 optimization steps using minibatch size 32. We use batch size 64, patience 100 and a 100-step running average window for estimating current performance. For the Lanczos iteration, we run up to 2000 iterations. We use batch size 32, patience 100 steps, and a 100-step running average window for estimating current performance. We use two convolutional layers with 16 and 32 layers, stride size 5, and a dense layer on top with 64 units. We trained all models with mean squared error loss.
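
The Pseudocode row refers to a Lanczos routine for tridiagonalizing an implicit matrix M, that is, a matrix reached only through matrix-vector products. The following is a minimal illustrative sketch of that general technique in NumPy, not the paper's Figure 9 code. The function name `lanczos_tridiag`, its parameters, the breakdown tolerance, and the use of full reorthogonalization are all assumptions made here for illustration.

```python
import numpy as np


def lanczos_tridiag(matvec, dim, num_iters, seed=0):
    """Tridiagonalize a symmetric operator using only matrix-vector products.

    Hypothetical helper (not from the paper). `matvec(v)` must return M @ v
    for a symmetric M of shape (dim, dim). Returns (T, Q), where Q has
    orthonormal columns and T = Q^T M Q is tridiagonal.
    """
    rng = np.random.default_rng(seed)
    q = rng.standard_normal(dim)
    q /= np.linalg.norm(q)

    Q = np.zeros((dim, num_iters))
    alphas = np.zeros(num_iters)
    betas = np.zeros(max(num_iters - 1, 0))
    q_prev = np.zeros(dim)
    beta = 0.0

    for j in range(num_iters):
        Q[:, j] = q
        # Three-term recurrence: remove components along the last two vectors.
        w = matvec(q) - beta * q_prev
        alphas[j] = q @ w
        w = w - alphas[j] * q
        # Full reorthogonalization against all previous Lanczos vectors
        # (an assumed choice here, made for numerical stability).
        w = w - Q[:, : j + 1] @ (Q[:, : j + 1].T @ w)
        if j == num_iters - 1:
            break
        beta = np.linalg.norm(w)
        if beta < 1e-10:
            # Breakdown: an invariant subspace was found, so truncate.
            Q, alphas, betas = Q[:, : j + 1], alphas[: j + 1], betas[:j]
            break
        betas[j] = beta
        q_prev, q = q, w / beta

    T = np.diag(alphas) + np.diag(betas, 1) + np.diag(betas, -1)
    return T, Q
```

A usage sketch under the same assumptions: pass a closure such as `lambda v: M @ v` for an explicit symmetric `M`, or any routine that computes the product without forming `M`. The eigenvalues of the returned `T` approximate the extreme eigenvalues of `M`, and they match its full spectrum when `num_iters` equals `dim` and no breakdown occurs.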