Detecting Extrapolation with Local Ensembles
Authors: David Madras, James Atwood, Alexander D'Amour
ICLR 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimentally, we show that our method is capable of detecting when a pretrained model is extrapolating on test data, with applications to out-of-distribution detection, detecting spurious correlates, and active learning. |
| Researcher Affiliation | Collaboration | David Madras, University of Toronto / Vector Institute, madras@cs.toronto.edu; James Atwood, Google Brain, atwoodj@google.com; Alex D'Amour, Google Brain, alexdamour@google.com |
| Pseudocode | Yes | B.1 LANCZOS ALGORITHM CODE SNIPPET. Figure 9: Example Python implementation of the Lanczos algorithm for tridiagonalizing an implicit matrix M. (A minimal sketch of such an iteration appears after this table.) |
| Open Source Code | Yes | Code for running the local ensembles method can be found at https://github.com/dmadras/local-ensembles. |
| Open Datasets | Yes | Boston (Harrison Jr & Rubinfeld, 1978) and Diabetes (Efron et al., 2004). These datasets were loaded from Scikit-Learn (Pedregosa et al., 2011). Abalone (Nash et al., 1994). This dataset was downloaded from the UCI repository (Dua & Graff, 2017) at http://archive.ics.uci.edu/ml/datasets/Abalone. Wine Quality (Cortez et al., 2009). This dataset was downloaded from the UCI repository (Dua & Graff, 2017) at http://archive.ics.uci.edu/ml/datasets/Wine+Quality. We use MNIST (LeCun et al., 2010) and Fashion MNIST (Xiao et al., 2017) for our active learning experiments. We use the CelebA dataset (Liu et al., 2015) of celebrity faces. |
| Dataset Splits | Yes | We sample the validation set randomly as 20% of the training set. (A loading-and-splitting sketch appears after this table.) |
| Hardware Specification | No | The paper does not provide specific details about the hardware used for running the experiments, such as GPU models, CPU types, or memory specifications. |
| Software Dependencies | No | The paper mentions software like Python, NumPy, Scikit-learn, and TensorFlow Datasets, but it does not provide specific version numbers for these dependencies to ensure reproducibility. |
| Experiment Setup | Yes | We train a two-layer neural network with 3 hidden units in each layer and tanh units. We train for 400 optimization steps using minibatch size 32. We use batch size 64, patience 100, and a 100-step running average window for estimating current performance. For the Lanczos iteration, we run up to 2000 iterations. We use batch size 32, patience 100 steps, and a 100-step running average window for estimating current performance. We use two convolutional layers with 16 and 32 filters, stride size 5, and a dense layer on top with 64 units. We trained all models with mean squared error loss. (A minimal training sketch appears after this table.) |
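
The Pseudocode row points to appendix B.1, where the paper gives a Python snippet (Figure 9) for the Lanczos algorithm used to tridiagonalize an implicit matrix M, i.e. a matrix accessed only through matrix-vector products. The following is a minimal NumPy sketch of that kind of iteration, not the authors' Figure 9: the function name, the full reorthogonalization, and the explicit-matrix check at the bottom are my own choices.

```python
import numpy as np

def lanczos_tridiag(matvec, dim, num_iters, seed=0):
    """Lanczos tridiagonalization of a symmetric matrix accessed only via matvec.

    Returns (T, Q): T is tridiagonal and Q holds the Lanczos vectors as
    columns, so that Q^T M Q is approximately T.
    """
    rng = np.random.default_rng(seed)
    Q = np.zeros((dim, num_iters))
    alphas = np.zeros(num_iters)
    betas = np.zeros(num_iters - 1)

    # Random unit starting vector.
    q = rng.standard_normal(dim)
    Q[:, 0] = q / np.linalg.norm(q)

    for k in range(num_iters):
        w = matvec(Q[:, k])                      # implicit product M q_k
        alphas[k] = Q[:, k] @ w
        # Subtract projections onto the two most recent Lanczos vectors ...
        w -= alphas[k] * Q[:, k]
        if k > 0:
            w -= betas[k - 1] * Q[:, k - 1]
        # ... then re-orthogonalize against all previous vectors for stability.
        w -= Q[:, :k + 1] @ (Q[:, :k + 1].T @ w)
        if k < num_iters - 1:
            beta = np.linalg.norm(w)
            if beta < 1e-10:                     # invariant subspace found; stop early
                Q, alphas, betas = Q[:, :k + 1], alphas[:k + 1], betas[:k]
                break
            betas[k] = beta
            Q[:, k + 1] = w / beta

    T = np.diag(alphas) + np.diag(betas, 1) + np.diag(betas, -1)
    return T, Q

# Quick check against an explicit symmetric matrix: with as many iterations as
# the dimension, the eigenvalues of T should match those of M.
if __name__ == "__main__":
    A = np.random.randn(50, 50)
    M = A @ A.T
    T, Q = lanczos_tridiag(lambda v: M @ v, dim=50, num_iters=50)
    print(np.allclose(np.linalg.eigvalsh(T), np.linalg.eigvalsh(M), atol=1e-6))
```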
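The Open Datasets and Dataset Splits rows quote the use of standard public datasets and a validation set sampled randomly as 20% of the training data. A hedged illustration of how such a split could be set up with Scikit-Learn follows; the choice of the Diabetes dataset, the test fraction, and the random seeds are assumptions, not the paper's settings.

```python
from sklearn.datasets import load_diabetes
from sklearn.model_selection import train_test_split

# Load one of the regression datasets named in the paper (Diabetes here).
X, y = load_diabetes(return_X_y=True)

# Hold out a test set first (fraction is an assumption, not from the paper),
# then sample the validation set randomly as 20% of the training set.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0)
X_train, X_val, y_train, y_val = train_test_split(
    X_train, y_train, test_size=0.2, random_state=0)

print(X_train.shape, X_val.shape, X_test.shape)
```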
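The Experiment Setup row describes a two-layer tanh network with 3 hidden units per layer, trained with mean squared error for 400 optimization steps at minibatch size 32. A minimal Keras sketch consistent with that description is below; the optimizer, learning rate, and synthetic stand-in data are assumptions the quoted passage does not specify.

```python
import numpy as np
import tensorflow as tf

# Toy regression data standing in for the tabular datasets; shapes are arbitrary.
rng = np.random.default_rng(0)
X_train = rng.standard_normal((512, 8)).astype("float32")
y_train = rng.standard_normal((512, 1)).astype("float32")

# Two-layer network with 3 tanh units per hidden layer, trained with MSE,
# as in the Experiment Setup row. Adam and its learning rate are assumptions.
model = tf.keras.Sequential([
    tf.keras.layers.Dense(3, activation="tanh"),
    tf.keras.layers.Dense(3, activation="tanh"),
    tf.keras.layers.Dense(1),
])
model.compile(optimizer=tf.keras.optimizers.Adam(1e-3), loss="mse")

# 400 optimization steps with minibatch size 32:
# 512 examples / 32 per batch = 16 steps per epoch, so 25 epochs = 400 steps.
batch_size, num_steps = 32, 400
steps_per_epoch = len(X_train) // batch_size
model.fit(X_train, y_train,
          batch_size=batch_size,
          epochs=num_steps // steps_per_epoch,
          verbose=0)
```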