Explaining Predictive Uncertainty with Information Theoretic Shapley Values
Authors: David Watson, Joshua O'Hara, Niek Tax, Richard Mudd, Ido Guy
NeurIPS 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We implement model-specific and model-agnostic variants of our method and illustrate their performance in a range of simulated and real-world experiments, with applications to feature selection, covariate shift detection, and active learning. |
| Researcher Affiliation | Collaboration | David S. Watson (King's College London, david.watson@kcl.ac.uk); Joshua O'Hara (King's College London); Niek Tax (Meta, Central Applied Science); Richard Mudd (Meta, Central Applied Science); Ido Guy (Meta, Central Applied Science) |
| Pseudocode | No | The paper does not contain any clearly labeled "Pseudocode" or "Algorithm" blocks, nor does it present structured steps formatted like code. |
| Open Source Code | Yes | Code for all experiments and figures can be found in our dedicated GitHub repository. |
| Open Datasets | Yes | The MNIST dataset is available online. The IMDB dataset is available on Kaggle. For the tabular data experiment in Sect. 6.1, we generate Y according to the following process: $\mu(x) := \beta^\top x$, $\sigma^2(x) := \exp(\gamma^\top x)$, $Y \mid x \sim \mathcal{N}(\mu(x), \sigma^2(x))$. Coefficients $\beta, \gamma$ are independent Rademacher-distributed random vectors of length 4. The Breast Cancer, Diabetes, Ionosphere, and Sonar datasets are all distributed in the mlbench package, which is available on CRAN. (A simulation sketch of this generative process follows the table.) |
| Dataset Splits | Yes | Partition the $n$ training samples $\{(x^{(i)}, y^{(i)})\}_{i=1}^{n} = \mathcal{D}$ into two equal-sized subsets $I_1, I_2$, where $I_1$ is used for model fitting and $I_2$ for computing Shapley values. We start with four binary classification datasets from the UCI Machine Learning Repository [17] (Breast Cancer, Diabetes, Ionosphere, and Sonar) and make a random 80/20 train/test split on each. (A split sketch follows the table.) |
| Hardware Specification | No | The paper does not provide specific hardware details such as GPU models (e.g., NVIDIA A100), CPU models (e.g., Intel Core i7), or cloud instance types used for running its experiments. |
| Software Dependencies | No | The paper mentions software like PyTorch, BERT (Hugging Face transformers library), and XGBoost, but does not provide specific version numbers for these dependencies, which are required for a reproducible description of ancillary software. |
| Experiment Setup | Yes | For the MNIST experiment, we train a deep neural network with the following model architecture: (1) A convolutional layer with 10 filters of size 5×5, followed by max pooling of size 2×2, ReLU activation, and a dropout layer with probability 0.3. (2) A convolutional layer with 20 filters of size 5×5, followed by a dropout layer, max pooling of size 2×2, ReLU activation, and a dropout layer with probability 0.3. (3) Fully connected (dense) layer with 320 input features and 50 output units, followed by ReLU activation and a dropout layer. (4) Fully connected layer with 50 input features and 10 output units, followed by softmax activation. We train with a batch size of 128 for 20 epochs at a learning rate of 0.01 and momentum 0.5. For Monte Carlo dropout, we do 50 forward passes to sample B = 50 subnetworks. (A PyTorch sketch of this setup follows the table.) |
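The generative process quoted in the Open Datasets row can be reproduced in a few lines of NumPy. The sketch below is illustrative only: the sample size `n` and the standard-normal covariate distribution are assumptions, since the quoted text fixes only the coefficient distribution (Rademacher) and the dimension (4).

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 1000, 4  # n is assumed; the paper fixes d = 4

# Rademacher coefficient vectors: each entry is +1 or -1 with equal probability
beta = rng.choice([-1.0, 1.0], size=d)
gamma = rng.choice([-1.0, 1.0], size=d)

# Covariate distribution is not given in the quoted text; standard normal assumed
X = rng.standard_normal((n, d))

mu = X @ beta               # mu(x) = beta^T x
sigma2 = np.exp(X @ gamma)  # sigma^2(x) = exp(gamma^T x)

# Heteroskedastic response: Y | x ~ N(mu(x), sigma^2(x))
Y = rng.normal(loc=mu, scale=np.sqrt(sigma2))
```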
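The two splitting schemes quoted in the Dataset Splits row (the equal-sized $I_1/I_2$ partition and the random 80/20 train/test split) are standard. A minimal sketch, reusing `rng`, `n`, `X`, and `Y` from the simulation above and assuming scikit-learn for the 80/20 split:

```python
from sklearn.model_selection import train_test_split

# Equal-sized partition: I1 for model fitting, I2 for computing Shapley values
idx = rng.permutation(n)
I1, I2 = idx[: n // 2], idx[n // 2 :]
X_fit, Y_fit = X[I1], Y[I1]
X_shap, Y_shap = X[I2], Y[I2]

# Random 80/20 train/test split, as used for the four UCI datasets
X_train, X_test, Y_train, Y_test = train_test_split(
    X, Y, test_size=0.2, random_state=0
)
```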
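The MNIST architecture and Monte Carlo dropout procedure in the Experiment Setup row map directly onto PyTorch. The sketch below follows the quoted layer-by-layer description and hyperparameters; the training loop is omitted, and the `mc_dropout_predict` helper is an illustrative name, not code from the paper's repository.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MCDropoutNet(nn.Module):
    """CNN matching the layer description quoted above (input: 1x28x28 MNIST)."""
    def __init__(self, p: float = 0.3):
        super().__init__()
        self.conv1 = nn.Conv2d(1, 10, kernel_size=5)
        self.conv2 = nn.Conv2d(10, 20, kernel_size=5)
        self.conv2_drop = nn.Dropout2d(p)  # dropout directly after the second conv
        self.fc1 = nn.Linear(320, 50)      # 20 channels * 4 * 4 = 320 features
        self.fc2 = nn.Linear(50, 10)
        self.drop = nn.Dropout(p)

    def forward(self, x):
        # (1) conv -> 2x2 max pool -> ReLU -> dropout
        x = self.drop(F.relu(F.max_pool2d(self.conv1(x), 2)))
        # (2) conv -> dropout -> 2x2 max pool -> ReLU -> dropout
        x = self.drop(F.relu(F.max_pool2d(self.conv2_drop(self.conv2(x)), 2)))
        # (3) dense 320 -> 50 with ReLU and dropout
        x = self.drop(F.relu(self.fc1(x.view(-1, 320))))
        # (4) dense 50 -> 10 with softmax
        return F.softmax(self.fc2(x), dim=1)

model = MCDropoutNet()
# Quoted hyperparameters: lr 0.01, momentum 0.5 (batch size 128, 20 epochs
# would apply in the omitted training loop)
optimizer = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.5)

def mc_dropout_predict(model: nn.Module, x: torch.Tensor, B: int = 50):
    """B = 50 stochastic forward passes, i.e. sampling B dropout subnetworks."""
    model.train()  # keep dropout layers active at inference time
    with torch.no_grad():
        probs = torch.stack([model(x) for _ in range(B)])  # shape (B, n, 10)
    return probs.mean(dim=0), probs.var(dim=0)  # predictive mean and variance
```

Averaging the B softmax outputs yields the predictive distribution; the spread across passes is one common summary of epistemic uncertainty in the MC dropout literature.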