reproducibilityindex.ai

Concept Activation Regions: A Generalized Framework For Concept-Based Explanations

Authors: Jonathan Crabbé, Mihaela van der Schaar

NeurIPS 2022

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	3 Experiments. The code to reproduce all the experiments from this section is available at https://github.com/JonathanCrabbe/CARs and https://github.com/vanderschaarlab/CARs. 3.1 Empirical Evaluation. Our purpose is to empirically validate the formalism described in the previous section. We have several independent components to evaluate: (1) the concept classifier used to detect the CARs H^c, (2) the global explanations induced by the TCAR values and (3) the feature importance scores induced by the concept densities ρ^c. Datasets. We perform our experiments on 3 datasets.
Researcher Affiliation	Academia	Jonathan Crabbé University of Cambridge jc2133@cam.ac.uk Mihaela van der Schaar University of Cambridge The Alan Turing Institute UCLA mv472@cam.ac.uk
Pseudocode	Yes	The implementation of our method closely follows Algorithms 1, 2 and 4 in the appendices.
Open Source Code	Yes	The code to reproduce all the experiments from this section is available at https://github.com/JonathanCrabbe/CARs and https://github.com/vanderschaarlab/CARs.
Open Datasets	Yes	Datasets. We perform our experiments on 3 datasets. (1) The MNIST dataset [56]... (2) The MIT-BIH Electrocardiogram (ECG) dataset [57, 58]... (3) The Caltech-UCSD Birds-200 (CUB) dataset [59]... We use the data collected with the Surveillance, Epidemiology, and End Results (SEER) Program. The dataset [69]...
Dataset Splits	Yes	We train a multilayer perceptron (MLP) to predict the patient's mortality on 90% of the data and test on the remaining 10%. The classifier is then evaluated by computing its accuracy on a holdout balanced concept set T^c of size 100 sampled from the model's testing set.
Hardware Specification	No	Our computing resources are described in Appendices E and F. These appendices are not included in the provided text, so specific hardware details are not explicitly described in the main body.
Software Dependencies	No	The paper mentions using Python, Scikit-learn, PyTorch, and Captum, and states that software dependencies are described in Appendices E and F. However, the provided text does not specify version numbers for these software components. For example, [71] cites 'Scikit-learn: Machine learning in Python' and [72] cites 'Captum: A unified and generic model interpretability library for PyTorch' without specific version numbers for Scikit-learn, PyTorch, or Captum.
Experiment Setup	Yes	For several of those latent spaces, we fit our CAR classifier (SVC with radial basis function kernel) to discriminate the concept sets P^c, N^c for each concept c ∈ [C]. These two sets have a size N^c = 200 and are sampled from the model's training set. We train a multilayer perceptron (MLP)... We fit a CAR classifier (SVC with linear kernel) to discriminate the concept sets P^c, N^c for each grade c ∈ [5]. Both of those sets contain N^c = 250 patients sampled from the training set.
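
The split and concept-set sizes quoted in the table can be illustrated with a minimal sketch. It is not taken from the authors' repository: the helper names `split_indices` and `sample_concept_sets`, the toy labels, and the fixed seeds are all hypothetical. It also assumes that "balanced" means equal numbers of positive and negative examples, so a holdout set of size 100 is read as 50 positives plus 50 negatives.

```python
import random


def split_indices(n, train_frac=0.9, seed=0):
    """Shuffle range(n) and split it into train/test index lists."""
    rng = random.Random(seed)
    idx = list(range(n))
    rng.shuffle(idx)
    cut = int(round(train_frac * n))
    return idx[:cut], idx[cut:]


def sample_concept_sets(indices, has_concept, size, seed=0):
    """Draw `size` concept positives and `size` negatives from `indices`."""
    rng = random.Random(seed)
    pos = [i for i in indices if has_concept[i]]
    neg = [i for i in indices if not has_concept[i]]
    if len(pos) < size or len(neg) < size:
        raise ValueError("not enough examples to draw balanced concept sets")
    return rng.sample(pos, size), rng.sample(neg, size)


# Toy data (assumption): 5000 examples, every third one shows the concept.
n_examples = 5000
has_concept = [i % 3 == 0 for i in range(n_examples)]

# 90% / 10% train/test split, as in the "Dataset Splits" row.
train_idx, test_idx = split_indices(n_examples, train_frac=0.9)

# Positive and negative concept sets of size 200 drawn from the training set.
positives, negatives = sample_concept_sets(train_idx, has_concept, size=200)

# Holdout concept set of total size 100 drawn from the test set
# (assumed here to be 50 positives and 50 negatives).
holdout_pos, holdout_neg = sample_concept_sets(test_idx, has_concept, size=50)
```

The index lists would then select latent representations on which a concept classifier (for example the SVC mentioned above) is fit and evaluated; that step is omitted here because it depends on the trained model.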