Concept Activation Regions: A Generalized Framework For Concept-Based Explanations

Authors: Jonathan Crabbé, Mihaela van der Schaar

NeurIPS 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "3 Experiments. The code to reproduce all the experiments from this section is available at https://github.com/JonathanCrabbe/CARs and https://github.com/vanderschaarlab/CARs. 3.1 Empirical Evaluation. Our purpose is to empirically validate the formalism described in the previous section. We have several independent components to evaluate: (1) the concept classifier used to detect the CARs H^c, (2) the global explanations induced by the TCAR values and (3) the feature importance scores induced by the concept densities ρ^c. Datasets. We perform our experiments on 3 datasets."
Researcher Affiliation | Academia | Jonathan Crabbé, University of Cambridge, jc2133@cam.ac.uk; Mihaela van der Schaar, University of Cambridge, The Alan Turing Institute, UCLA, mv472@cam.ac.uk
Pseudocode | Yes | "The implementation of our method closely follows Algorithms 1, 2 and 4 in the appendices."
Open Source Code | Yes | "The code to reproduce all the experiments from this section is available at https://github.com/JonathanCrabbe/CARs and https://github.com/vanderschaarlab/CARs."
Open Datasets | Yes | "Datasets. We perform our experiments on 3 datasets. (1) The MNIST dataset [56]... (2) The MIT-BIH Electrocardiogram (ECG) dataset [57, 58]... (3) The Caltech-UCSD Birds-200 (CUB) dataset [59]... We use the data collected with the Surveillance, Epidemiology, and End Results (SEER) Program. The dataset [69]..."
Dataset Splits | Yes | "We train a multilayer perceptron (MLP) to predict the patient's mortality on 90% of the data and test on the remaining 10%." "The classifier is then evaluated by computing its accuracy on a holdout balanced concept set T^c of size 100 sampled from the model's testing set."
Hardware Specification | No | The paper states "Our computing resources are described in Appendices E and F." These appendices are not included in the provided text, so the main body gives no specific hardware details.
Software Dependencies | No | The paper mentions using Python, Scikit-learn, PyTorch, and Captum, and says that software dependencies are described in Appendices E and F. However, the provided text gives no version numbers for these components. For example, [71] cites "Scikit-learn: Machine learning in Python" and [72] cites "Captum: A unified and generic model interpretability library for PyTorch", neither with a version number for Scikit-learn, PyTorch, or Captum.
Experiment Setup | Yes | "For several of those latent spaces, we fit our CAR classifier (SVC with radial basis function kernel) to discriminate the concept sets P^c, N^c for each concept c ∈ [C]. These two sets have a size N^c = 200 and are sampled from the model's training set." "We train a multilayer perceptron (MLP)..." "We fit a CAR classifier (SVC with linear kernel) to discriminate the concept sets P^c, N^c for each grade c ∈ [5]. Both of those sets contain N^c = 250 patients sampled from the training set."
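
The setup quoted in the Experiment Setup and Dataset Splits rows can be illustrated with a minimal sketch in Python using scikit-learn. It is not the authors' implementation (that is in the repositories linked above). Synthetic Gaussian vectors stand in for the model's latent representations, and the names LATENT_DIM, sample_latents and make_concept_set are hypothetical. Only the classifier type (SVC with radial basis function kernel), the concept set size (200 positives and 200 negatives) and the balanced holdout size (100) are taken from the quoted text.

```python
import numpy as np
from sklearn.svm import SVC

# Assumption: 16-dimensional latent space; the real dimension depends on the model layer.
LATENT_DIM = 16
rng = np.random.default_rng(0)


def sample_latents(n, shift):
    """Hypothetical stand-in for latent representations of n examples."""
    return rng.normal(loc=shift, scale=1.0, size=(n, LATENT_DIM))


def make_concept_set(n_per_class):
    """Build a balanced set: n positives (label 1) and n negatives (label 0)."""
    positives = sample_latents(n_per_class, shift=1.0)
    negatives = sample_latents(n_per_class, shift=-1.0)
    X = np.vstack([positives, negatives])
    y = np.concatenate([
        np.ones(n_per_class, dtype=int),
        np.zeros(n_per_class, dtype=int),
    ])
    return X, y


# Concept sets P^c and N^c of size 200 each, as in the quoted setup.
X_train, y_train = make_concept_set(200)
# Balanced holdout concept set T^c of total size 100 (50 positives, 50 negatives).
X_test, y_test = make_concept_set(50)

# CAR classifier: support vector classifier with an RBF kernel.
car_classifier = SVC(kernel="rbf")
car_classifier.fit(X_train, y_train)

# Accuracy on the holdout concept set.
accuracy = car_classifier.score(X_test, y_test)
print(f"Concept accuracy on holdout set: {accuracy:.2f}")
```

Replacing kernel="rbf" with kernel="linear" and using 250 examples per set would mirror the second configuration quoted for the SEER experiment, again only as a sketch on synthetic data.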