Towards a fuller understanding of neurons with Clustered Compositional Explanations

Authors: Biagio La Rosa, Leilani Gilpin, Roberto Capobianco

NeurIPS 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | In this paper, we propose a generalization, called Clustered Compositional Explanations, that combines Compositional Explanations with clustering and a novel search heuristic to approximate a broader spectrum of the neuron behavior. We define and address the problems connected to the application of these methods to multiple ranges of activations, analyze the insights retrievable by using our algorithm, and propose desiderata qualities that can be used to study the explanations returned by different algorithms. ... Table 1 compares the average number of states visited during the computation of the baselines and our MMESH. ... For the experiments in this section, we follow the same setup of Mu and Andreas [30] with the addition of the set of thresholds used to compute the activation ranges. We fix the length of the explanation to 3, as commonly done in the current literature [28, 27]. For space reasons, in almost all the experiments in this section, we report the results using the last layer of ResNet18 [17] as a base model and ADE20K [44, 46] as a concept dataset.
Researcher Affiliation | Collaboration | Biagio La Rosa, Sapienza University of Rome, Rome, IT 00185, larosa@diag.uniroma1.it; Leilani H. Gilpin, University of California, Santa Cruz, Santa Cruz, CA 95060, lgilpin@ucsc.edu; Roberto Capobianco, Sony AI, Schlieren, CH 8952, roberto.capobianco@sony.com
Pseudocode | No | The paper describes the algorithms (e.g., Clustered Compositional Explanations, MMESH) using mathematical formulas and descriptive text, but it does not include a formally labeled 'Pseudocode' or 'Algorithm' block with structured steps.
Open Source Code | Yes | The code is available at https://github.com/KRLGroup/Clustered-Compositional-Explanations.
Open Datasets | Yes | For the experiments in this section, we follow the same setup of Mu and Andreas [30] with the addition of the set of thresholds used to compute the activation ranges. ... we report the results using the last layer of ResNet18 [17] as a base model and ADE20K [44, 46] as a concept dataset. ... All the models considered in this paper have been pre-trained on the Places365 dataset [45]. ... Annotations for the Pascal [12] and ADE20K [44] datasets are retrieved from the Broden dataset [2]. ... Table 7 and Table 8 compare NetDissect, CoEx, and Clustered Compositional Explanations when ResNet18 [17] and VGG-16 [41] are pretrained on the ImageNet dataset [9].
Dataset Splits | Yes | For the experiments in this section, we follow the same setup of Mu and Andreas [30] with the addition of the set of thresholds used to compute the activation ranges.
Hardware Specification | Yes | Timing collected using a workstation powered by an NVIDIA GeForce RTX-3090 graphics card.
Software Dependencies | No | The paper mentions models like 'ResNet18' and 'DenseNet161', and a clustering algorithm, K-Means, but does not specify versions for any ancillary software dependencies (e.g., Python, PyTorch, TensorFlow, or specific library versions).
Experiment Setup | Yes | For the experiments in this section, we follow the same setup of Mu and Andreas [30] with the addition of the set of thresholds used to compute the activation ranges. We fix the length of the explanation to 3, as commonly done in the current literature [28, 27]. ... We use K-Means as a clustering algorithm and fix the number of clusters to five. ... we use a beam of size 10 only during the first beam, and then we set the beam size to 5 to replicate the configuration of Mu and Andreas [30].
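The clustering step quoted above (K-Means with five clusters over a unit's activations, each cluster yielding an activation range) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the synthetic `acts` array, the quantile initialisation, and the fixed iteration count are all illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)
acts = rng.normal(size=10_000)  # stand-in for one unit's flattened activation values

# Plain 1-D K-Means (Lloyd's iterations) with k=5, matching the paper's
# choice of five clusters; centers start at spread-out quantiles.
k = 5
centers = np.quantile(acts, np.linspace(0.1, 0.9, k))
for _ in range(50):
    labels = np.abs(acts[:, None] - centers[None, :]).argmin(axis=1)
    centers = np.array([
        acts[labels == j].mean() if np.any(labels == j) else centers[j]
        for j in range(k)
    ])

# Each cluster defines an activation range [min, max]; the algorithm then
# searches for a separate compositional explanation within every range,
# rather than using a single activation threshold.
ranges = sorted(
    (acts[labels == j].min(), acts[labels == j].max()) for j in range(k)
)
```

Because the clusters of a 1-D K-Means partition are contiguous intervals, the resulting five ranges tile the activation axis without overlap, which is what lets each range be explained independently.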