An Additive Instance-Wise Approach to Multi-class Model Interpretation

Authors: Vy Vo, Van Nguyen, Trung Le, Quan Hung Tran, Gholamreza Haffari, Seyit Camtepe, Dinh Phung

ICLR 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We also demonstrate the capacity to select stable and important features through extensive experiments on various data sets and black-box model architectures. We conducted experiments on various machine learning classification tasks.
Researcher Affiliation | Collaboration | 1. Monash University, Australia; 2. Adobe Research, USA; 3. CSIRO's Data61, Australia; 4. VinAI Research, Vietnam
Pseudocode | No | The paper does not contain any pseudocode or clearly labeled algorithm blocks.
Open Source Code | Yes | Code and data for reproducing our experiments are published at https://github.com/isVy08/AIM/.
Open Datasets | Yes | Sentiment Analysis: The Large Movie Review Dataset IMDB (Maas et al., 2011) consists of 50,000 movie reviews with positive and negative sentiments. Hate Speech Detection: HateXplain is an annotated dataset of Twitter and Gab posts for hate speech detection (Mathew et al., 2021). Topic Classification: The AG News corpus (Zhang et al., 2015) is constructed by selecting the 4 largest classes from the original dataset. The MNIST and Fashion-MNIST datasets respectively consist of 28×28 gray-scale images of handwritten digits and article clothing images. We experimented with two real-world datasets: Admission (Acharya et al., 2019) and Adult (Kohavi et al., 1996).
Dataset Splits | Yes | Table 6: Dataset statistics and hyperparameters.
  Dataset | Train/Dev/Test | No. of features | α | β
  IMDB | 25000/20000/5000 | 400 | 1.8 | 1e-3
  HateXplain | 15000/4119/1029 | 200 | 0.1 | 1e-3
  AG News | 120000/6080/1520 | 400 | 0.1 | 1e-4
  MNIST | 14000/4623/3147 | 16 | 0.5 | 1e-3
  Fashion-MNIST | 15000/3000/3000 | 16 | 0.5 | 1e-3
Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory specifications) used for running the experiments. While Figure 5 mentions 'a single CPU' in the context of LIME's processing time, this is not a general hardware specification for the authors' experimental setup.
Software Dependencies | No | The paper mentions software such as the NLTK package (Bird, 2006) and Hugging Face (Wolf et al., 2019) but does not provide specific version numbers for these or other key software components (e.g., Python, PyTorch/TensorFlow libraries).
Experiment Setup | Yes | We parametrize E, S and G by three deep neural network functions. The explainer E passes the embedded inputs into three 250-dimensional dense layers and outputs W after applying ReLU non-linearity. The selector S is composed of one bidirectional LSTM of dimension 100 and three dense layers of the same size. Each layer is stacked between a Dropout layer and an activation; the upper layers use ReLU, while Sigmoid is a natural choice for the final one. Regarding the network G, after feeding the inputs into its own embedding layer, we process the outputs through a 250-dimensional convolutional layer with kernel size 3, followed by a max-pooling layer over the sequence length. The last layer is a dense layer of dimension 250 together with Softmax activation. We use the same architecture for all tasks and train our model with the Adam optimizer at τ = 0.2 and a learning rate of 0.001. We tune the coefficients α and β via grid search to achieve an adequate balance of faithfulness and compression: for every dataset, α and β take values in {0.1, 0.5, 1, 1.5, 1.8, 2} and {1e-2, 1e-3, 1e-4} respectively, and the setting that yields the highest Faithfulness is selected. Table 6 details data splits and best hyperparameters used in our experiments for 3 text classification tasks (IMDB / HateXplain / AG News) and 2 image recognition tasks (MNIST / Fashion-MNIST).
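The architecture quoted above can be sketched in PyTorch. This is a hedged reconstruction from the prose only, not the authors' released code: the embedding dimension, vocabulary size, dropout rate, class count, and the per-token sigmoid output of the selector are all illustrative assumptions.

```python
import torch
import torch.nn as nn

class Explainer(nn.Module):
    """E: three 250-dimensional dense layers over embedded inputs,
    outputting W after a ReLU non-linearity (sketch)."""
    def __init__(self, embed_dim=100):  # embed_dim is an assumption
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(embed_dim, 250), nn.ReLU(),
            nn.Linear(250, 250), nn.ReLU(),
            nn.Linear(250, 250), nn.ReLU(),
        )

    def forward(self, x):        # x: (batch, seq_len, embed_dim)
        return self.net(x)       # W: (batch, seq_len, 250)

class Selector(nn.Module):
    """S: one bidirectional LSTM of dimension 100, then three dense
    layers, each stacked between Dropout and an activation; ReLU in
    the upper layers, Sigmoid at the end (sketch)."""
    def __init__(self, embed_dim=100, dropout=0.2):  # both assumed
        super().__init__()
        self.lstm = nn.LSTM(embed_dim, 100, bidirectional=True,
                            batch_first=True)
        self.dense = nn.Sequential(
            nn.Dropout(dropout), nn.Linear(200, 100), nn.ReLU(),
            nn.Dropout(dropout), nn.Linear(100, 100), nn.ReLU(),
            nn.Dropout(dropout), nn.Linear(100, 1), nn.Sigmoid(),
        )

    def forward(self, x):
        h, _ = self.lstm(x)                 # (batch, seq_len, 200)
        return self.dense(h).squeeze(-1)    # per-token scores in [0, 1]

class PredictorG(nn.Module):
    """G: own embedding layer -> 250-d conv (kernel size 3) ->
    max-pooling over sequence length -> 250-d dense -> Softmax."""
    def __init__(self, vocab_size=10000, embed_dim=100, n_classes=2):
        super().__init__()                  # sizes here are assumed
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.conv = nn.Conv1d(embed_dim, 250, kernel_size=3, padding=1)
        self.out = nn.Sequential(nn.Linear(250, 250), nn.ReLU(),
                                 nn.Linear(250, n_classes))

    def forward(self, tokens):              # tokens: (batch, seq_len)
        e = self.embed(tokens).transpose(1, 2)   # (batch, embed, seq)
        h = torch.relu(self.conv(e)).max(dim=2).values  # pool over seq
        return torch.softmax(self.out(h), dim=-1)
```

All three modules would then be trained jointly with `torch.optim.Adam(params, lr=0.001)`, matching the learning rate stated in the setup.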
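The hyperparameter tuning described in the setup — a grid search over α ∈ {0.1, 0.5, 1, 1.5, 1.8, 2} and β ∈ {1e-2, 1e-3, 1e-4}, keeping the setting with the highest Faithfulness — can be sketched as below; `train_and_eval` is a hypothetical stand-in for training the model at a given (α, β) and returning its Faithfulness score.

```python
import itertools

def grid_search(train_and_eval,
                alphas=(0.1, 0.5, 1, 1.5, 1.8, 2),
                betas=(1e-2, 1e-3, 1e-4)):
    """Exhaustively try every (alpha, beta) pair and return the one
    with the highest score. `train_and_eval(alpha, beta) -> float`
    is assumed to return the Faithfulness metric for that setting."""
    best_score, best_cfg = float("-inf"), None
    for alpha, beta in itertools.product(alphas, betas):
        score = train_and_eval(alpha, beta)
        if score > best_score:
            best_score, best_cfg = score, (alpha, beta)
    return best_cfg, best_score
```

With 6 × 3 = 18 candidate settings per dataset, this exhaustive search stays cheap relative to model training itself.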