Evaluation of Similarity-based Explanations

Authors: Kazuaki Hanawa, Sho Yokoi, Satoshi Hara, Kentaro Inui

Venue: ICLR 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our experiments revealed that the cosine similarity of the gradients of the loss performs best, which would be a recommended choice in practice. ... For this evaluation, we used two image datasets (MNIST (LeCun et al., 1998), CIFAR10 (Krizhevsky, 2009)), two text datasets (TREC (Li & Roth, 2002), AGNews (Zhang et al., 2015)) and two tabular datasets (Vehicle (Dua & Graff, 2017), Segment (Dua & Graff, 2017)).
Researcher Affiliation | Academia | Kazuaki Hanawa (RIKEN Center for Advanced Intelligence Project, Tohoku University), Sho Yokoi (Tohoku University, RIKEN AIP), Satoshi Hara (Osaka University), Kentaro Inui (Tohoku University, RIKEN AIP)
Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks; procedures are described in narrative text.
Open Source Code | Yes | Our implementation is available at https://github.com/k-hanawa/criteria_for_instance_based_explanation
Open Datasets | Yes | For this evaluation, we used two image datasets (MNIST (LeCun et al., 1998), CIFAR10 (Krizhevsky, 2009)), two text datasets (TREC (Li & Roth, 2002), AGNews (Zhang et al., 2015)) and two tabular datasets (Vehicle (Dua & Graff, 2017), Segment (Dua & Graff, 2017)).
Dataset Splits | No | The paper mentions training on a subset of training instances and then sampling test instances ('randomly sample 500 test instances from the test set'), but it does not explicitly describe a separate validation split for hyperparameter tuning or model selection.
Hardware Specification | Yes | In our experiments, training of the models was run on an NVIDIA GTX 1080 GPU with an Intel Xeon Silver 4112 CPU and 64GB RAM. Testing and computing relevance metrics were run on a Xeon E5-2680 v2 CPU with 256GB RAM.
Software Dependencies | No | The paper mentions using the Adam optimizer and specific model types (CNN, Bi-LSTM, logistic regression), but it does not specify software components with version numbers (e.g., Python, PyTorch, TensorFlow, or CUDA versions) required to reproduce the experiments.
Experiment Setup | Yes | We trained the models using the Adam optimizer with a learning rate of 0.001.
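
The Experiment Setup row reports only the optimizer and learning rate. The snippet below is a minimal sketch of that reported configuration (Adam, lr = 0.001) in PyTorch; the tiny linear model and random data are placeholders for illustration and are not the paper's CNN, Bi-LSTM, or logistic regression models or its datasets.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Placeholder data: 64 examples, 20 features, 6 classes (e.g., TREC has 6 classes).
X = torch.randn(64, 20)
y = torch.randint(0, 6, (64,))
loader = DataLoader(TensorDataset(X, y), batch_size=16, shuffle=True)

model = torch.nn.Linear(20, 6)  # stand-in for the paper's actual models
optimizer = torch.optim.Adam(model.parameters(), lr=0.001)  # reported setup: Adam, lr = 0.001
loss_fn = torch.nn.CrossEntropyLoss()

for epoch in range(5):
    for xb, yb in loader:
        optimizer.zero_grad()
        loss_fn(model(xb), yb).backward()
        optimizer.step()
```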
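The finding quoted in the Research Type row recommends the cosine similarity of loss gradients as the relevance metric: a training instance is considered relevant to a test prediction when its loss gradient points in a similar direction. The sketch below illustrates that metric only; it is not the authors' released implementation, it assumes a PyTorch classifier and an iterable of (input, label) training pairs, and the function names are illustrative.

```python
import torch
import torch.nn.functional as F

def loss_gradient(model, loss_fn, x, y):
    """Flattened gradient of the loss at a single (x, y) pair w.r.t. all model parameters."""
    model.zero_grad()
    loss = loss_fn(model(x.unsqueeze(0)), y.unsqueeze(0))
    grads = torch.autograd.grad(loss, [p for p in model.parameters() if p.requires_grad])
    return torch.cat([g.reshape(-1) for g in grads])

def gradient_cosine_scores(model, loss_fn, x_test, y_test, train_pairs):
    """Score each training instance by cos(grad_test, grad_train); higher means more relevant."""
    g_test = loss_gradient(model, loss_fn, x_test, y_test)
    return [
        F.cosine_similarity(g_test, loss_gradient(model, loss_fn, x_tr, y_tr), dim=0).item()
        for x_tr, y_tr in train_pairs
    ]
```

Ranking the training set by these scores and returning the top instances yields the similarity-based explanation that the quoted finding recommends.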