reproducibilityindex.ai

Evaluation of Similarity-based Explanations

Authors: Kazuaki Hanawa, Sho Yokoi, Satoshi Hara, Kentaro Inui

ICLR 2021

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Our experiments revealed that the cosine similarity of the gradients of the loss performs best, which would be a recommended choice in practice. ... For this evaluation, we used two image datasets (MNIST (Le Cun et al., 1998), CIFAR10 (Krizhevsky, 2009)), two text datasets (TREC (Li & Roth, 2002), AGNews (Zhang et al., 2015)) and two table datasets (Vehicle (Dua & Graff, 2017), Segment (Dua & Graff, 2017)).
Researcher Affiliation	Academia	Kazuaki Hanawa (1, 2), Sho Yokoi (2, 1), Satoshi Hara (3), Kentaro Inui (2, 1). Affiliations: (1) RIKEN Center for Advanced Intelligence Project, (2) Tohoku University, (3) Osaka University
Pseudocode	No	The paper does not contain structured pseudocode or algorithm blocks. Procedures are described in narrative text.
Open Source Code	Yes	Our implementation is available at https://github.com/k-hanawa/criteria_for_instance_based_explanation
Open Datasets	Yes	For this evaluation, we used two image datasets (MNIST (Le Cun et al., 1998), CIFAR10 (Krizhevsky, 2009)), two text datasets (TREC (Li & Roth, 2002), AGNews (Zhang et al., 2015)) and two table datasets (Vehicle (Dua & Graff, 2017), Segment (Dua & Graff, 2017)).
Dataset Splits	No	The paper mentions training on a subset of training instances and then sampling test instances ('randomly sample 500 test instances from the test set'), but does not explicitly describe a separate validation set split for hyperparameter tuning or model selection.
Hardware Specification	Yes	In our experiments, training of the models was run on an NVIDIA GTX 1080 GPU with Intel Xeon Silver 4112 CPU and 64GB RAM. Testing and computing relevance metrics were run on Xeon E5-2680 v2 CPU with 256GB RAM.
Software Dependencies	No	The paper mentions using the 'Adam optimizer' and specific types of models (CNN, Bi-LSTM, logistic regression), but it does not specify software components with version numbers (e.g., Python, PyTorch, TensorFlow, or CUDA versions) required to reproduce the experiments.
Experiment Setup	Yes	We trained the models using the Adam optimizer with a learning rate of 0.001.
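
The Research Type row reports that the cosine similarity of the gradients of the loss performed best among the relevance metrics. The sketch below illustrates what such a metric could look like for a softmax (multinomial logistic regression) classifier, written with NumPy only. It is not taken from the authors' repository: the function names (softmax, loss_gradient, grad_cos), the bias-free model, and the choice of which label to pair with the test instance are all assumptions made for illustration.

```python
import numpy as np


def softmax(z):
    """Numerically stable softmax over a 1-D array of logits."""
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()


def loss_gradient(W, x, y):
    """Gradient of the cross-entropy loss w.r.t. W for one instance (x, y).

    Assumed model (hypothetical, for illustration): probabilities are
    softmax(W @ x), with W of shape (num_classes, num_features) and no bias.
    """
    p = softmax(W @ x)
    p[y] -= 1.0                    # dL/dlogits = p - onehot(y)
    return np.outer(p, x).ravel()  # dL/dW, flattened to a vector


def grad_cos(W, x_test, y_test, x_train, y_train):
    """Cosine similarity between the loss gradients of two instances."""
    g_test = loss_gradient(W, x_test, y_test)
    g_train = loss_gradient(W, x_train, y_train)
    denom = np.linalg.norm(g_test) * np.linalg.norm(g_train)
    # Guard against a zero gradient, where the cosine is undefined.
    return float(g_test @ g_train / denom) if denom > 0 else 0.0
```

A similarity-based explanation would then, under these assumptions, rank the training instances by grad_cos against a given test instance and present the highest-scoring ones. For deeper models such as the CNN or Bi-LSTM mentioned above, the same idea would apply to gradients taken over all trainable parameters, typically obtained through automatic differentiation.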