Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).

Advancing Post-Hoc Case-Based Explanation with Feature Highlighting

Authors: Eoin M. Kenny, Eoin Delaney, Mark T. Keane

IJCAI 2023 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Results demonstrate that the proposed approach appropriately calibrates a user's feelings of correctness for ambiguous classifications in real world data on the ImageNet dataset, an effect which does not happen when just showing the explanation without feature highlighting.
Researcher Affiliation	Academia	Eoin M. Kenny (1), Eoin Delaney (2,4) and Mark T. Keane (2,3,4). (1) CSAIL, Massachusetts Institute of Technology; (2) University College Dublin; (3) Insight Centre for Data Analytics; (4) VistaMilk SFI Research Centre
Pseudocode	Yes	Algorithm 1 Latent-Based Require: f(.); CNN to-be-explained Require: I; Test Image Require: D; Training Dataset Require: m(.); Activation map algorithm (e.g., FAM) ... Algorithm 2 Superpixel-Based Require: f(.); ANN to-be-explained Require: I; Test Image Require: D; Training Dataset Require: S(.); Superpixel Algorithm ...
Open Source Code	Yes	Code available at https://github.com/EoinKenny/IJCAI-2023
Open Datasets	Yes	Two datasets CUB-200 [Welinder et al., 2010] and ImageNet [Deng et al., 2009] were used
Dataset Splits	Yes	Tests used the first 500 validation images. ... The 24 misclassifications were randomly divided into two material sets (A-set and B-set) to counterbalance the experiment;
Hardware Specification	Yes	Gathering the data for Fig. 3 took two months on two Nvidia v100 GPUs.
Software Dependencies	No	The paper mentions software like 'ResNet34', 'ResNet50', 'LIME', 'CAM', 'FAMs' but does not provide specific version numbers for these or other software dependencies required for reproduction.
Experiment Setup	Yes	The purpose of this experiment is to isolate the best hyperparameter values for α and β in equations 3 and 4, respectively. ... For each hyperparameter value, the networks were finetuned for 2500 iterations and test-accuracy sampled every 50... For hyperparameters, latent-based CAM should use α=5... Superpixel segmentation of 30 and β= in superpixels is recommended as it generalizes best.
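
As an illustration of the Dataset Splits and Experiment Setup rows above, the following is a minimal sketch, not the authors' code. It shows one way to randomly divide 24 items into two equal material sets (A-set and B-set) and to list the iterations at which test accuracy would be sampled during 2500 fine-tuning iterations. The function name `split_counterbalanced`, the image identifiers, the seed, the equal 12/12 split, and the choice to sample at iteration 50 rather than iteration 0 are all assumptions made for this example.

```python
import random


def split_counterbalanced(items, seed=0):
    """Randomly divide items into two halves (hypothetical helper)."""
    shuffled = list(items)
    # A dedicated, seeded generator keeps the split repeatable.
    random.Random(seed).shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


# Placeholder identifiers standing in for the 24 misclassified images.
misclassified = [f"img_{i:02d}" for i in range(24)]
a_set, b_set = split_counterbalanced(misclassified, seed=0)

# Assumed sampling schedule: every 50 iterations up to and including 2500.
eval_steps = list(range(50, 2501, 50))

print(len(a_set), len(b_set), len(eval_steps))
```

Fixing the seed makes the assignment of items to the two sets reproducible across runs, which is the property a reproducibility check would look for in a released split.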