Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Re-calibrating Feature Attributions for Model Interpretation
Authors: Peiyu Yang, Naveed Akhtar, Zeyi Wen, Mubarak Shah, Ajmal Saeed Mian
ICLR 2023 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In our experiments, quantitative evaluation is performed with pixel perturbation (Samek et al., 2016) and DiffROAR (Shah et al., 2021) on the ImageNet-2012 validation set (Russakovsky et al., 2015), CIFAR-100 and CIFAR-10 (Krizhevsky et al., 2009). |
| Researcher Affiliation | Academia | Peiyu Yang1, Naveed Akhtar1, Zeyi Wen2,3, Mubarak Shah4, and Ajmal Mian1 1The University of Western Australia 2Hong Kong University of Science and Technology (Guangzhou) 3Hong Kong University of Science and Technology 4University of Central Florida |
| Pseudocode | Yes | Algorithm 1: Attribution Re-Calibration |
| Open Source Code | Yes | Our code is available at https://github.com/ypeiyu/attribution_recalibration |
| Open Datasets | Yes | ImageNet-2012 validation set (Russakovsky et al., 2015), CIFAR-100 and CIFAR-10 (Krizhevsky et al., 2009). |
| Dataset Splits | Yes | ImageNet-2012 training dataset. |
| Hardware Specification | Yes | All the experiments were conducted on a Linux machine with an NVIDIA RTX 3090Ti GPU with 24GB memory and a 16-core 3.9GHz Intel Core i9-12900K CPU and 125GB main memory. |
| Software Dependencies | Yes | All attribution methods are tested and trained on the PyTorch deep learning framework (v1.12.1) with the Python language. |
| Experiment Setup | Yes | For the experiments on the ImageNet-2012 dataset (Russakovsky et al., 2015), we select 10 references and 5 interpolations (k=5) for IG-Uniform, IG-SG and IG-SQ. Besides, we chose 50 references and one random interpolation (k=1) for EG, and set 10 interpolations (k=10) and 5 class-specific adversarial references for AGI as the recommended settings in these papers. For a fair comparison, we ensure all these methods use the same number of 50 backpropagations. Considering the small input size of the CIFAR-10 and CIFAR-100 datasets (Krizhevsky et al., 2009), we employed 30 backpropagations for all the baseline attribution methods. ...Then, the PreActResNet-18 is fine-tuned for a total of 10 epochs with the initial learning rate 10⁻² decayed by 10 on the 5th and 7th epochs. |
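The learning-rate schedule quoted above (10 epochs, initial rate 10⁻², decayed by a factor of 10 at the 5th and 7th epochs) can be sketched as a small step-decay function. This is a minimal illustration of the described schedule, not the authors' code; the function name and defaults are assumptions, and in PyTorch the equivalent is `torch.optim.lr_scheduler.MultiStepLR(optimizer, milestones=[5, 7], gamma=0.1)`.

```python
def lr_at_epoch(epoch, base_lr=1e-2, milestones=(5, 7), gamma=0.1):
    """Step-decay learning rate: multiply base_lr by gamma once for
    each milestone epoch that has been reached.

    Matches the quoted setup: 1e-2 for epochs 0-4, 1e-3 for epochs
    5-6, and 1e-4 from epoch 7 onward.
    """
    decays = sum(1 for m in milestones if epoch >= m)
    return base_lr * (gamma ** decays)
```

For example, `lr_at_epoch(4)` stays at the initial rate, while `lr_at_epoch(5)` and `lr_at_epoch(7)` reflect the first and second decay steps.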