Towards More Robust Interpretation via Local Gradient Alignment
Authors: Sunghwan Joo, SeokHyeon Jeong, Juyeon Heo, Adrian Weller, Taesup Moon
AAAI 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | As a result, we experimentally show that models trained with our method produce much more robust interpretations on CIFAR-10 and ImageNet-100 without significantly hurting the accuracy, compared to the recent baselines. |
| Researcher Affiliation | Academia | 1) Department of ECE, Sungkyunkwan University; 2) Department of ECE, Seoul National University; 3) ASRI/INMC/IPAI/AIIS, Seoul National University; 4) University of Cambridge; 5) The Alan Turing Institute |
| Pseudocode | No | The paper does not include any pseudocode or algorithm blocks. |
| Open Source Code | Yes | Our code is available at https://github.com/joshua840/RobustAGA. |
| Open Datasets | Yes | We used the CIFAR10 (Krizhevsky and Hinton 2009) and ImageNet100 (Shekhar 2021; Russakovsky et al. 2015) datasets. |
| Dataset Splits | Yes | The ImageNet100 dataset is a subset of the ImageNet-1k dataset with 100 of the 1K labels selected. The train and test datasets contain 1.3K and 50 images per class, respectively. |
| Hardware Specification | No | The paper mentions 'memory and time requirements' in Table 2 but does not specify any particular hardware models (e.g., GPU, CPU models, or memory amounts) used for the experiments. |
| Software Dependencies | No | The paper mentions using common libraries or frameworks implicitly (e.g., 'ResNet18'), but it does not specify any software names with version numbers required to replicate the experiment. |
| Experiment Setup | Yes | We replaced all ReLU activations with Softplus(β = 3) as given by (Dombrowski et al. 2019). We set ϵ as 4, 8, and 16. We performed Monte-Carlo sampling 10 times for each data tuple to approximate the expectation over δx. For AAM in our experiments, we used PGD-ℓ∞ (iter = 100) where ϵ = 2/255 for ResNet18 and ϵ = 4/255 for LeNet. We sampled δx only once per iteration. Before calculating Γ^ℓ2_gy(x, δx) and Γ^cos_gy(x, δx) in (5), we treat gy(x) as constant to prevent the gradient flows during back-propagation. |
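
To make the quoted setup concrete, below is a minimal PyTorch sketch of its main ingredients: swapping ReLU for Softplus(β = 3), drawing one noise sample δx per iteration, and detaching gy(x) so no gradient flows through it. This is not the authors' RobustAGA implementation; the helper names (`replace_relu_with_softplus`, `input_gradient`, `cosine_alignment_loss`) and the uniform noise distribution with ϵ scaled by 1/255 are assumptions for illustration.

```python
# Hedged sketch of the experiment-setup details quoted above (not the
# official RobustAGA code). Helper names and the noise distribution
# are assumptions; only the Softplus swap, single delta_x sample, and
# stop-gradient on g_y(x) come from the quoted setup.
import torch
import torch.nn as nn
import torch.nn.functional as F
import torchvision


def replace_relu_with_softplus(module: nn.Module, beta: float = 3.0) -> None:
    """Recursively swap every nn.ReLU for Softplus(beta), per the quoted setup."""
    for name, child in module.named_children():
        if isinstance(child, nn.ReLU):
            setattr(module, name, nn.Softplus(beta=beta))
        else:
            replace_relu_with_softplus(child, beta)


def input_gradient(model: nn.Module, x: torch.Tensor, y: torch.Tensor) -> torch.Tensor:
    """g_y(x): gradient of the class-y logit w.r.t. the input image."""
    x = x.clone().requires_grad_(True)
    score = model(x).gather(1, y.unsqueeze(1)).sum()
    # create_graph=True so the training loss can backprop through this gradient
    return torch.autograd.grad(score, x, create_graph=True)[0]


def cosine_alignment_loss(model: nn.Module, x: torch.Tensor, y: torch.Tensor,
                          eps: float = 8 / 255) -> torch.Tensor:
    """One Monte-Carlo sample of a cosine alignment term between g_y(x) and
    g_y(x + delta_x). g_y(x) is detached, mirroring the quoted detail that
    it is treated as constant during back-propagation."""
    delta = torch.empty_like(x).uniform_(-eps, eps)  # assumed noise distribution
    g_clean = input_gradient(model, x, y).detach()   # stop-gradient on g_y(x)
    g_noisy = input_gradient(model, x + delta, y)
    cos = F.cosine_similarity(g_noisy.flatten(1), g_clean.flatten(1), dim=1)
    return (1.0 - cos).mean()


# Example usage on a CIFAR-10-sized batch.
model = torchvision.models.resnet18(num_classes=10)
replace_relu_with_softplus(model, beta=3.0)
x = torch.randn(4, 3, 32, 32)
y = torch.randint(0, 10, (4,))
loss = F.cross_entropy(model(x), y) + cosine_alignment_loss(model, x, y)
loss.backward()
```

Detaching `g_clean` is what implements the quoted "treat gy(x) as constant" step: the alignment penalty then only pushes the noisy gradient toward the clean one, rather than moving both.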