SPADE: Sparsity-Guided Debugging for Deep Neural Networks
Authors: Arshia Soltani Moakhar, Eugenia Iofinova, Elias Frantar, Dan Alistarh
ICML 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this section, we experimentally validate the impact of SPADE on the usefulness and fidelity of network interpretations. |
| Researcher Affiliation | Collaboration | Institute of Science and Technology Austria (ISTA); Neural Magic. |
| Pseudocode | Yes | Algorithm 1 (SPADE Procedure): SPADE(M, s, I), where M is the model, s the sample, and I the interpretability method. B ← ∅ {batch of augmented samples}; for Augmentation Batch Size do: append a random augmentation of s to B; end for. For each layer i in M do: X_i ← LayerInput_i(B), Y_i ← LayerOutput_i(B); end for. For each layer i in M do: W′_i ← argmin_{W sparse} ‖W X_i − Y_i‖²₂; W_i ← W′_i {replace weights with sparse ones}; end for. Return I(M, s) {interpretability method on M, s}. (A Python sketch of this procedure appears below the table.) |
| Open Source Code | Yes | Our code is available at https://github.com/IST-DASLab/SPADE. |
| Open Datasets | Yes | We concentrate primarily on the ImageNet-1K (Deng et al., 2009) dataset, with additional validations performed on the CelebA (Liu et al., 2015) and Food101 (Bossard et al., 2014) datasets. |
| Dataset Splits | No | The paper mentions training, validation, and test data in the context of experiments, for example, stating 'ImageNet validation set' (Table 10) or '1.2 million training examples' (Section 4.1). However, it does not provide explicit percentages or sample counts for the overall train/validation/test splits that would allow full reproduction of the data partitioning. |
| Hardware Specification | Yes | Timings computed on an NVIDIA GeForce GPU with 25 GiB RAM. |
| Software Dependencies | No | The paper mentions using 'the Captum library (Kokhlikyan et al., 2020) for saliency method implementations, except for LRP, for which we use (Nam et al., 2019).' However, it does not specify version numbers for these or other key software components, which is required for a reproducible description. |
| Experiment Setup | Yes | We use momentum of 0.9, a step-LR learning-rate scheduler with a gamma of 0.1, and a weight decay of 0.0001 for all backdoorings. The initial learning rate is chosen from {0.01, 0.001, 0.0001, 0.00001} based on accuracy on Trojan samples at the end of training. The chosen hyperparameters, along with the other hyperparameters for training the models, are presented in Table 15. (A configuration sketch appears below the table.) |
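The pseudocode row above summarizes Algorithm 1. As a rough illustration, the following Python/PyTorch sketch mirrors its structure; it is not the paper's implementation (that is available at the repository linked above). In particular, the layer-wise sparse regression is replaced by a simple magnitude-pruning stand-in, and `augment`, `saliency_method`, `batch_size`, and `sparsity` are assumed placeholders.

```python
# A minimal sketch of Algorithm 1 (the SPADE procedure), assuming PyTorch. The paper
# replaces each layer's weights with the solution of a sparse regression
# argmin_{W sparse} ||W X_i - Y_i||_2^2 on an augmented batch; the magnitude-pruning
# step below is only a stand-in for that solver, and `augment`, `saliency_method`,
# `batch_size`, and `sparsity` are illustrative placeholders.
import copy
import torch
import torch.nn as nn

def spade(model, sample, saliency_method, augment, batch_size=128, sparsity=0.9):
    model = copy.deepcopy(model)  # leave the original network untouched
    model.eval()

    # Build a batch of random augmentations of the single input sample s.
    batch = torch.stack([augment(sample) for _ in range(batch_size)])

    # Record each prunable layer's inputs X_i and outputs Y_i on the augmented batch.
    records, hooks = {}, []
    for name, layer in model.named_modules():
        if isinstance(layer, (nn.Linear, nn.Conv2d)):
            def hook(mod, inp, out, name=name):
                records[name] = (inp[0].detach(), out.detach())
            hooks.append(layer.register_forward_hook(hook))
    with torch.no_grad():
        model(batch)
    for h in hooks:
        h.remove()

    # Sparsify each recorded layer. The paper solves the layer-wise regression on
    # (X_i, Y_i); this stand-in ignores them and simply keeps the largest-magnitude
    # weights, preserving only the overall structure of the procedure.
    for name, layer in model.named_modules():
        if name in records:
            w = layer.weight.data
            k = max(1, int(w.numel() * sparsity))
            threshold = w.abs().flatten().kthvalue(k).values
            layer.weight.data = torch.where(w.abs() > threshold, w, torch.zeros_like(w))

    # Finally, run the chosen interpretability method on the sparsified model.
    return saliency_method(model, sample.unsqueeze(0))
```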
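The experiment-setup row quotes the backdoor-training hyperparameters. A minimal PyTorch configuration consistent with that description might look as follows; the scheduler `step_size` and the use of SGD are assumptions, since the paper defers the remaining hyperparameters to its Table 15.

```python
# A minimal sketch of the reported backdoor-training setup, assuming PyTorch.
# Momentum, weight decay, the step-LR gamma, and the learning-rate grid come from
# the quoted text; step_size and the optimizer choice are assumed placeholders
# (the paper's remaining hyperparameters are listed in its Table 15).
import torch

LR_GRID = [0.01, 0.001, 0.0001, 0.00001]  # initial LR chosen by Trojan-sample accuracy

def make_optimizer(model, lr, step_size=30):
    optimizer = torch.optim.SGD(
        model.parameters(),
        lr=lr,                # one of LR_GRID
        momentum=0.9,         # momentum reported in the paper
        weight_decay=0.0001,  # weight decay reported in the paper
    )
    # Step-LR scheduler with gamma = 0.1, as described in the quote.
    scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=step_size, gamma=0.1)
    return optimizer, scheduler
```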