Fast Axiomatic Attribution for Neural Networks
Authors: Robin Hesse, Simone Schaub-Meyer, Stefan Roth
NeurIPS 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Various experiments demonstrate the advantages of X-DNNs, which beat state-of-the-art generic attribution methods applied to regular DNNs when training with attribution priors (see the X-Gradient sketch after the table). |
| Researcher Affiliation | Academia | Robin Hesse¹, Simone Schaub-Meyer¹, Stefan Roth¹,² (¹Department of Computer Science, TU Darmstadt; ²hessian.AI) |
| Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | Code and additional resources at https://visinf.github.io/fast-axiomatic-attribution/. |
| Open Datasets | Yes | For our experiments on models for image classification... we use the ImageNet [24] dataset, containing about 1.2 million images of 1000 different categories. [...] To that end, we employ the public NHANES I survey data [17] of the CDC of the United States, containing 118 one-hot encoded medical attributes, e.g., age, sex, and vital sign measurements, from 13,000 human subjects. |
| Dataset Splits | Yes | We train on the training split and report numbers for the validation split. [...] we randomly subsample 200 training and validation datasets containing 100 data points from the original dataset. |
| Hardware Specification | No | The paper mentions 'Using a single GPU' but does not specify the model or other detailed hardware specifications. |
| Software Dependencies | No | The paper does not provide specific software dependencies with version numbers (e.g., Python, PyTorch, TensorFlow versions or library versions). |
| Experiment Setup | Yes | For our experiments on models for image classification... we use the ImageNet [24] dataset [...]. If not indicated otherwise, we assume numerical convergence for Integrated Gradients and Expected Gradients, which we found to occur after 128 approximation steps (see supplemental material). [...] A simple MLP with ReLU activations is used as the model. [...] we randomly subsample 200 training and validation datasets containing 100 data points from the original dataset. (Sketches of the X-Gradient attribution and of the step-wise approximation follow the table.) |
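
The core idea behind X-DNNs can be illustrated compactly. Below is a minimal sketch, not the authors' released code (that lives at the URL cited above): it assumes a bias-free ReLU MLP like the one quoted in the table (118 inputs matching the NHANES I attribute count; the hidden and output sizes and all tensor names are illustrative). For such an X-DNN, the single-pass X-Gradient attribution, input times gradient, satisfies completeness with respect to a zero baseline.

```python
import torch
import torch.nn as nn

# Bias-free ReLU MLP (an "X-DNN"): removing bias terms makes the network
# positively homogeneous of degree 1. Sizes are illustrative; 118 matches
# the NHANES I feature count quoted in the table.
xdnn = nn.Sequential(
    nn.Linear(118, 64, bias=False),
    nn.ReLU(),
    nn.Linear(64, 1, bias=False),
)

def x_gradient(model, x, target=0):
    """X-Gradient attribution: a single forward/backward pass."""
    x = x.clone().requires_grad_(True)
    out = model(x)[..., target].sum()
    grad, = torch.autograd.grad(out, x)
    return x.detach() * grad  # elementwise input * gradient

x = torch.randn(4, 118)
attr = x_gradient(xdnn, x)
# Completeness check for a zero baseline: attributions sum to the output.
print(torch.allclose(attr.sum(dim=1), xdnn(x).squeeze(1), atol=1e-5))
```

The check in the last line works because a bias-free ReLU network is positively homogeneous of degree 1, so Euler's theorem gives f(x) = x · ∇f(x).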
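
For comparison, the "Experiment Setup" row states that Integrated Gradients is taken to be numerically converged after 128 approximation steps. A minimal sketch of that step-wise Riemann-sum approximation, again with illustrative names and a zero baseline assumed:

```python
import torch
import torch.nn as nn

def integrated_gradients(model, x, baseline=None, steps=128, target=0):
    """Approximate IG along the straight path baseline -> x with `steps` points."""
    if baseline is None:
        baseline = torch.zeros_like(x)  # zero baseline, as assumed above
    total = torch.zeros_like(x)
    for k in range(1, steps + 1):
        alpha = k / steps  # right Riemann sum over the path
        point = (baseline + alpha * (x - baseline)).detach().requires_grad_(True)
        out = model(point)[..., target].sum()
        grad, = torch.autograd.grad(out, point)
        total += grad
    return (x - baseline) * total / steps

# Demo on a tiny bias-free ReLU net (sizes illustrative).
net = nn.Sequential(nn.Linear(8, 4, bias=False), nn.ReLU(), nn.Linear(4, 1, bias=False))
x = torch.randn(2, 8)
ig = integrated_gradients(net, x, steps=128)
```

On an X-DNN with a zero baseline, this 128-pass loop and the single-pass `x_gradient` above agree up to numerical error, which is the efficiency argument of the paper.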