SmoothHess: ReLU Network Feature Interactions via Stein's Lemma
Authors: Max Torop, Aria Masoomi, Davin Hill, Kivanc Kose, Stratis Ioannidis, Jennifer Dy
NeurIPS 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We empirically validate the superior flexibility of SmoothHess to capture interactions on MNIST, FMNIST, and CIFAR10. We utilize SmoothHess to derive insights into a network trained on a real-world medical spirometry dataset. |
| Researcher Affiliation | Collaboration | Northeastern University; Memorial Sloan Kettering Cancer Center. {torop.m, masoomi.a}@northeastern.edu, {dhill, ioannidis, jdy}@ece.neu.edu, {kosek}@mskcc.org |
| Pseudocode | Yes | Algorithm 1: Joint SmoothHess and SmoothGrad Estimation (see the estimator sketch after this table) |
| Open Source Code | Yes | Our code is publicly available. https://github.com/MaxTorop/SmoothHess |
| Open Datasets | Yes | Experiments were conducted on a real-world spirometry regression dataset, three image datasets (MNIST [45], FMNIST [90] and CIFAR10 [43]), and one synthetic dataset (Four Quadrant). |
| Dataset Splits | Yes | We further split the train set into 50,000 images for training and 10,000 for validation. [...] We further split the train set into 40,000 images for training and 10,000 for validation. (see the split sketch after this table) |
| Hardware Specification | Yes | Experiments were performed on an internal cluster using NVIDIA A100 GPUs and AMD EPYC 7302 16-Core processors. |
| Software Dependencies | No | The paper cites TensorFlow [1] and JAX [13] as background and PyTorch [62] for automatic differentiation, but it does not specify version numbers for these or any other libraries/solvers. |
| Experiment Setup | Yes | Training lasted for 40,000 iterations with a batch size of 128 and a starting learning rate of 1e-3, which was decayed by a factor of 1e-1 at iterations 5,000, 10,000 and 20,000. (see the training-loop sketch after this table) |
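For reference, Algorithm 1 is a Monte Carlo estimator: by Stein's Lemma, the gradient and Hessian of the Gaussian-smoothed network can both be estimated from the same set of perturbed gradient samples, which is the point of the joint algorithm. Below is a minimal PyTorch sketch under the assumption of an isotropic Gaussian N(0, σ²I) and a scalar-output `model`; the function name, batching, and default sample counts are illustrative choices, not the authors' released code.

```python
import torch

def smooth_grad_and_hess(model, x, sigma=0.1, n_samples=1024, batch=128):
    """Joint SmoothGrad / SmoothHess estimation via Stein's Lemma.

    For q = N(0, sigma^2 I):
      SmoothGrad(x) ~= E[grad f(x + d)]
      SmoothHess(x) ~= E[d grad f(x + d)^T] / sigma^2, symmetrized.
    """
    x = x.flatten()
    dim = x.numel()
    grad_sum = torch.zeros(dim, device=x.device)
    hess_sum = torch.zeros(dim, dim, device=x.device)
    done = 0
    while done < n_samples:
        b = min(batch, n_samples - done)
        delta = sigma * torch.randn(b, dim, device=x.device)
        xp = (x.unsqueeze(0) + delta).requires_grad_(True)
        out = model(xp).sum()  # scalar: one backward pass yields all b gradients
        (g,) = torch.autograd.grad(out, xp)
        grad_sum += g.sum(dim=0)
        hess_sum += delta.T @ g / sigma**2  # running sum of Stein outer products
        done += b
    smooth_grad = grad_sum / n_samples
    h = hess_sum / n_samples
    return smooth_grad, 0.5 * (h + h.T)  # symmetrize the Hessian estimate
```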
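The quoted splits (60,000 → 50,000/10,000 for MNIST and FMNIST; 50,000 → 40,000/10,000 for CIFAR10) can be reproduced with `torch.utils.data.random_split`. A sketch for MNIST follows; the fixed seed is my assumption, since the quoted text does not describe how the validation images were selected.

```python
import torch
from torchvision import datasets, transforms

full = datasets.MNIST("data", train=True, download=True,
                      transform=transforms.ToTensor())
# 60,000 official train images -> 50,000 train / 10,000 validation
gen = torch.Generator().manual_seed(0)  # assumed seed; not from the paper
train_set, val_set = torch.utils.data.random_split(
    full, [50_000, 10_000], generator=gen)
```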
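The quoted experiment setup is a multi-step learning-rate decay. A minimal PyTorch sketch of that schedule follows, assuming the Adam optimizer and placeholder `model`, `loader` (batch size 128), and `criterion`, none of which are named in the row above.

```python
import torch
import torch.nn as nn

def train(model: nn.Module, loader, criterion) -> None:
    # 40,000 iterations at batch size 128; lr starts at 1e-3 and is
    # multiplied by 0.1 at iterations 5,000, 10,000 and 20,000.
    # Adam is an assumption; the quoted setup does not name the optimizer.
    opt = torch.optim.Adam(model.parameters(), lr=1e-3)
    sched = torch.optim.lr_scheduler.MultiStepLR(
        opt, milestones=[5_000, 10_000, 20_000], gamma=0.1)
    it = 0
    while it < 40_000:
        for xb, yb in loader:
            if it == 40_000:
                break
            opt.zero_grad()
            loss = criterion(model(xb), yb)
            loss.backward()
            opt.step()
            sched.step()  # decay counted per iteration, not per epoch
            it += 1
```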