SmoothHess: ReLU Network Feature Interactions via Stein's Lemma

Authors: Max Torop, Aria Masoomi, Davin Hill, Kivanc Kose, Stratis Ioannidis, Jennifer Dy

NeurIPS 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We empirically validate the superior flexibility of SmoothHess to capture interactions on MNIST, FMNIST, and CIFAR10. We utilize SmoothHess to derive insights into a network trained on a real-world medical spirometry dataset.
Researcher Affiliation | Collaboration | 1 Northeastern University, 2 Memorial Sloan Kettering Cancer Center; {torop.m, masoomi.a}@northeastern.edu, {dhill, ioannidis, jdy}@ece.neu.edu, {kosek}@mskcc.org
Pseudocode | Yes | Algorithm 1: Joint SmoothHess and SmoothGrad Estimation
Open Source Code | Yes | Our code is publicly available at https://github.com/MaxTorop/SmoothHess
Open Datasets | Yes | Experiments were conducted on a real-world spirometry regression dataset, three image datasets (MNIST [45], FMNIST [90], and CIFAR10 [43]), and one synthetic dataset (Four Quadrant).
Dataset Splits | Yes | We further split the train set into 50,000 images for training and 10,000 for validation. [...] We further split the train set into 40,000 images for training and 10,000 for validation.
Hardware Specification | Yes | Experiments were performed on an internal cluster using NVIDIA A100 GPUs and AMD EPYC 7302 16-core processors.
Software Dependencies | No | The paper cites TensorFlow [1] and JAX [13] for background and PyTorch [62] for automatic differentiation, but it does not specify version numbers for these or any other libraries/solvers.
Experiment Setup | Yes | Training lasted for 40,000 iterations with a batch size of 128 and an initial learning rate of 1e-3, decayed by a factor of 1e-1 at iterations 5,000, 10,000, and 20,000.
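The pseudocode row above refers to joint SmoothHess and SmoothGrad estimation. The paper's Algorithm 1 is not reproduced here, but the underlying idea — estimating the gradient and Hessian of a Gaussian-smoothed network via Stein's lemma and Monte Carlo sampling — can be sketched as follows. This is a minimal illustrative sketch, assuming a scalar-valued function `f` and isotropic smoothing N(0, σ²I); the function name `smooth_grad_hess` and the centering-by-`f(x)` variance-reduction step are our own choices, not the authors' released implementation:

```python
import numpy as np

def smooth_grad_hess(f, x, sigma, n_samples, rng):
    """Monte Carlo Stein-lemma estimators of the gradient and Hessian of the
    Gaussian-smoothed function (f * q_sigma) at x, using only evaluations of f.
    Illustrative sketch, not the authors' released implementation."""
    d = x.shape[0]
    # Sample perturbations delta ~ N(0, sigma^2 I).
    delta = rng.normal(scale=sigma, size=(n_samples, d))
    # Center f-values at f(x) for variance reduction (leaves the estimators unbiased).
    fv = np.array([f(x + dlt) for dlt in delta]) - f(x)
    # First-order Stein identity: grad = E[f(x + delta) * delta] / sigma^2.
    grad = (fv[:, None] * delta).mean(axis=0) / sigma**2
    # Second-order Stein identity:
    # hess = E[f(x + delta) * (delta delta^T - sigma^2 I)] / sigma^4.
    outer = delta[:, :, None] * delta[:, None, :]
    hess = (fv[:, None, None] * (outer / sigma**2 - np.eye(d))).mean(axis=0) / sigma**2
    hess = 0.5 * (hess + hess.T)  # symmetrize the Monte Carlo estimate
    return grad, hess
```

Because both estimators reuse the same perturbation samples and function evaluations, the gradient comes essentially for free alongside the Hessian, which is the point of estimating them jointly.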