Regularizing Black-box Models for Improved Interpretability

Authors: Gregory Plumb, Maruan Al-Shedivat, Ángel Alexander Cabrera, Adam Perer, Eric Xing, Ameet Talwalkar

NeurIPS 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental
"We demonstrate that post-hoc explanations for EXPO-regularized models have better explanation quality, as measured by the common fidelity and stability metrics. We verify that improving these metrics leads to significantly more useful explanations with a user study on a realistic task." (Section 4, Experimental Results)
Researcher Affiliation | Collaboration
Gregory Plumb (Carnegie Mellon University, gdplumb@andrew.cmu.edu); Maruan Al-Shedivat (Carnegie Mellon University, alshedivat@cs.cmu.edu); Ángel Alexander Cabrera (Carnegie Mellon University, cabrera@cmu.edu); Adam Perer (Carnegie Mellon University, adamperer@cmu.edu); Eric Xing (CMU, Petuum Inc., epxing@cs.cmu.edu); Ameet Talwalkar (CMU, Determined AI, talwalkar@cmu.edu)
Pseudocode | Yes
"Algorithm 1: Neighborhood-fidelity regularizer" (a hedged sketch of this regularizer appears after the table)
Open Source Code | Yes
https://github.com/GDPlumb/ExpO
Open Datasets | Yes
"seven regression problems from the UCI collection [Dheeru and Karra Taniskidou, 2017], the MSD dataset, and Support2, which is an in-hospital mortality classification problem. Dataset statistics are in Table 2." The paper's footnotes add that, as in [Bloniarz et al., 2016], MSD is treated as a regression problem with the goal of predicting the release year of a song, and that Support2 is documented at http://biostat.mc.vanderbilt.edu/wiki/Main/SupportDesc.
Dataset Splits | No
The paper mentions evaluating on test data and discusses the neighborhood parameters (N_x and N_x^reg), but it does not provide explicit train/validation/test splits (percentages or counts) in the main text.
Hardware Specification | Yes
"Each model takes less than a few minutes to train on an Intel 8700k CPU, so computational cost was not a limiting factor in our experiments."
Software Dependencies | No
The paper mentions using "SGD with Adam [Kingma and Ba, 2014]" as the optimization algorithm, but it does not name any software libraries or frameworks, let alone their version numbers (e.g., Python, TensorFlow, PyTorch, or scikit-learn versions).
Experiment Setup | Yes
"The network architectures and hyper-parameters are chosen using a grid search; for more details see Appendix A.3. For the final results, we set N_x to be N(x, σ) with σ = 0.1 and N_x^reg to be N(x, σ) with σ = 0.5." (Both neighborhoods are illustrated in the second sketch below.)
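
Because Algorithm 1 appears in the paper only as pseudocode, here is a minimal PyTorch sketch of a neighborhood-fidelity regularizer in the spirit of ExpO, written under stated assumptions rather than as the authors' implementation (which lives in the GitHub repository above): the function name, the neighborhood sample count `m`, and the ridge stabilizer are all illustrative choices.

```python
import torch

def neighborhood_fidelity_penalty(model, x, sigma=0.5, m=20, ridge=1e-6):
    """Sketch of a neighborhood-fidelity regularizer (cf. Algorithm 1).

    For each training point, sample m neighbors from N(x, sigma^2 I),
    fit a linear surrogate to the model's predictions in closed form,
    and return the surrogate's mean squared error. Names and defaults
    here are illustrative assumptions, not the paper's exact settings.
    """
    batch, d = x.shape
    # Neighborhood samples around each training point: (batch, m, d)
    x_local = x.unsqueeze(1) + sigma * torch.randn(batch, m, d, device=x.device)
    # Black-box predictions on the neighborhood: (batch, m, out)
    y_local = model(x_local.reshape(-1, d)).reshape(batch, m, -1)
    # Design matrix with a bias column: (batch, m, d + 1)
    ones = torch.ones(batch, m, 1, device=x.device)
    A = torch.cat([x_local, ones], dim=-1)
    # Closed-form ridge regression per point; differentiable w.r.t. the
    # model parameters through y_local, so it can act as a regularizer.
    AtA = A.transpose(-2, -1) @ A + ridge * torch.eye(d + 1, device=x.device)
    beta = torch.linalg.solve(AtA, A.transpose(-2, -1) @ y_local)
    # Penalize how far the model is from being locally linear.
    return ((A @ beta - y_local) ** 2).mean()
```

During training, one would presumably add this penalty, weighted by a coefficient γ, to the task loss, e.g. `loss = mse_loss + gamma * neighborhood_fidelity_penalty(model, x_batch)`.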
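To make the σ values in the experiment setup concrete, the following NumPy/scikit-learn sketch shows how the post-hoc neighborhood-fidelity metric might be computed: sample from N(x, σ), fit a LIME-style linear surrogate to the black-box predictions, and report the surrogate's MSE. Evaluating with σ = 0.1 corresponds to N_x and σ = 0.5 to N_x^reg; the sample count, seed, Ridge surrogate, and toy model are assumptions for illustration.

```python
import numpy as np
from sklearn.linear_model import Ridge

def neighborhood_fidelity(predict_fn, x, sigma=0.1, m=1000, seed=0):
    """Illustrative post-hoc neighborhood-fidelity metric (lower is better).

    Samples m points from N(x, sigma^2 I), fits a linear surrogate to
    the black-box predictions, and returns the surrogate's MSE.
    """
    rng = np.random.default_rng(seed)
    x_local = x + sigma * rng.standard_normal((m, x.shape[-1]))
    y_local = predict_fn(x_local)
    surrogate = Ridge(alpha=1e-6).fit(x_local, y_local)
    return float(np.mean((surrogate.predict(x_local) - y_local) ** 2))

# Toy nonlinear "black box", assumed here purely for demonstration:
predict = lambda X: np.sin(X).sum(axis=1)
print(neighborhood_fidelity(predict, x=np.zeros(8), sigma=0.1))  # eval N_x
print(neighborhood_fidelity(predict, x=np.zeros(8), sigma=0.5))  # N_x^reg
```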