Scalable Interpretability via Polynomials
Authors: Abhimanyu Dubey, Filip Radenovic, Dhruv Mahajan
NeurIPS 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We present a new class of GAMs that use tensor rank decompositions of polynomials to learn powerful, inherently-interpretable models. Our approach, titled Scalable Polynomial Additive Models (SPAM), is effortlessly scalable and models all higher-order feature interactions without a combinatorial parameter explosion. SPAM outperforms all current interpretable approaches, and matches DNN/XGBoost performance on a series of real-world benchmarks with up to hundreds of thousands of features. We demonstrate by human subject evaluations that SPAMs are demonstrably more interpretable in practice, and are hence an effortless replacement for DNNs for creating interpretable and high-performance systems suitable for large-scale machine learning. |
| Researcher Affiliation | Industry | Abhimanyu Dubey, Meta AI, dubeya@fb.com; Filip Radenovic, Meta AI, filipradenovic@fb.com; Dhruv Mahajan, Meta AI, dhruvm@fb.com |
| Pseudocode | No | The paper does not contain any explicitly labeled pseudocode or algorithm blocks. |
| Open Source Code | Yes | Source code is available at github.com/facebookresearch/nbm-spam. |
| Open Datasets | Yes | Our datasets are summarized in Table 2. Please see Appendix Section C.1 for details. Table 2 lists: California Housing (CH) Pace and Barry [1997], FICO [2018], Cover Type (CovType) Blackard and Dean [1999], Newsgroups Lang [1995]. Other datasets mentioned are CUB-200 [Wah et al., 2011] and iNaturalist Birds [Van Horn et al., 2018, 2021]. |
| Dataset Splits | Yes | For all datasets with no defined train-val-test split, we use a fixed random sample of 70% of the data for training, 10% for validation and 20% for testing. For the 20 Newsgroups dataset, we split the pre-defined training split 7:1 for training and validation, respectively. |
| Hardware Specification | No | The paper mentions "GPU acceleration" in Contribution 1 but does not provide specifics, such as the GPU model, CPU type, or memory, that would allow for hardware replication. |
| Software Dependencies | No | The paper states that L1/L2-regularized variants of SPAM are learned by minibatch SGD implemented in PyTorch. However, no specific version numbers for PyTorch or any other software dependencies are provided in the main text. |
| Experiment Setup | No | The paper states: "We tune hyperparameters via random sampling approach over a grid." and "For definitions of metrics and hyperparameter ranges, see Appendix Section C." However, the provided text does not contain the concrete values of these hyperparameters or the grid details, which are deferred to the Appendix. |
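
To make the "Research Type" row's description of SPAM more concrete, the following is a minimal PyTorch sketch of a rank-decomposed degree-2 polynomial additive model. The class name, the rank value, and the symmetric `(u_r . x)^2` parameterization are illustrative assumptions, not the paper's exact formulation; the official implementation lives in the `nbm-spam` repository linked above.

```python
import torch
import torch.nn as nn

class RankDecomposedPolynomial(nn.Module):
    """Degree-2 polynomial predictor whose pairwise-interaction tensor is
    represented by k rank-1 factors, avoiding the O(d^2) parameter blow-up.
    Sketch only; SPAM's exact parameterization and higher orders are in the paper."""

    def __init__(self, num_features: int, rank: int = 16):
        super().__init__()
        self.bias = nn.Parameter(torch.zeros(1))
        self.linear = nn.Linear(num_features, 1, bias=False)           # first-order terms
        self.u = nn.Parameter(torch.randn(rank, num_features) * 0.01)  # rank-1 factors
        self.scale = nn.Parameter(torch.ones(rank) / rank)             # per-rank weights

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # First-order (GAM-style linear) contribution.
        out = self.bias + self.linear(x).squeeze(-1)
        # Second-order contribution: sum_r scale_r * (u_r . x)^2 covers all
        # pairwise interactions through k projections instead of d^2 weights.
        proj = x @ self.u.t()                                          # (batch, rank)
        return out + (self.scale * proj.pow(2)).sum(dim=-1)
```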
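The "Dataset Splits" row describes a fixed random 70/10/20 train/val/test split for datasets without a predefined split. The helper below sketches one way to produce such a split; the seed and function name are assumptions, since the paper does not publish its splitting code in the main text.

```python
import numpy as np

def split_indices(n: int, seed: int = 0):
    """Fixed random 70/10/20 train/val/test index split (seed is an assumption)."""
    rng = np.random.default_rng(seed)
    perm = rng.permutation(n)
    n_train, n_val = int(0.7 * n), int(0.1 * n)
    return (perm[:n_train],                    # 70% train
            perm[n_train:n_train + n_val],     # 10% validation
            perm[n_train + n_val:])            # 20% test
```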
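The "Software Dependencies" row notes that L1/L2-regularized variants are trained by minibatch SGD in PyTorch. The step below sketches how such an update could look, with the L1 penalty added to the loss and L2 handled by the optimizer's `weight_decay`; the penalty strength and loss function are placeholders, not values from the paper.

```python
import torch

def train_step(model, x, y, optimizer, l1: float = 1e-5,
               loss_fn=torch.nn.functional.mse_loss):
    """One minibatch step with an explicit L1 penalty (placeholder strength)."""
    optimizer.zero_grad()
    loss = loss_fn(model(x), y)
    loss = loss + l1 * sum(p.abs().sum() for p in model.parameters())
    loss.backward()
    optimizer.step()
    return loss.item()
```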
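Finally, the "Experiment Setup" row mentions tuning hyperparameters by random sampling over a grid. The snippet below is a generic random-search sketch under that description; the actual grid values and ranges are deferred to the paper's Appendix C and are not reproduced here.

```python
import random

def sample_configs(grid: dict, num_trials: int, seed: int = 0):
    """Draw num_trials configurations uniformly from per-parameter choice lists."""
    rng = random.Random(seed)
    return [{name: rng.choice(choices) for name, choices in grid.items()}
            for _ in range(num_trials)]

# Hypothetical usage; the parameter names and values are not from the paper.
configs = sample_configs({"lr": [1e-3, 1e-2], "rank": [8, 16, 32]}, num_trials=5)
```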