Neural Basis Models for Interpretability
Authors: Filip Radenovic, Abhimanyu Dubey, Dhruv Mahajan
NeurIPS 2022 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | 4 Experiments |
| Researcher Affiliation | Industry | The authors are affiliated with Meta AI (Facebook Research); source code is available at github.com/facebookresearch/nbm-spam. |
| Pseudocode | No | The paper describes its architecture and methodology using mathematical equations and diagrams (e.g., Figure 1) but does not include any explicitly labeled "Pseudocode" or "Algorithm" blocks. A hedged PyTorch sketch of the described architecture follows the table. |
| Open Source Code | Yes | Source code is available at github.com/facebookresearch/nbm-spam. |
| Open Datasets | Yes | Tabular datasets. We report performance on CA Housing [10, 46], FICO [22], Cover Type [8, 16, 20], and Newsgroups [32, 43] tabular datasets. ... We also report performance on MIMIC-II [41, 51], Credit [17, 19], Click [15], Epsilon [21], Higgs [3, 26], Microsoft [40, 49], Yahoo [60], and Year [66] tabular datasets. ... Image datasets (classification). We experiment with two bird classification datasets: CUB [18, 62] and iNaturalist Birds [27, 59]. ... Image dataset (object detection). For this task we use a proprietary object detection dataset, denoted as Common Objects |
| Dataset Splits | Yes | Data is split to have 70/10/20 ratio for training, validation, and testing, respectively; except for Newsgroups where the test split is fixed, so we only split the train part to 85/15 ratio for train and validation. ... For these datasets, we follow [12, 47] to use the same training, validation, and testing splits. A minimal split sketch is included after the table. |
| Hardware Specification | Yes | Linear, NAM, NBM, and MLP models are trained using the Adam with decoupled weight decay (AdamW) optimizer [35], on 8 V100 GPU machines with 32 GB memory, and a batch size of at most 1024 per GPU (divided by 2 every time a batch cannot fit in the memory). ... The throughput is measured as the number of input instances that we can process per second (x / sec) on one 32 GB V100 GPU, in inference mode. ... Finally, for EBMs and XGBoost, CPU machines are used. A rough throughput-measurement sketch appears after the table. |
| Software Dependencies | No | We implement the following baselines in PyTorch [48]... The paper mentions using PyTorch but does not provide specific version numbers for PyTorch or any other software dependencies. |
| Experiment Setup | Yes | We use mean squared error (MSE) for regression, and cross-entropy loss for classification. To avoid overfitting, we use the following regularization techniques: (i) L2-normalization (weight decay) [31] of parameters; (ii) batch-norm [28] and dropout [54] on hidden layers of the basis functions network; (iii) an L2-normalization penalty on the outputs f_i to incentivize fewer strong feature contributions, as done in [2]; (iv) basis dropout to randomly drop individual basis functions in order to decorrelate them. ... MLP containing 3 hidden layers with 256, 128, and 128 units, ReLU [23], B = 100 basis outputs for NBMs and B = 200 for NB2Ms. ... Linear, NAM, NBM, and MLP models are trained using the Adam with decoupled weight decay (AdamW) optimizer [35], on 8 V100 GPU machines with 32 GB memory, and a batch size of at most 1024 per GPU (divided by 2 every time a batch cannot fit in the memory). We train for 1,000, 500, 100, or 50 epochs, depending on the size and feature dimensionality of the dataset. The learning rate is decayed with cosine annealing [34] from the starting value until zero. For NBMs on all datasets, we tune the starting learning rate in the continuous interval [1e-5, 1.0), weight decay in the interval [1e-10, 1.0), output penalty coefficient in the interval [1e-7, 100), dropout and basis dropout coefficients in the discrete set {0, 0.05, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9}. We find optimal hyper-parameters using the validation set and random search. A hedged sketch of this training recipe follows the table. |
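Since the paper gives no pseudocode (see the "Pseudocode" row), the following is a minimal PyTorch sketch of an NBM-style forward pass as described in the paper: one shared MLP maps each scalar feature to B basis outputs, and learned per-feature weights combine those bases into additive shape functions. The class name `NBMSketch`, the weight initialization, and the exact wiring are assumptions for illustration; the authors' reference implementation is at github.com/facebookresearch/nbm-spam.

```python
import torch
import torch.nn as nn


class NBMSketch(nn.Module):
    """Toy additive model with a shared basis network, NBM-style (a sketch, not the authors' code)."""

    def __init__(self, num_features: int, num_bases: int = 100, num_outputs: int = 1):
        super().__init__()
        # One shared MLP maps a single scalar feature to B basis values
        # (hidden sizes 256/128/128 follow the paper's description).
        self.bases = nn.Sequential(
            nn.Linear(1, 256), nn.ReLU(),
            nn.Linear(256, 128), nn.ReLU(),
            nn.Linear(128, 128), nn.ReLU(),
            nn.Linear(128, num_bases),
        )
        # Per-feature, per-output weights that combine the shared bases.
        self.weights = nn.Parameter(0.01 * torch.randn(num_features, num_bases, num_outputs))
        self.bias = nn.Parameter(torch.zeros(num_outputs))

    def forward(self, x):
        # x: (batch, num_features) -> evaluate the shared bases on each feature separately.
        h = self.bases(x.unsqueeze(-1))              # (batch, features, bases)
        # Shape functions f_i(x_i) as linear combinations of the shared bases.
        f = torch.einsum("nfb,fbo->nfo", h, self.weights)
        return f.sum(dim=1) + self.bias              # additive prediction


if __name__ == "__main__":
    model = NBMSketch(num_features=8)                # e.g., CA Housing has 8 features
    print(model(torch.randn(32, 8)).shape)           # torch.Size([32, 1])
```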
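The 70/10/20 split quoted in the "Dataset Splits" row can be reproduced in spirit with scikit-learn. The snippet below is a minimal sketch using the CA Housing data as an example; the authors' actual splitting code and random seeds are not given in that row.

```python
from sklearn.datasets import fetch_california_housing
from sklearn.model_selection import train_test_split

X, y = fetch_california_housing(return_X_y=True)

# Hold out 20% for testing, then 1/8 of the remaining 80% (i.e., 10% overall)
# for validation, giving roughly a 70/10/20 split.
X_trainval, X_test, y_trainval, y_test = train_test_split(X, y, test_size=0.20, random_state=0)
X_train, X_val, y_train, y_val = train_test_split(X_trainval, y_trainval, test_size=0.125, random_state=0)

print(len(X_train), len(X_val), len(X_test))  # approximately 70% / 10% / 20%
```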
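The "Experiment Setup" row lists AdamW, cosine annealing of the learning rate to zero, and random search over learning rate, weight decay, output penalty, and dropout. The sketch below illustrates one random-search trial under stated assumptions: `model` and `train_loader` are placeholders, the output L2 penalty and dropout values are sampled but not applied in this minimal loop, and this is not the authors' released training code.

```python
import random
import torch


def sample_hparams():
    """Log-uniform / discrete draws over the ranges quoted in the table row."""
    return {
        "lr": 10 ** random.uniform(-5, 0),              # starting LR in [1e-5, 1.0)
        "weight_decay": 10 ** random.uniform(-10, 0),   # in [1e-10, 1.0)
        "output_penalty": 10 ** random.uniform(-7, 2),  # in [1e-7, 100); not applied below
        "dropout": random.choice([0.0, 0.05, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9]),
    }


def train_one_trial(model, train_loader, epochs=100, device="cuda"):
    """One random-search trial: AdamW with cosine annealing of the LR to zero."""
    hp = sample_hparams()  # dropout would be set on the basis network; omitted here
    opt = torch.optim.AdamW(model.parameters(), lr=hp["lr"], weight_decay=hp["weight_decay"])
    sched = torch.optim.lr_scheduler.CosineAnnealingLR(opt, T_max=epochs, eta_min=0.0)
    loss_fn = torch.nn.CrossEntropyLoss()   # MSE would be used for regression tasks
    model.to(device)
    for _ in range(epochs):
        for x, y in train_loader:
            opt.zero_grad()
            loss = loss_fn(model(x.to(device)), y.to(device))
            loss.backward()
            opt.step()
        sched.step()
    return hp  # the best trial would be selected on the validation split
```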
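The hardware row reports throughput as input instances processed per second on one 32 GB V100 in inference mode. A rough way to measure that quantity in PyTorch is sketched below; the batch size, warm-up length, and iteration count are assumptions, not the authors' benchmarking script.

```python
import time
import torch


@torch.no_grad()
def throughput(model, batch_size=1024, num_features=8, iters=100, device="cuda"):
    """Return inference throughput (instances per second, x / sec) on a single GPU."""
    model.eval().to(device)
    x = torch.randn(batch_size, num_features, device=device)
    for _ in range(10):              # warm-up iterations before timing
        model(x)
    torch.cuda.synchronize()
    start = time.perf_counter()
    for _ in range(iters):
        model(x)
    torch.cuda.synchronize()         # ensure all GPU work is finished before stopping the clock
    elapsed = time.perf_counter() - start
    return batch_size * iters / elapsed
```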