Detecting Statistical Interactions from Neural Network Weights
Authors: Michael Tsang, Dehua Cheng, Yan Liu
ICLR 2018
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | 5 EXPERIMENTS: In this section, we discuss our experiments on both simulated and real-world datasets to study the performance of our approach on interaction detection. |
| Researcher Affiliation | Academia | Michael Tsang, Dehua Cheng, Yan Liu Department of Computer Science University of Southern California {tsangm,dehuache,yanliu.cs}@usc.edu |
| Pseudocode | Yes | Algorithm 1 NID Greedy Ranking Algorithm (a hedged sketch of this ranking procedure appears after the table) |
| Open Source Code | No | The paper does not contain an explicit statement or link confirming the availability of the source code for the methodology described. |
| Open Datasets | Yes | We use four real-world datasets, of which two are regression datasets, and the other two are binary classification datasets. ... the cal housing dataset ... (Pace & Barry, 1997). The bike sharing dataset ... (Fanaee-T & Gama, 2014). The higgs boson dataset ... (Adam-Bourdarios et al., 2014). Lastly, the letter recognition dataset ... (Frey & Slate, 1991). |
| Dataset Splits | Yes | In all synthetic experiments, we used random train/valid/test splits of 1/3 each on 30k data points. (A minimal split sketch appears after the table.) |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., exact GPU/CPU models, processor types with speeds, memory amounts, or detailed computer specifications) used for running its experiments. |
| Software Dependencies | No | The paper mentions software components like 'ReLU activation', 'backpropagation', 'MLP', 'MLP-M', and 'lasso', but does not provide specific version numbers for any libraries, frameworks, or solvers used. |
| Experiment Setup | Yes | In our experiments, all networks that model feature interactions consisted of four hidden layers with first-to-last layer sizes of: 140, 100, 60, and 20 units. In contrast, all individual univariate networks had three hidden layers with sizes of: 10, 10, and 10 units. All networks used ReLU activation and were trained using backpropagation. ... On the synthetic test suite, MLP and MLP-M were trained with L1 constants in the range of 5e-6 to 5e-4, based on parameter tuning on a validation set. On real-world datasets, L1 was fixed at 5e-5. MLP-Cutoff used a fixed L2 constant of 1e-4 in all experiments involving cutoff. Early stopping was used to prevent overfitting. (A hedged sketch of this setup follows the table.) |
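The paper gives Algorithm 1 (NID Greedy Ranking) only in pseudocode. Below is a minimal NumPy sketch of the greedy ranking idea, assuming the averaging function is min (the choice the paper favors); the function names, shapes, and the toy usage at the end are illustrative assumptions, not the authors' code.

```python
import numpy as np

def aggregate_influence(later_weights):
    """Influence z of each first-hidden-layer unit on the output,
    taken as the product of absolute weights of all later layers."""
    z = np.abs(later_weights[-1])
    for W in reversed(later_weights[:-1]):
        z = z @ np.abs(W)
    return z.ravel()  # one entry per first-hidden-layer unit

def greedy_rank_interactions(W1, later_weights):
    """Greedy candidate generation and ranking in the spirit of Algorithm 1:
    each first-layer unit j proposes its top-k weighted features (k = 2..p)
    as a candidate interaction, scored by z_j * min_i |W1[j, i]| over the set."""
    z = aggregate_influence(later_weights)
    strengths = {}
    for j, row in enumerate(np.abs(W1)):
        order = np.argsort(-row)                    # features by weight magnitude
        for k in range(2, len(order) + 1):
            cand = frozenset(order[:k].tolist())
            # min: an interaction is only as strong as its weakest member
            strengths[cand] = strengths.get(cand, 0.0) + z[j] * row[order[:k]].min()
    return sorted(strengths.items(), key=lambda kv: -kv[1])

# Toy usage with the paper's 140-100-60-20 layer sizes and p = 10 features
rng = np.random.default_rng(0)
W1 = rng.normal(size=(140, 10))
later = [rng.normal(size=(100, 140)), rng.normal(size=(60, 100)),
         rng.normal(size=(20, 60)), rng.normal(size=(1, 20))]
top_interactions = greedy_rank_interactions(W1, later)[:10]
```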
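The reported 1/3 train/valid/test splits are straightforward to reproduce. A minimal sketch, assuming a uniform random permutation (the paper says only "random"):

```python
import numpy as np

def random_thirds_split(n, seed=0):
    """Random 1/3 train / 1/3 valid / 1/3 test split over n data points."""
    idx = np.random.default_rng(seed).permutation(n)
    third = n // 3
    return idx[:third], idx[third:2 * third], idx[2 * third:]

train_idx, valid_idx, test_idx = random_thirds_split(30_000)  # 30k as reported
```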
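The paper does not name a deep learning framework, so the following PyTorch sketch is an assumption; it only mirrors the reported pieces of the setup: the 140-100-60-20 interaction network, the 10-10-10 univariate networks, ReLU activations, and an L1 penalty at the fixed real-world value of 5e-5.

```python
import torch.nn as nn

def mlp(sizes):
    """Stack of Linear+ReLU hidden layers ending in a scalar head
    (the scalar output is an assumption for a regression target)."""
    layers = []
    for fan_in, fan_out in zip(sizes[:-1], sizes[1:]):
        layers += [nn.Linear(fan_in, fan_out), nn.ReLU()]
    layers.append(nn.Linear(sizes[-1], 1))
    return nn.Sequential(*layers)

p = 10                                          # feature count: placeholder
interaction_net = mlp([p, 140, 100, 60, 20])    # four hidden layers, as reported
univariate_net = mlp([1, 10, 10, 10])           # one per feature in MLP-M

def l1_penalty(model, lam=5e-5):
    """L1 term added to the training loss; applied to all parameters
    here for brevity (the paper does not detail the exact scope)."""
    return lam * sum(w.abs().sum() for w in model.parameters())
```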