InterpreTabNet: Distilling Predictive Signals from Tabular Data by Salient Feature Interpretation

Authors: Jacob Yoke Hong Si, Wendy Yusi Cheng, Michael Cooper, Rahul Krishnan

ICML 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Through comprehensive experiments on real-world datasets, we demonstrate that InterpreTabNet outperforms previous methods for interpreting tabular data while attaining competitive accuracy.
Researcher Affiliation | Academia | 1University of Toronto, 2Vector Institute. Correspondence to: Jacob Si <jacobyhsi@cs.toronto.edu>.
Pseudocode | Yes | Algorithm 1: Our proposed algorithm for interpretability optimization. Good default settings for the tested machine learning problems are α = 0, β = [0, 10,000,000], δ = [0.20, 0.25], γ = [2, 3], ϵ = [3, 5]. For β, δ, and γ, the choice depends on the nature of the dataset; more samples call for higher parameter values. (These defaults are collected into a configuration sketch after the table.)
Open Source Code | Yes | The code is available on GitHub at: https://github.com/jacobyhsi/InterpreTabNet
Open Datasets | Yes | The real-world tabular datasets we use in our experiments are from the UCI Machine Learning Repository (Kelly et al., 2023) and OpenML (Vanschoren et al., 2013). ... Table 6: Dataset Links. (An illustrative OpenML loading snippet follows the table.)
Dataset Splits | Yes | The training/validation/testing proportion for each dataset is 80/10/10%, apart from the Higgs dataset. Because the Higgs dataset is so large, we follow TabNet's data split of 500k training samples, 100k validation samples, and 100k testing samples. (A split sketch follows the table.)
Hardware Specification | No | The paper discusses computational efficiency and training time (e.g., a 'several-minute increase in training time') but does not specify hardware components such as GPU models, CPU types, or other computational resources used for the experiments.
Software Dependencies | Yes | The GPT-4 version used in our experiments is 'gpt-4-1106', with training data up to April 2023 and a context window of 128,000 tokens. (A minimal client-call sketch follows the table.)
Experiment Setup | Yes | Hyperparameters such as N_d = N_a, N_steps, γ, and the learning rate are tuned within the ranges recommended for TabNet. For InterpreTabNet's sparsity regularizer r_M, we recommend a smaller range, e.g. [0, 10,000], for datasets with a low to moderate number of features and samples (Adult dataset), and a larger range, e.g. [0, 1,000,000,000,000], for datasets with a larger number of features and samples (Higgs dataset). Table 7: Hyperparameter spaces for all models. (A search-space sketch follows the table.)
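
The Algorithm 1 defaults quoted in the Pseudocode row can be summarized as a small configuration object. This is a minimal sketch only: the intervals are treated as search ranges and the scalar α as a fixed default, which is an assumption; the symbols' precise roles are defined in the paper's Algorithm 1 and are not reproduced here.

```python
# Minimal sketch: the quoted Algorithm 1 defaults collected in one place.
# Interpreting intervals as search ranges and scalars as fixed defaults is an
# assumption; the symbols' roles are defined in the paper's Algorithm 1.
ALGORITHM1_DEFAULTS = {
    "alpha":   0.0,                 # fixed default
    "beta":    (0, 10_000_000),     # range; larger datasets favor higher values
    "delta":   (0.20, 0.25),        # range
    "gamma":   (2, 3),              # range
    "epsilon": (3, 5),              # range
}
```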
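The datasets come from UCI and OpenML (Table 6 in the paper lists the links). As an illustration only, not the authors' loading code, an OpenML dataset can be fetched with scikit-learn; the dataset name and version below are placeholders.

```python
from sklearn.datasets import fetch_openml

# Illustrative only: pull a tabular dataset from OpenML via scikit-learn.
# The name/version are placeholders, not taken from the paper's Table 6.
adult = fetch_openml(name="adult", version=2, as_frame=True)
X, y = adult.data, adult.target
print(X.shape, y.shape)
```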
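The 80/10/10 split and the fixed Higgs split can be reproduced with standard tooling. A minimal sketch, assuming NumPy arrays and seeded random row assignment; the authors' exact splitting code and seeds are not given in the quote.

```python
import numpy as np
from sklearn.model_selection import train_test_split

def split_80_10_10(X, y, seed=0):
    """80/10/10 train/val/test split used for most datasets (sketch)."""
    X_tr, X_tmp, y_tr, y_tmp = train_test_split(X, y, test_size=0.2, random_state=seed)
    X_va, X_te, y_va, y_te = train_test_split(X_tmp, y_tmp, test_size=0.5, random_state=seed)
    return (X_tr, y_tr), (X_va, y_va), (X_te, y_te)

def split_higgs(X, y, seed=0):
    """Fixed 500k/100k/100k Higgs split following TabNet (sizes from the quote;
    random assignment of rows to splits is an assumption)."""
    idx = np.random.default_rng(seed).permutation(len(X))
    tr, va, te = idx[:500_000], idx[500_000:600_000], idx[600_000:700_000]
    return (X[tr], y[tr]), (X[va], y[va]), (X[te], y[te])
```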
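The only named external model dependency is GPT-4 ('gpt-4-1106'). A minimal sketch of querying such a snapshot with the OpenAI Python client; the prompt content and the exact model string accepted by the API are assumptions (the quote gives only 'gpt-4-1106').

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Sketch: query the GPT-4 snapshot reported in the paper ("gpt-4-1106").
# The model string and prompt below are assumptions, not the authors' code.
response = client.chat.completions.create(
    model="gpt-4-1106-preview",
    messages=[{
        "role": "user",
        "content": "Interpret the salient features selected by this mask: ...",
    }],
)
print(response.choices[0].message.content)
```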
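The tuning procedure in the Experiment Setup row can be expressed as a search space. A hedged Optuna-style sketch: the ranges for N_d (= N_a), N_steps, γ, and the learning rate follow common TabNet recommendations and are assumptions rather than quotes, while the r_M upper bound switches between the smaller and larger ranges the paper recommends for small and large datasets.

```python
import optuna

def suggest_hyperparameters(trial: optuna.Trial, large_dataset: bool = False):
    """Sketch of a search space consistent with the quoted setup.
    TabNet-style ranges for n_d (= n_a), n_steps, gamma, and lr are assumptions;
    the r_M upper bound follows the paper's small/large dataset recommendation."""
    n_d = trial.suggest_categorical("n_d", [8, 16, 32, 64])
    n_steps = trial.suggest_int("n_steps", 3, 10)
    gamma = trial.suggest_float("gamma", 1.0, 2.0)
    lr = trial.suggest_float("lr", 1e-3, 2e-2, log=True)
    r_m_high = 1_000_000_000_000 if large_dataset else 10_000
    r_m = trial.suggest_float("r_M", 0.0, float(r_m_high))
    return {"n_d": n_d, "n_a": n_d, "n_steps": n_steps,
            "gamma": gamma, "lr": lr, "r_M": r_m}
```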