Net-DNF: Effective Deep Modeling of Tabular Data

Authors: Liran Katzir, Gal Elidan, Ran El-Yaniv

ICLR 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We present extensive experiments showing that Net-DNFs significantly and consistently outperform fully connected networks over tabular data. With relatively few hyperparameters, Net-DNFs open the door to practical end-to-end handling of tabular data using neural networks. We present ablation studies, which justify the design choices of Net-DNF including the inductive bias elements, namely, Boolean formulation, locality, and feature selection.
Researcher Affiliation | Collaboration | Liran Katzir lirank@google.com, Gal Elidan elidan@google.com, Ran El-Yaniv rani@cs.technion.ac.il
Pseudocode | Yes | Algorithm 1 (Grid Search Procedure), reproduced below:

    Algorithm 1: Grid Search Procedure
    Input: model, configurations_list
    results_list = []
    for i = 1 to n_partitions do
        val_scores_list = []
        test_scores_list = []
        train, val, test = read_data(partition_index=i)
        for c in configurations_list do
            trained_model = model.train(train_data=train, val_data=val, configuration=c)
            trained_model.load_weights_from_best_epoch()
            val_score = trained_model.predict(data=val)
            test_score = trained_model.predict(data=test)
            val_scores_list.append(val_score)
            test_scores_list.append(test_score)
        end
        best_val_index = get_index_of_best_val_score(val_scores_list)
        test_res = test_scores_list[best_val_index]
        results_list.append(test_res)
    end
    mean = mean(results_list)
    sem = standard_error_of_the_mean(results_list)
    Return: mean, sem
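The pseudocode maps almost directly onto Python. Below is a minimal runnable sketch of Algorithm 1; the `read_data` loader and the `model` interface (a `train(...)` call returning a trained model that exposes `load_weights_from_best_epoch()` and a `predict(data=...)` that yields a scalar score, higher being better) are hypothetical stand-ins, not the paper's actual API.

    import statistics

    def grid_search(model, configurations_list, read_data, n_partitions=5):
        # Sketch of Algorithm 1. `model` and `read_data` are hypothetical
        # stand-ins; `predict` is assumed to return a scalar score where
        # higher is better.
        results_list = []
        for i in range(1, n_partitions + 1):
            val_scores, test_scores = [], []
            train, val, test = read_data(partition_index=i)
            for c in configurations_list:
                trained = model.train(train_data=train, val_data=val, configuration=c)
                trained.load_weights_from_best_epoch()
                val_scores.append(trained.predict(data=val))
                test_scores.append(trained.predict(data=test))
            # Select the configuration by validation score; report its test score.
            best = max(range(len(val_scores)), key=val_scores.__getitem__)
            results_list.append(test_scores[best])
        mean = statistics.mean(results_list)
        sem = statistics.stdev(results_list) / len(results_list) ** 0.5
        return mean, sem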
Open Source Code | Yes | Our code is available at https://github.com/amramabutbul/DisjunctiveNormalFormNet.
Open Datasets | Yes | The datasets used in this study are from Kaggle competitions and OpenML (Vanschoren et al., 2014). A summary of these datasets appears in Appendix C. Table 4 (excerpt): A description of the tabular datasets.

    Dataset       | Features | Classes | Samples | Source | URL
    Otto Group    | 93       | 9       | 61.9k   | Kaggle | kaggle.com/c/otto-group-product-classification-challenge/overview
    Gesture Phase | 32       | 5       | 9.8k    | OpenML | openml.org/d/4538
Dataset Splits | Yes | Each dataset was first randomly divided into five folds in a way that preserved the original distribution. Based on these five folds, we created five partitions of the dataset as follows: each fold is used as the test set in one of the partitions, while the other folds are used as the training and validation sets. This way, each partition was 20% test, 10% validation, and 70% training.
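The described protocol is a standard stratified five-fold scheme. A minimal sketch with scikit-learn follows; the use of scikit-learn, and of numpy arrays for X and y, is an assumption, since the paper does not name its splitting tooling.

    from sklearn.model_selection import StratifiedKFold, train_test_split

    def make_partitions(X, y, seed=0):
        # Five stratified folds; each fold serves once as the 20% test set.
        # The remaining 80% is split into 70% train / 10% validation,
        # i.e. 1/8 of the remainder goes to validation.
        skf = StratifiedKFold(n_splits=5, shuffle=True, random_state=seed)
        partitions = []
        for rest_idx, test_idx in skf.split(X, y):
            train_idx, val_idx = train_test_split(
                rest_idx, test_size=0.125, stratify=y[rest_idx], random_state=seed)
            partitions.append((train_idx, val_idx, test_idx))
        return partitions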
Hardware Specification | Yes | All models were trained on NVIDIA Titan Xp GPUs (12 GB RAM).
Software Dependencies | No | The paper mentions TensorFlow as the implementation framework and PyTorch for TabNet, but does not provide specific version numbers for these or any other key software dependencies.
Experiment Setup | Yes | All results presented in this work were obtained using a massive grid search to optimize each model's hyperparameters. A detailed description of the grid search process can be found in Appendices D.1 and D.2. For Net-DNF we used an initial learning rate of 0.05. For FCN, we added the initial learning rate to the grid search with values of {0.05, 0.005, 0.0005}. Appendix D.3 provides extensive details on the grid parameters for Net-DNF, XGBoost, and FCN, including learning rates, max depth, dropout, and L2 lambda values.
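For concreteness, the configurations_list consumed by Algorithm 1 can be built as a Cartesian product over the grid axes. The sketch below uses only the FCN learning-rate values quoted above; the dropout axis is a placeholder, since the actual dropout, max-depth, and L2 lambda values appear only in Appendix D.3.

    from itertools import product

    grid = {
        "learning_rate": [0.05, 0.005, 0.0005],  # FCN values quoted above
        "dropout": [0.0, 0.25, 0.5],             # placeholder; real values in Appendix D.3
    }
    configurations_list = [dict(zip(grid, vals)) for vals in product(*grid.values())]
    # 3 x 3 = 9 candidate configurations in this toy grid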