On Embeddings for Numerical Features in Tabular Deep Learning
Authors: Yury Gorishniy, Ivan Rubachev, Artem Babenko
NeurIPS 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this section, we empirically evaluate the techniques discussed in section 3 and compare them with Gradient Boosted Decision Trees to check the status quo of the DL vs GBDT competition. |
| Researcher Affiliation | Collaboration | Yury Gorishniy (Yandex); Ivan Rubachev (HSE, Yandex); Artem Babenko (Yandex) |
| Pseudocode | No | The paper provides mathematical formulations (e.g., Equation 1) and describes procedures in prose, but it does not include any clearly labeled pseudocode or algorithm blocks. |
| Open Source Code | Yes | The source code is available at https://github.com/Yura52/tabular-dl-num-embeddings. |
| Open Datasets | Yes | We use eleven public datasets mostly from the previous works on tabular DL and Kaggle competitions. [from Section 4.1] and We use publicly available datasets. [from Questions for Paper Analysis, 4e] |
| Dataset Splits | Yes | The dataset is split into three disjoint parts: $\{1, \ldots, n\} = J_{\text{train}} \cup J_{\text{val}} \cup J_{\text{test}}$, where the train part is used for training, the validation part is used for early stopping and hyperparameter tuning, and the test part is used for the final evaluation. (A minimal split sketch follows the table.) |
| Hardware Specification | No | The paper states: 'The experiment reports included in the supplementary material provide the information about the used hardware and execution times.' (Questions for Paper Analysis, 3d). The hardware specifications are thus given in the supplementary material rather than in the main paper. |
| Software Dependencies | No | The paper does not explicitly provide specific software dependencies with version numbers (e.g., Python, PyTorch, CUDA versions) in the main text. |
| Experiment Setup | Yes | The PLE hyperparameters are the same for all features. For quantile-based PLE, we tune the number of quantiles. For target-aware PLE, we tune the following parameters for decision trees: the maximum number of leaves, the minimum number of items per leaf, and the minimum information gain required for making a split when growing the tree. For the Periodic module (see Equation 2), we tune σ and k (these hyperparameters are the same for all features). (Hedged sketches of both embedding schemes follow the table.) |
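
As a concrete illustration of the three-way split quoted in the Dataset Splits row, here is a minimal Python sketch using scikit-learn's `train_test_split`. The function name and the split ratios are our own assumptions for illustration, not taken from the paper or its released code.

```python
# A minimal sketch of a disjoint train/val/test split; ratios are illustrative.
import numpy as np
from sklearn.model_selection import train_test_split

def split_indices(n, val_size=0.15, test_size=0.15, seed=0):
    """Partition {0, ..., n-1} into disjoint train/val/test index sets."""
    idx = np.arange(n)
    train_val, test = train_test_split(idx, test_size=test_size, random_state=seed)
    # Rescale val_size so it is a fraction of the remaining train+val pool.
    val_frac = val_size / (1.0 - test_size)
    train, val = train_test_split(train_val, test_size=val_frac, random_state=seed)
    return train, val, test

train, val, test = split_indices(1000)
print(len(train), len(val), len(test))  # 700 150 150
```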
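The quantile-based PLE scheme quoted in the Experiment Setup row can be sketched in a few lines of NumPy: an encoded value is 0 below a bin, a linear ramp inside it, and 1 above it, with bin edges taken from empirical quantiles of the training data. All names here are ours and the number of bins is illustrative; this is a reconstruction of the paper's Equation 1, not the released implementation.

```python
# A minimal NumPy sketch of quantile-based piecewise linear encoding (PLE).
import numpy as np

def make_quantile_bins(x_train, n_bins):
    """Bin edges b_0 <= ... <= b_T from empirical quantiles of one feature."""
    quantiles = np.linspace(0.0, 1.0, n_bins + 1)
    return np.quantile(x_train, quantiles)

def ple_encode(x, edges):
    """Encode scalars x into T-dimensional piecewise linear vectors."""
    lo, hi = edges[:-1], edges[1:]             # bin boundaries b_{t-1}, b_t
    x = np.asarray(x, dtype=np.float64)[:, None]
    width = np.maximum(hi - lo, 1e-12)         # guard against empty bins
    e = (x - lo) / width                       # linear ramp inside each bin
    return np.clip(e, 0.0, 1.0)                # 0 below the bin, 1 above it

edges = make_quantile_bins(np.random.randn(1000), n_bins=8)
print(ple_encode([0.0, 1.5], edges).shape)     # (2, 8)
```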
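Similarly, here is a minimal PyTorch sketch of the Periodic module (Equation 2), assuming the published formulation: each feature receives k trainable frequencies initialized from N(0, σ²), and the embedding concatenates the sine and cosine of the scaled inputs. The class below is our own reconstruction, not the authors' released code.

```python
# A minimal PyTorch sketch of the Periodic embedding module (Equation 2).
import math
import torch
import torch.nn as nn

class Periodic(nn.Module):
    def __init__(self, n_features: int, k: int, sigma: float):
        super().__init__()
        # k frequencies per feature, drawn from N(0, sigma^2) and trained
        # jointly with the rest of the network.
        self.coefficients = nn.Parameter(torch.randn(n_features, k) * sigma)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, n_features) -> (batch, n_features, 2 * k)
        v = 2 * math.pi * self.coefficients[None] * x[..., None]
        return torch.cat([torch.sin(v), torch.cos(v)], dim=-1)

emb = Periodic(n_features=3, k=8, sigma=1.0)
print(emb(torch.randn(32, 3)).shape)  # torch.Size([32, 3, 16])
```

Note that σ controls the scale of the initial frequencies and k the embedding width, matching the two hyperparameters the paper reports tuning for this module.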