Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
Binning as a Pretext Task: Improving Self-Supervised Learning in Tabular Domains
Authors: Kyungeun Lee, Ye Seul Sim, Hyeseung Cho, Moonjung Eo, Suhee Yoon, Sanghyu Yoon, Woohyung Lim
ICML 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Comprehensive evaluations across diverse tabular datasets corroborate that our method consistently improves tabular representation learning performance for a wide range of downstream tasks. Our empirical investigations ascertain several advantages of binning: capturing the irregular function, compatibility with encoder architecture and additional modifications, standardizing all features into equal sets, grouping similar values within a feature, and providing ordering information. |
| Researcher Affiliation | Industry | ¹LG AI Research, Seoul, Republic of Korea. Correspondence to: Woohyung Lim <EMAIL>. |
| Pseudocode | No | The paper does not contain any clearly labeled pseudocode or algorithm blocks. |
| Open Source Code | Yes | The codes are available in https://github.com/kyungeun-lee/tabularbinning. |
| Open Datasets | Yes | In this study, we use 25 public datasets mostly from the OpenML (Vanschoren et al., 2014) library, including the frequently used datasets in previous studies (Yoon et al., 2020; Ucar et al., 2021; Gorishniy et al., 2021; 2022). We summarize the main properties of datasets in Table 4. |
| Dataset Splits | Yes | For all datasets, we apply standardization for numerical features and labels for evaluating the regression tasks. Each dataset has exactly one train-validation-test split, so all algorithms use the same splits as the previous studies (Gorishniy et al., 2021; 2022; Rubachev et al., 2022). |
| Hardware Specification | Yes | All experiments are conducted on a single NVIDIA GeForce RTX 3090. |
| Software Dependencies | No | The paper mentions "Optimizer: AdamW" but does not specify version numbers for programming languages, machine learning frameworks (e.g., PyTorch, TensorFlow), or other key software libraries used for implementation. |
| Experiment Setup | Yes | For the hyperparameters related to SSL, we tried p_m ∈ {0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9} and the number of bins T ∈ {2, 5, 10, 20, 50, 100}. Optimizer: AdamW, Learning rate: 1e-4, Weight decay: 1e-5, Epochs: 1000, Learning rate scheduler: Cosine annealing scheduler. We summarize the best setups for all datasets as follows. Table 9: Training setups for the best cases in Table 7. |
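
The Experiment Setup row quotes a search space and a fixed training configuration. As a rough illustration of what that setup implies, the sketch below enumerates the quoted grid and evaluates a standard cosine annealing schedule over the quoted 1000 epochs. It is not the authors' code: the function names `ssl_grid` and `cosine_lr` are hypothetical, and a minimum learning rate of 0 with one scheduler step per epoch are assumptions, since the quoted text does not state them.

```python
import itertools
import math

# Search space quoted in the Experiment Setup row.
MASK_PROBS = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9]  # p_m
NUM_BINS = [2, 5, 10, 20, 50, 100]                          # T

# Fixed optimizer settings quoted in the same row.
BASE_LR = 1e-4
WEIGHT_DECAY = 1e-5
EPOCHS = 1000


def ssl_grid():
    """Return every (p_m, T) pair in the quoted search space."""
    return list(itertools.product(MASK_PROBS, NUM_BINS))


def cosine_lr(epoch, base_lr=BASE_LR, total_epochs=EPOCHS, min_lr=0.0):
    """Standard cosine annealing from base_lr down to min_lr.

    min_lr=0.0 and per-epoch stepping are assumptions, not taken
    from the quoted setup.
    """
    cosine = math.cos(math.pi * epoch / total_epochs)
    return min_lr + 0.5 * (base_lr - min_lr) * (1.0 + cosine)


if __name__ == "__main__":
    grid = ssl_grid()
    print(f"{len(grid)} SSL configurations per dataset")
    for epoch in (0, 250, 500, 750, 1000):
        print(f"epoch {epoch:4d}: lr = {cosine_lr(epoch):.3e}")
```

If every combination were tried, the grid would amount to 9 × 6 = 54 self-supervised runs per dataset, which gives a sense of the compute behind the "best setups" the row refers to.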