STaSy: Score-based Tabular data Synthesis

Authors: Jayoung Kim, Chaejeong Lee, Noseong Park

ICLR 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Furthermore, we also conduct rigorous experimental studies in terms of the generative task trilemma: sampling quality, diversity, and time. In our experiments with 15 benchmark tabular datasets and 7 baselines, our method outperforms existing methods in terms of task-dependent evaluations and diversity.
Researcher Affiliation | Academia | Jayoung Kim, Chaejeong Lee, and Noseong Park, Department of Artificial Intelligence, Yonsei University, Seoul, South Korea. {jayoung.kim, chaejeong_lee, noseong}@yonsei.ac.kr
Pseudocode | Yes | Algorithm 1 shows the overall training process for our STaSy.
Open Source Code | Yes | Source codes used in the experiments are available in the supplementary material. By following the README guidance, the main results are easily reproducible.
Open Datasets | Yes | The raw data of 15 datasets are available online: Credit: https://www.kaggle.com/mlg-ulb/creditcardfraud (DbCL 1.0) ... Spambase: https://archive.ics.uci.edu/ml/datasets/spambase (CC BY 4.0)
Dataset Splits | Yes | The train-test split ratio is 80% and 20%, respectively.
Hardware Specification | Yes | Our software and hardware environments are as follows: UBUNTU 18.04 LTS, PYTHON 3.8.2, PYTORCH 1.8.1, CUDA 11.4, and NVIDIA Driver 470.42.01, i9 CPU, and NVIDIA RTX 3090.
Software Dependencies | Yes | Our software and hardware environments are as follows: UBUNTU 18.04 LTS, PYTHON 3.8.2, PYTORCH 1.8.1, CUDA 11.4, and NVIDIA Driver 470.42.01, i9 CPU, and NVIDIA RTX 3090.
Experiment Setup | Yes | Hyperparameter settings for the best models are in Table 27. We have three SDE types, which are VE, VP, and sub-VP, and three layer types as shown in Appendix C: Concat, Squash, and Concatsquash. We use a learning rate in {2e-03, 2e-04}. We search for α0 and β0, in total, with 9 combinations using α0 = {0.20, 0.25, 0.30} and β0 = {0.80, 0.90, 0.95}.
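
The search space quoted in the Experiment Setup row can be enumerated as a minimal sketch. The values come from the quoted text. The variable names, the dictionary keys, and the assumption that every option is crossed with every other (a full Cartesian grid) are illustrative and are not confirmed by the source, which only states that α0 and β0 give 9 combinations.

```python
from itertools import product

# Values quoted in the Experiment Setup row.
# All names below are hypothetical and chosen only for illustration.
SDE_TYPES = ["VE", "VP", "sub-VP"]
LAYER_TYPES = ["Concat", "Squash", "Concatsquash"]
LEARNING_RATES = [2e-03, 2e-04]
ALPHA0_VALUES = [0.20, 0.25, 0.30]
BETA0_VALUES = [0.80, 0.90, 0.95]


def build_grid():
    """Return every configuration, ASSUMING a full Cartesian product."""
    return [
        {
            "sde_type": sde,
            "layer_type": layer,
            "learning_rate": lr,
            "alpha0": alpha0,
            "beta0": beta0,
        }
        for sde, layer, lr, alpha0, beta0 in product(
            SDE_TYPES, LAYER_TYPES, LEARNING_RATES, ALPHA0_VALUES, BETA0_VALUES
        )
    ]


grid = build_grid()

# The (alpha0, beta0) pairs alone give the 9 combinations stated in the text.
alpha_beta_pairs = {(cfg["alpha0"], cfg["beta0"]) for cfg in grid}

print(len(alpha_beta_pairs))  # 9
print(len(grid))              # 162 = 3 SDE types * 3 layer types * 2 rates * 9 pairs
```

Under the full-grid assumption this yields 162 candidate configurations per dataset. The configurations actually selected for each dataset are reported in Table 27 of the paper.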