STaSy: Score-based Tabular data Synthesis
Authors: Jayoung Kim, Chaejeong Lee, Noseong Park
ICLR 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Furthermore, we also conduct rigorous experimental studies in terms of the generative task trilemma: sampling quality, diversity, and time. In our experiments with 15 benchmark tabular datasets and 7 baselines, our method outperforms existing methods in terms of task-dependent evaluations and diversity. |
| Researcher Affiliation | Academia | Jayoung Kim, Chaejeong Lee, and Noseong Park; Department of Artificial Intelligence, Yonsei University, Seoul, South Korea; {jayoung.kim, chaejeong_lee, noseong}@yonsei.ac.kr |
| Pseudocode | Yes | Algorithm 1 shows the overall training process for our STaSy. |
| Open Source Code | Yes | Source codes used in the experiments are available in the supplementary material. By following the README guidance, the main results are easily reproducible. |
| Open Datasets | Yes | The raw data of 15 datasets are available online: Credit: https://www.kaggle.com/mlg-ulb/creditcardfraud (DbCL 1.0) ... Spambase: https://archive.ics.uci.edu/ml/datasets/spambase (CC BY 4.0) |
| Dataset Splits | Yes | The train-test split ratio is 80% and 20%, respectively. |
| Hardware Specification | Yes | Our software and hardware environments are as follows: UBUNTU 18.04 LTS, PYTHON 3.8.2, PYTORCH 1.8.1, CUDA 11.4, and NVIDIA Driver 470.42.01, i9 CPU, and NVIDIA RTX 3090. |
| Software Dependencies | Yes | Our software and hardware environments are as follows: UBUNTU 18.04 LTS, PYTHON 3.8.2, PYTORCH 1.8.1, CUDA 11.4, and NVIDIA Driver 470.42.01, i9 CPU, and NVIDIA RTX 3090. |
| Experiment Setup | Yes | Hyperparameter settings for the best models are in Table 27. We have three SDE types, which are VE, VP, and sub-VP, and three layer types as shown in Appendix C: Concat, Squash, and Concatsquash. We use a learning rate in {2e-03, 2e-04}. We search for α0 and β0, in total, with 9 combinations using α0 = {0.20, 0.25, 0.30} and β0 = {0.80, 0.90, 0.95}. |
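
The search space quoted in the Experiment Setup row can be enumerated in a few lines. The sketch below is illustrative only: the names `SEARCH_SPACE` and `build_grid` are hypothetical, and it assumes a full Cartesian product over all five settings, which the quoted text does not confirm (it states only that α0 and β0 are searched over 9 combinations). It is not taken from the authors' supplementary code.

```python
from itertools import product

# Values quoted in the Experiment Setup row. The dictionary keys are
# hypothetical names chosen for this sketch, not the authors' identifiers.
SEARCH_SPACE = {
    "sde_type": ["VE", "VP", "sub-VP"],
    "layer_type": ["Concat", "Squash", "Concatsquash"],
    "learning_rate": [2e-03, 2e-04],
    "alpha0": [0.20, 0.25, 0.30],
    "beta0": [0.80, 0.90, 0.95],
}


def build_grid(space):
    """Return every configuration in the Cartesian product of `space`.

    Assumption: a full grid over all settings. The source only confirms
    that alpha0 and beta0 are combined into 9 pairs.
    """
    keys = list(space)
    return [dict(zip(keys, combo)) for combo in product(*space.values())]


grid = build_grid(SEARCH_SPACE)

# The (alpha0, beta0) pairs alone, matching the 9 combinations in the text.
alpha_beta_pairs = sorted({(cfg["alpha0"], cfg["beta0"]) for cfg in grid})
```

Under the full-grid assumption this yields 3 × 3 × 2 × 3 × 3 = 162 configurations per dataset; a smaller or staged search would be equally consistent with the quoted description.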