TransTab: Learning Transferable Tabular Transformers Across Tables

Authors: Zifeng Wang, Jimeng Sun

NeurIPS 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | In this section, we aim at answering the following questions through extensive experiments: Q1. How does TransTab perform compared with baselines under the vanilla supervised setting? Q2. How well does TransTab address incremental columns from a stream of data (S(2) in Fig. 1)? Q3. How does learning from multiple tables (with different columns) drawn from the same domain affect TransTab's predictive ability (S(1) in Fig. 1)? Q4. Can TransTab be a zero-shot learner when it is pretrained on tables and infers on a new table (S(4) in Fig. 1)? Q5. Is the proposed vertical-partition CL better than vanilla supervised pretraining and self-supervised CL (S(3) in Fig. 1)?
Researcher Affiliation | Academia | Zifeng Wang (1) and Jimeng Sun (1,2); (1) Department of Computer Science, University of Illinois Urbana-Champaign; (2) Carle Illinois College of Medicine, University of Illinois Urbana-Champaign
Pseudocode | No | The paper describes the method using text and diagrams, but does not include any explicit pseudocode or algorithm blocks.
Open Source Code | Yes | Our package is available at https://github.com/RyanWangZf/transtab with documentation at https://transtab.readthedocs.io/en/latest/. (A hedged usage sketch of the package follows this table.)
Open Datasets | Yes | We introduce clinical trial mortality prediction datasets where each includes a distinct group of patients and columns (footnote 3). The data statistics are in Table 1. Footnote 3 links to https://data.projectdatasphere.org/projectdatasphere/html/access
Dataset Splits | No | The paper mentions that 'A patience of 10 is kept for supervised training for early stopping', which implies the use of a validation set, but it does not explicitly provide the split percentages or methodology for the train/validation/test splits of the datasets.
Hardware Specification | Yes | Experiments were conducted with one RTX3070 GPU, an i7-10700 CPU, and 16GB RAM.
Software Dependencies | No | The paper mentions using the 'Adam optimizer' and cites 'PyTorch' in the references, but it does not specify version numbers for these or other software dependencies used in the experiments.
Experiment Setup | Yes | TransTab uses 2 layers of gated transformers, where the embedding dimensions of numbers and tokens are 128 and the hidden dimension of the intermediate dense layers is 256. The attention module has 8 heads. We choose ReLU activations and do not activate dropout. We train TransTab using the Adam optimizer [27] with learning rate in {2e-5, 5e-5, 1e-4} and no weight decay; batch size is in {16, 64, 128}. We set a maximum of 50 self-supervised pretraining epochs and 100 supervised training epochs. A patience of 10 is kept for supervised training for early stopping. (These reported hyperparameters are collected in a configuration sketch below.)
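For context on the released package noted in the Open Source Code row, the repository's quick-start roughly follows a load-build-train pattern. The sketch below is a minimal, hedged reconstruction: the function names (transtab.load_data, transtab.build_classifier, transtab.train) and keyword arguments are assumptions based on the linked documentation and may differ across package versions.

```python
# Hedged sketch of the transtab package's quick-start workflow (assumed API;
# see https://transtab.readthedocs.io/en/latest/ for the authoritative version).
import transtab

# Load a benchmark table; the loader is assumed to return the data splits plus
# the categorical / numerical / binary column lists that the model needs.
allset, trainset, valset, testset, cat_cols, num_cols, bin_cols = \
    transtab.load_data('credit-g')

# Build a column-aware classifier from the three column groups.
model = transtab.build_classifier(cat_cols, num_cols, bin_cols)

# Supervised training; the keyword names are assumptions mirroring the paper's
# reported setup (Adam optimizer, early stopping with patience 10).
transtab.train(model, trainset, valset,
               num_epoch=100, patience=10, lr=1e-4, batch_size=64)
```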
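The Experiment Setup row quotes concrete hyperparameters; collecting them in one place, a purely illustrative configuration sketch might look as follows. The dictionary keys are hypothetical and are not taken from the TransTab codebase; only the values come from the quoted text.

```python
# Illustrative collection of the hyperparameters quoted in the Experiment Setup
# row; key names are hypothetical, values are as reported in the paper.
transtab_hparams = {
    "num_transformer_layers": 2,          # gated transformer layers
    "embedding_dim": 128,                 # number and token embeddings
    "ffn_hidden_dim": 256,                # intermediate dense layers
    "num_attention_heads": 8,
    "activation": "relu",
    "dropout": 0.0,                       # dropout not activated
    "optimizer": "adam",
    "learning_rate_grid": [2e-5, 5e-5, 1e-4],
    "weight_decay": 0.0,
    "batch_size_grid": [16, 64, 128],
    "max_pretrain_epochs": 50,            # self-supervised pretraining
    "max_supervised_epochs": 100,
    "early_stopping_patience": 10,        # supervised training only
}
```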