Polynomial-based Self-Attention for Table Representation Learning

Authors: Jayoung Kim, Yehjin Shin, Jeongwhan Choi, Hyowon Wi, Noseong Park

ICML 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | In our experiments with three representative table learning models equipped with our proposed layer, we illustrate that the layer effectively mitigates the oversmoothing problem and enhances the representation performance of the existing methods, outperforming the state-of-the-art table representation methods. (See the attention-layer sketch after this table.)
Researcher Affiliation | Academia | 1Yonsei University, South Korea; 2KAIST, South Korea.
Pseudocode | No | The paper presents mathematical equations and descriptions of the proposed method but does not include any explicitly labeled 'Pseudocode' or 'Algorithm' blocks.
Open Source Code | Yes | Source codes used in the experiments are available in the supplementary material. By following the README guidance, the main results are easily reproducible.
Open Datasets | Yes | The download links for each dataset are as follows: Income: https://www.kaggle.com/lodetomasi1995/income-classification [... and other links]
Dataset Splits | Yes | The general statistics of datasets are listed in Table 6, whose columns are: task (# classes), # features, # continuous, # categorical, dataset size, # train set, # valid set, and # test set [... showing specific numbers for each split for all datasets]
Hardware Specification | Yes | Our software and hardware environments are as follows: UBUNTU 20.04 LTS, PYTHON 3.8.2, PYTORCH 1.8.1, CUDA 11.4, NVIDIA Driver 470.42.01, i9 CPU, and NVIDIA RTX A5000.
Software Dependencies | Yes | Our software and hardware environments are as follows: UBUNTU 20.04 LTS, PYTHON 3.8.2, PYTORCH 1.8.1, CUDA 11.4, NVIDIA Driver 470.42.01, i9 CPU, and NVIDIA RTX A5000.
Experiment Setup | Yes | We use 8 hyperparameters: depth of the Transformer, embedding dimension, learning rate, number of heads, weight decay, hidden dimension of the MLP layer, polynomial type, and k. Best hyperparameters are in Table 7. [...] Tables 7, 8, and 9 provide detailed hyperparameter settings for different models and datasets. (See the configuration sketch after this table.)
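The Research Type row quotes the paper's claim that a polynomial-based self-attention layer mitigates oversmoothing. As orientation only, below is a minimal PyTorch sketch of the general idea: the softmax attention matrix A is replaced by a degree-k matrix polynomial of A applied to the values. The class name, the monomial basis, and the learnable-coefficient parametrization are assumptions for illustration, not the authors' exact layer, whose polynomial type and order k are tuned hyperparameters per Table 7.

import torch
import torch.nn as nn
import torch.nn.functional as F

class PolySelfAttention(nn.Module):
    """Hypothetical sketch: multi-head self-attention whose attention
    matrix A is replaced by P(A) = sum_{i=0}^{k} c_i A^i with learnable
    coefficients c_i. Basis and parametrization are assumptions, not
    the paper's exact layer."""

    def __init__(self, dim, n_heads=4, k=3):
        super().__init__()
        assert dim % n_heads == 0
        self.h, self.d = n_heads, dim // n_heads
        self.qkv = nn.Linear(dim, 3 * dim)
        self.out = nn.Linear(dim, dim)
        # one learnable coefficient per polynomial order, shared across heads
        self.coef = nn.Parameter(torch.ones(k + 1) / (k + 1))
        self.k = k

    def forward(self, x):  # x: (batch, tokens, dim)
        B, N, _ = x.shape
        q, k_, v = self.qkv(x).chunk(3, dim=-1)
        q = q.view(B, N, self.h, self.d).transpose(1, 2)    # (B, h, N, d)
        k_ = k_.view(B, N, self.h, self.d).transpose(1, 2)
        v = v.view(B, N, self.h, self.d).transpose(1, 2)
        A = F.softmax(q @ k_.transpose(-2, -1) / self.d ** 0.5, dim=-1)
        # accumulate c_0 V + c_1 A V + ... + c_k A^k V,
        # applying A repeatedly so A^i is never materialized
        out = self.coef[0] * v
        Ai_v = v
        for i in range(1, self.k + 1):
            Ai_v = A @ Ai_v
            out = out + self.coef[i] * Ai_v
        out = out.transpose(1, 2).reshape(B, N, -1)
        return self.out(out)

Intuitively, the identity term c_0 V preserves each token's own representation while higher-order terms mix neighbors, which is one plausible reading of how such a polynomial counteracts oversmoothing; a different polynomial basis (e.g., a Chebyshev-style recurrence) would slot into the same loop.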
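For the Experiment Setup row, the eight tuned hyperparameters named above could be organized as a search space like the following sketch. The key names and candidate values are hypothetical placeholders; the actual best values are those reported in Tables 7-9 of the paper.

# Hypothetical search space mirroring the 8 hyperparameters named above;
# keys and candidate values are illustrative, not Tables 7-9 of the paper.
search_space = {
    "depth": [2, 4, 6],                      # depth of the Transformer
    "embed_dim": [32, 64, 128],              # embedding dimension
    "lr": [1e-4, 5e-4, 1e-3],                # learning rate
    "n_heads": [4, 8],                       # number of attention heads
    "weight_decay": [0.0, 1e-5, 1e-4],       # weight decay value
    "mlp_hidden_dim": [64, 128, 256],        # hidden dimension of the MLP layer
    "poly_type": ["monomial", "chebyshev"],  # polynomial type (illustrative names)
    "k": [2, 3, 5],                          # polynomial order k
}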