SheetPT: Spreadsheet Pre-training Based on Hierarchical Attention Network
Authors: Ran Jia, Qiyu Li, Zihan Xu, Xiaoyuan Jin, Lun Du, Haoyu Dong, Xiao Lv, Shi Han, Dongmei Zhang
AAAI 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | "Experiments: In this section, we first describe the pre-training setup and then evaluate SHEETPT on two downstream tasks: formula prediction and sheet structure recognition." and "Ablation Studies: To verify the effectiveness of our design, we conduct comprehensive ablation studies on different components of SHEETPT and the pre-training objectives. Table 4 and Table 5 (left) show the ablation results of pre-training objectives on formula prediction and sheet structure recognition." |
| Researcher Affiliation | Collaboration | Ran Jia¹, Qiyu Li², Zihan Xu², Xiaoyuan Jin², Lun Du¹, Haoyu Dong¹, Xiao Lv¹, Shi Han¹, Dongmei Zhang¹; ¹Microsoft Research Asia, ²Peking University. {raji, lun.du, hadong, xilv, shihan, dongmeiz}@microsoft.com, {liqiyu0728, xzhpku, jinxy}@pku.edu.cn |
| Pseudocode | No | The paper describes its model architecture and components but does not provide any structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not explicitly state that the source code for SHEETPT is publicly available, nor does it provide a link to a repository for its own implementation. |
| Open Datasets | Yes | "Dataset. We follow SpreadsheetCoder (Chen et al. 2021) and ForTap (Cheng et al. 2021) to use Enron (Hermans and Murphy-Hill 2015) as the dataset for formula prediction." ... "Dataset. We employ DeEx (Koci et al. 2019a), a widely used dataset for cell type classification of tabular data." |
| Dataset Splits | Yes | "ForTap has published the training, validation, and test splits of the dataset." and "As for experiments on SheetSem, we split the dataset into 2,201 training sheets, 300 validation sheets and 300 test sheets." and "We conduct 5-fold cross-validation following the setup of ForTap on the DeEx dataset." (see the cross-validation sketch after the table) |
| Hardware Specification | Yes | SHEETPT is implemented in PyTorch and uses distributed pre-training on Tesla V100 GPUs. |
| Software Dependencies | No | The paper mentions 'PyTorch' but does not specify a version number. It also references 'DistilBERT' and 'Hugging Face' for baselines, but no specific version numbers for software dependencies are provided. |
| Experiment Setup | Yes | SHEETPT utilizes 6 layers of transformer encoder initialized with DistilBERT (Sanh et al. 2019) for processing the token sequence in a cell. The number of layers of multi-grained hierarchical attention is also 6. ... We set the maximum number of rows and columns to 200 and 25... The batch size in both phases is 8. (a configuration sketch follows the table) |
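The Dataset Splits row reports 5-fold cross-validation on DeEx following ForTap's setup. The paper gives no further split details, so the following is only a minimal sketch of such a protocol using scikit-learn's `KFold`; the sheet IDs, corpus size, and random seed are placeholders, not values from the paper.

```python
# Hypothetical 5-fold split for the DeEx cell-type-classification sheets.
# The paper only states that it follows ForTap's 5-fold cross-validation
# setup; everything concrete below (IDs, size, seed) is illustrative.
from sklearn.model_selection import KFold

sheet_ids = [f"deex_sheet_{i}" for i in range(1000)]  # placeholder corpus

kfold = KFold(n_splits=5, shuffle=True, random_state=0)
for fold, (train_idx, test_idx) in enumerate(kfold.split(sheet_ids)):
    train = [sheet_ids[i] for i in train_idx]
    test = [sheet_ids[i] for i in test_idx]
    print(f"fold {fold}: {len(train)} train sheets, {len(test)} test sheets")
```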
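To make the Experiment Setup row concrete, here is a minimal sketch of the reported configuration, assuming Hugging Face Transformers and PyTorch. Only the quoted numbers (a 6-layer cell encoder initialized from DistilBERT, 6 layers of hierarchical attention, a 200 x 25 grid limit, batch size 8) come from the paper; the class name `SheetPTSketch`, the plain `nn.TransformerEncoder` standing in for the multi-grained hierarchical attention, and all wiring are assumptions for illustration.

```python
# A minimal sketch of the reported SheetPT setup, NOT the authors' code.
import torch.nn as nn
from transformers import DistilBertModel

class SheetPTSketch(nn.Module):
    def __init__(self, max_rows=200, max_cols=25, num_hier_layers=6):
        super().__init__()
        # 6-layer transformer encoder initialized with DistilBERT weights,
        # used to encode the token sequence inside each cell (per the paper).
        self.cell_encoder = DistilBertModel.from_pretrained(
            "distilbert-base-uncased")
        hidden = self.cell_encoder.config.dim  # 768 for DistilBERT
        # Stand-in for the 6-layer multi-grained hierarchical attention;
        # the real design attends at multiple granularities over the sheet.
        layer = nn.TransformerEncoderLayer(
            d_model=hidden, nhead=12, batch_first=True)
        self.hier_attention = nn.TransformerEncoder(
            layer, num_layers=num_hier_layers)
        self.max_rows, self.max_cols = max_rows, max_cols  # 200 x 25 limit

    def forward(self, input_ids, attention_mask):
        # input_ids: (num_cells, tokens_per_cell) for one truncated sheet.
        out = self.cell_encoder(input_ids, attention_mask=attention_mask)
        cell_vecs = out.last_hidden_state[:, 0]         # one vector per cell
        return self.hier_attention(cell_vecs.unsqueeze(0))  # attend over cells

model = SheetPTSketch()
batch_size = 8  # reported batch size in both pre-training phases
```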