SADGA: Structure-Aware Dual Graph Aggregation Network for Text-to-SQL
Authors: Ruichu Cai, Jinjie Yuan, Boyan Xu, Zhifeng Hao
NeurIPS 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We conduct extensive experiments to study the effectiveness of SADGA. In particular, SADGA outperforms the baseline methods and achieves 3rd place on the challenging Text-to-SQL benchmark Spider [34] at the time of writing. |
| Researcher Affiliation | Academia | 1 School of Computer Science, Guangdong University of Technology, Guangzhou, China 2 Peng Cheng Laboratory, Shenzhen, China 3 College of Science, Shantou University, Shantou, China |
| Pseudocode | No | The paper describes methods in text and uses figures for illustration, but no explicit pseudocode or algorithm blocks are present. |
| Open Source Code | Yes | Our implementation will be open-sourced at https://github.com/DMIRLAB-Group/SADGA. |
| Open Datasets | Yes | In this section, we conduct experiments on the Spider dataset [34], the benchmark of cross-domain Text-to-SQL, to evaluate the effectiveness of our model. |
| Dataset Splits | Yes | Spider has so far been the most challenging benchmark for cross-domain Text-to-SQL; it contains 9 previous domain-specific datasets, such as ATIS [8], GeoQuery [36], WikiSQL [1], IMDB [30], etc. It is split into a training set (8659 examples), a development set (1034 examples) and a test set (2147 examples), which are distributed across 146, 20 and 40 databases, respectively. |
| Hardware Specification | Yes | We trained our models on one server with a single NVIDIA RTX 3090 GPU. |
| Software Dependencies | No | The paper mentions optimizers (Adam) and models (BERT, GAP) but does not provide specific version numbers for software dependencies like Python, PyTorch, TensorFlow, or CUDA. |
| Experiment Setup | Yes | We follow the original hyperparameters of RATSQL [27], which use batch size 20, an initial learning rate of 7 × 10^−4, a maximum of 40,000 training steps and the Adam optimizer [16]. For BERT, the initial learning rate is adjusted to 2 × 10^−4 and the maximum number of training steps is increased to 90,000; a separate learning rate of 3 × 10^−6 is applied to fine-tune BERT. For GAP, we follow the original settings in Shi et al. [25]. In addition, we stack a 3-layer SADGA followed by a 4-layer RAT. (These settings are collected in the configuration sketch after the table.) |
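
For quick reference, the training settings quoted in the "Experiment Setup" row are restated below as a minimal configuration sketch. The key names (e.g. `bert_finetune_lr`, `num_sadga_layers`) are illustrative assumptions, not identifiers from the SADGA repository; the values simply repeat the numbers reported in the paper.

```python
import json

# Sketch of the reported training configuration (key names are assumptions,
# values restate the numbers quoted from the paper).
TRAIN_CONFIG = {
    "optimizer": "adam",
    "batch_size": 20,
    "learning_rate": 7e-4,          # initial learning rate, following RATSQL
    "max_steps": 40_000,
    # BERT-augmented variant: overrides applied on top of the base settings
    "bert": {
        "learning_rate": 2e-4,      # adjusted initial learning rate
        "max_steps": 90_000,
        "bert_finetune_lr": 3e-6,   # separate learning rate for fine-tuning BERT
    },
    # Encoder stacking: 3-layer SADGA followed by 4-layer RAT
    "num_sadga_layers": 3,
    "num_rat_layers": 4,
}

if __name__ == "__main__":
    print(json.dumps(TRAIN_CONFIG, indent=2))
```

Keeping the BERT-specific values in a nested block mirrors how the paper describes them: adjustments made on top of the base RATSQL hyperparameters rather than a separate configuration.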