Leveraging Table Content for Zero-shot Text-to-SQL with Meta-Learning
Authors: Yongrui Chen, Xinnan Guo, Chaojie Wang, Jian Qiu, Guilin Qi, Meng Wang, Huiying Li
AAAI 2021 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We conduct extensive experiments on a public open-domain text-to-SQL dataset WikiSQL and a domain-specific dataset ESQL. Compared to existing approaches using the same pre-trained model, our approach achieves significant improvements on both datasets. Compared to the larger pre-trained model and the tabular-specific pre-trained model, our approach is still competitive. More importantly, on the zero-shot subsets of both the datasets, our approach further increases the improvements. |
| Researcher Affiliation | Collaboration | 1) School of Computer Science and Engineering, Southeast University, Nanjing, China; 2) Alibaba Group |
| Pseudocode | Yes | Algorithm 1 Zero-Shot Meta-Learning Framework (a hedged sketch of this training loop is given after the table) |
| Open Source Code | Yes | Due to commercial secrets, we first desensitize the original dataset and then release it and all the codes of MC-SQL on https://github.com/qjay612/meta_learning_NL2SQL, and all the results in this paper are obtained from the desensitized version. |
| Open Datasets | Yes | WikiSQL (Zhong, Xiong, and Socher 2017) is an English open-domain text-to-SQL benchmark... ESQL is a Chinese domain-specific text-to-SQL dataset built by ourselves. ...release it and all the codes of MC-SQL on https://github.com/qjay612/meta_learning_NL2SQL |
| Dataset Splits | Yes | The data set is divided into 56,355 training questions, 8,421 development questions, and 15,878 test questions. ... The dataset is divided into 10,000 training questions, 1,000 development questions, and 2,000 test questions. |
| Hardware Specification | Yes | We perform all the experiments on NVIDIA Tesla V100 GPU. |
| Software Dependencies | No | The paper mentions 'all the BERT models are of base version' but does not specify a precise version number for BERT or any other software dependency. |
| Experiment Setup | Yes | The following hyperparameters are tuned on development sets: (1) Filtering threshold σ is set to 0.9 for both datasets. (2) The layer number of all BiLSTMs is set to 2. (3) The hidden state size d is set to 100. (4) The character embedding size d_e is set to 128. (5) The type embedding size d_t is set to 32. (6) The number of sampling tasks is set to 10,000 for WikiSQL and 2,500 for ESQL. (7) For WikiSQL, both N and K in the N-way K-shot setting are set to 4; for ESQL, they are set to 1 and 4, respectively. (8) γ in Algorithm 1 is set to 0.3 for WikiSQL and 0.5 for ESQL. (9) For α and β in Algorithm 1, BERT and the sub-modules are trained with two different learning rates: α_BERT is set to 1 × 10^-5 and α_sub to 1 × 10^-3; similarly, β_BERT is set to 1 × 10^-5 and β_sub to 1 × 10^-3. (A configuration sketch of these values follows the table.) |
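
To make the pseudocode row concrete, below is a minimal sketch of one zero-shot meta-learning step, assuming a first-order MAML-style update in PyTorch. The helpers `sample_task` and `loss_fn`, the single inner learning rate `alpha`, and the externally supplied `outer_opt` are illustrative placeholders, not the authors' implementation; the actual procedure, including the γ-controlled task construction, is defined by Algorithm 1 in the paper and the released code.

```python
import copy
import torch

def meta_train_step(model, outer_opt, sample_task, loss_fn, alpha=1e-3):
    """One first-order MAML-style step (sketch): adapt a copy of the model on
    the support set, then update the original model from the query loss of the
    adapted copy. `sample_task` and `loss_fn` are hypothetical helpers."""
    support, query = sample_task()        # one N-way K-shot task
    learner = copy.deepcopy(model)        # temporary "fast" weights

    # Inner update (learning rate alpha) on the support set.
    inner_loss = loss_fn(learner, support)
    grads = torch.autograd.grad(inner_loss, list(learner.parameters()))
    with torch.no_grad():
        for p, g in zip(learner.parameters(), grads):
            p -= alpha * g

    # Outer update: the first-order approximation copies the query-set
    # gradients of the adapted copy back onto the original parameters.
    query_loss = loss_fn(learner, query)
    query_grads = torch.autograd.grad(query_loss, list(learner.parameters()))
    outer_opt.zero_grad()
    for p, g in zip(model.parameters(), query_grads):
        p.grad = g.detach()
    outer_opt.step()
    return query_loss.item()
```

In the paper's setting the inner and outer rates differ between BERT and the task-specific sub-modules; the configuration sketch below shows one way to express that split.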
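
The hyperparameters reported in the Experiment Setup row can be collected into a small configuration sketch. The optimizer choice (Adam) and the module names `model.bert` and `model.sub_modules` are assumptions for illustration; the paper reports only the learning-rate values, not the optimizer or module layout.

```python
import torch

# Reported hyperparameters, gathered for reference (values from the paper).
HPARAMS = {
    "filter_threshold_sigma": 0.9,                    # both datasets
    "bilstm_layers": 2,
    "hidden_size_d": 100,
    "char_emb_size_de": 128,
    "type_emb_size_dt": 32,
    "num_sampled_tasks": {"WikiSQL": 10_000, "ESQL": 2_500},
    "n_way_k_shot": {"WikiSQL": (4, 4), "ESQL": (1, 4)},
    "gamma": {"WikiSQL": 0.3, "ESQL": 0.5},
}

def build_optimizer(model, lr_bert=1e-5, lr_sub=1e-3):
    """Separate learning rates for BERT and the sub-modules, matching the
    reported alpha/beta split. Adam and the attribute names are assumptions."""
    return torch.optim.Adam([
        {"params": model.bert.parameters(), "lr": lr_bert},
        {"params": model.sub_modules.parameters(), "lr": lr_sub},
    ])
```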