Zero-Shot Text-to-SQL Learning with Auxiliary Task
Authors: Shuaichen Chang, Pengfei Liu, Yun Tang, Jing Huang, Xiaodong He, Bowen Zhou
AAAI 2020, pp. 7488-7495 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate our models on a large text-to-SQL dataset, WikiSQL. Compared to a strong baseline coarse-to-fine model, our models improve over the baseline by more than 3% absolute in accuracy on the whole dataset. More interestingly, on a zero-shot subset test of WikiSQL, our models achieve a 5% absolute accuracy gain over the baseline, clearly demonstrating its superior generalizability. |
| Researcher Affiliation | Collaboration | Shuaichen Chang¹, Pengfei Liu², Yun Tang³, Jing Huang³, Xiaodong He³, Bowen Zhou³ (¹The Ohio State University, ²Fudan University, ³JD.COM AI Research); chang.1692@osu.edu, pfliu14@fudan.edu.cn, {yun.tang, jing.huang, xiaodong.he, bowen.zhou}@jd.com |
| Pseudocode | No | No explicit pseudocode or algorithm blocks are provided. The paper includes mathematical formulations for CLS and PT functions, and a 'SQL Sketch' figure, but not structured pseudocode. |
| Open Source Code | Yes | Our code can be found at https://github.com/JD-AI-Research-Silicon-Valley/auxiliary-task-for-text-to-sql |
| Open Datasets | Yes | WikiSQL has over 20K tables and 80K questions corresponding to these tables. This dataset was designed for translating natural language questions to SQL queries using the corresponding table columns without access to the table content. |
| Dataset Splits | Yes | We split the test set based on the number of shots (the number of occurrences of a table in the training data). |
| Hardware Specification | No | No specific hardware details (e.g., GPU/CPU models, memory) are provided in the paper. |
| Software Dependencies | No | The paper mentions using "300-dim GloVe word embeddings" and a "BiLSTM sentence encoder" but does not provide specific version numbers for any software dependencies or libraries. |
| Experiment Setup | Yes | We use 300-dim GloVe word embeddings as our pre-trained embeddings. The hidden size for all LSTMs is 250 and the hidden size in the attention function is set to 64. The loss weight λ is set to 0.5. A dropout layer with rate 0.5 is used before each output layer. Each concatenation is followed by one fully-connected layer to reduce the dimension to the original hidden or attention size. The test model is selected as the best-performing model on the validation set. |
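
The "Dataset Splits" row describes bucketing test questions by how many times their table appears in the training data (zero occurrences being the zero-shot subset). Below is a minimal sketch of that shot counting, assuming WikiSQL-style JSONL files with a `table_id` field on each example; the file names and field names are assumptions, not the authors' code.

```python
# Sketch: bucket test questions by "shots", i.e. how often the question's
# table occurs in the training data. Zero shots => zero-shot subset.
import json
from collections import Counter

def load_table_ids(path):
    """Return the table_id of every example in a WikiSQL-style JSONL file."""
    with open(path) as f:
        return [json.loads(line)["table_id"] for line in f]

train_counts = Counter(load_table_ids("train.jsonl"))  # table_id -> occurrences

buckets = {}  # number of shots -> list of test examples
with open("test.jsonl") as f:
    for line in f:
        example = json.loads(line)
        shots = train_counts.get(example["table_id"], 0)  # 0 => unseen table
        buckets.setdefault(shots, []).append(example)

print(f"zero-shot test questions: {len(buckets.get(0, []))}")
```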
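The "Experiment Setup" row pins down the reported hyperparameters. The PyTorch-style sketch below shows one plausible wiring of those values, including a joint objective where the auxiliary-task loss is weighted by λ = 0.5; the module and loss names are illustrative placeholders, not the paper's actual implementation.

```python
# Sketch of the reported hyperparameters; QuestionEncoder and joint_loss are
# placeholders standing in for the coarse-to-fine model plus auxiliary task.
import torch.nn as nn

EMBED_DIM = 300      # 300-dim pre-trained GloVe embeddings
HIDDEN_SIZE = 250    # hidden size for all LSTMs
ATTN_SIZE = 64       # hidden size inside the attention function
LAMBDA = 0.5         # loss weight for the auxiliary task
DROPOUT = 0.5        # dropout rate before each output layer

class QuestionEncoder(nn.Module):
    """BiLSTM sentence encoder over GloVe embeddings (illustrative only)."""
    def __init__(self, vocab_size):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, EMBED_DIM)
        self.lstm = nn.LSTM(EMBED_DIM, HIDDEN_SIZE,
                            bidirectional=True, batch_first=True)
        # Concatenated forward/backward states are projected back to the
        # hidden size, mirroring "each concatenation is followed by one
        # fully-connected layer to reduce the dimension".
        self.proj = nn.Linear(2 * HIDDEN_SIZE, HIDDEN_SIZE)
        self.dropout = nn.Dropout(DROPOUT)

    def forward(self, token_ids):
        states, _ = self.lstm(self.embed(token_ids))
        return self.dropout(self.proj(states))

def joint_loss(main_loss, aux_loss, lam=LAMBDA):
    """Main text-to-SQL loss plus the auxiliary-task loss weighted by λ."""
    return main_loss + lam * aux_loss
```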