Zero-Shot Text-to-SQL Learning with Auxiliary Task

Authors: Shuaichen Chang, Pengfei Liu, Yun Tang, Jing Huang, Xiaodong He, Bowen Zhou

AAAI 2020, pp. 7488-7495

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experimentally, we evaluate our models on a large text-to-SQL dataset, WikiSQL. Compared to a strong baseline coarse-to-fine model, our models improve over the baseline by more than 3% absolute in accuracy on the whole dataset. More interestingly, on a zero-shot subset test of WikiSQL, our models achieve a 5% absolute accuracy gain over the baseline, clearly demonstrating their superior generalizability.
Researcher Affiliation | Collaboration | Shuaichen Chang (1), Pengfei Liu (2), Yun Tang (3), Jing Huang (3), Xiaodong He (3), Bowen Zhou (3); affiliations: 1 The Ohio State University, 2 Fudan University, 3 JD.COM AI Research; emails: chang.1692@osu.edu, pfliu14@fudan.edu.cn, {yun.tang, jing.huang, xiaodong.he, bowen.zhou}@jd.com
Pseudocode | No | No explicit pseudocode or algorithm blocks are provided. The paper includes mathematical formulations for the CLS and PT functions and a 'SQL Sketch' figure, but no structured pseudocode.
Open Source Code | Yes | Our code can be found at https://github.com/JD-AI-Research-Silicon-Valley/auxiliary-task-for-text-to-sql
Open Datasets | Yes | WikiSQL has over 20K tables and 80K questions corresponding to these tables. This dataset was designed for translating natural language questions to SQL queries using the corresponding table columns, without access to the table content.
Dataset Splits | Yes | We split the test set based on the number of shots, i.e., the number of times a table occurs in the training data (a splitting sketch is given after this table).
Hardware Specification | No | No specific hardware details (e.g., GPU/CPU models, memory) are provided in the paper.
Software Dependencies | No | The paper mentions a 300-dim GloVe word embedding and a BiLSTM sentence encoder but does not provide version numbers for any software dependencies or libraries.
Experiment Setup | Yes | We use 300-dim GloVe word embeddings as our pre-trained embeddings. The hidden size for all LSTMs is 250 and the hidden size in the attention function is set to 64. The loss weight λ is set to 0.5. A 0.5-rate dropout layer is used before each output layer. Each concatenation is followed by one fully-connected layer to reduce the dimension to the original hidden or attention size. The test model is selected as the best-performing model on the validation set (an encoder and loss-weighting sketch follows the table).
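The by-shot test split described in the Dataset Splits row can be reproduced with a few lines of Python. This is a minimal sketch, assuming WikiSQL-style examples stored as dicts with a "table_id" field; the function name and bucket boundaries are illustrative and not taken from the paper or the released code.

```python
from collections import Counter, defaultdict

def split_test_by_shots(train_examples, test_examples, buckets=(0, 1, 5, 10)):
    """Group test examples by how often their table appears in the training data.

    Assumes each example is a dict with a "table_id" key, as in the WikiSQL
    JSONL release. The bucket thresholds are illustrative, not the paper's
    exact split boundaries.
    """
    # Count how many training questions reference each table.
    table_counts = Counter(ex["table_id"] for ex in train_examples)

    subsets = defaultdict(list)
    for ex in test_examples:
        shots = table_counts.get(ex["table_id"], 0)
        # Assign the example to the largest bucket threshold it reaches;
        # shots == 0 is the zero-shot subset.
        key = max(b for b in buckets if shots >= b)
        subsets[key].append(ex)
    return subsets
```

The shots == 0 bucket corresponds to the zero-shot subset on which the paper reports the 5% absolute accuracy gain over the baseline.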
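The hyperparameters listed in the Experiment Setup row can be gathered into a small PyTorch sketch. This is not the authors' implementation (see the linked repository for that); the class and function names are illustrative, and the joint objective is shown only as the λ-weighted sum implied by the reported loss weight of 0.5.

```python
import torch
import torch.nn as nn

class QuestionEncoder(nn.Module):
    """BiLSTM question encoder matching the reported settings: 300-dim GloVe
    inputs, LSTM hidden size 250, a projection back to the original hidden
    size after concatenation, and 0.5 dropout before the output layer.
    Names and structure are illustrative, not copied from the released code."""

    def __init__(self, pretrained_glove, hidden_size=250, dropout=0.5):
        super().__init__()
        # pretrained_glove: FloatTensor of shape (vocab_size, 300).
        self.embedding = nn.Embedding.from_pretrained(pretrained_glove, freeze=False)
        self.bilstm = nn.LSTM(input_size=300, hidden_size=hidden_size,
                              batch_first=True, bidirectional=True)
        # The concatenated forward/backward states are reduced back to hidden_size
        # by a single fully-connected layer, as described in the setup.
        self.project = nn.Linear(2 * hidden_size, hidden_size)
        self.dropout = nn.Dropout(dropout)

    def forward(self, token_ids):
        hidden, _ = self.bilstm(self.embedding(token_ids))
        return self.dropout(self.project(hidden))


def combined_loss(main_loss, auxiliary_loss, lam=0.5):
    """Weight the auxiliary task by lambda = 0.5, per the reported setup.
    The exact combination used in the paper may differ."""
    return main_loss + lam * auxiliary_loss
```

The attention hidden size of 64 and the per-output dropout layers would be applied in the downstream prediction modules, which are omitted here to keep the sketch minimal.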