reproducibilityindex.ai

MultiSpider: Towards Benchmarking Multilingual Text-to-SQL Semantic Parsing

Authors: Longxu Dou, Yan Gao, Mingyang Pan, Dingzirui Wang, Wanxiang Che, Dechen Zhan, Jian-Guang Lou

AAAI 2023

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Experimental results under three typical settings (zero-shot, monolingual and multilingual) reveal a 6.1% absolute drop in accuracy in non-English languages. Qualitative and quantitative analyses are conducted to understand the reason for the performance drop of each language.
Researcher Affiliation	Collaboration	Longxu Dou¹, Yan Gao², Mingyang Pan¹, Dingzirui Wang¹, Wanxiang Che¹, Dechen Zhan¹, Jian-Guang Lou²; ¹Harbin Institute of Technology, ²Microsoft Research Asia
Pseudocode	No	The paper describes methods in prose and flowcharts (Figure 4, 5) but does not include formal pseudocode or algorithm blocks.
Open Source Code	Yes	Code available at https://github.com/microsoft/ContextualSP
Open Datasets	Yes	We build MULTISPIDER based on Spider (Yu et al. 2018), a large-scale cross-database text-to-SQL dataset in English. We also collect data from CSpider (Min and Zhang 2019) and VSpider (Tuan Nguyen, Dao, and Nguyen 2020), which are also free and open text-to-SQL datasets.
Dataset Splits	Yes	Only 9691 questions and 5263 SQL queries over 166 databases (train-set and dev-set) are publicly available.
Hardware Specification	No	The paper does not explicitly provide details about the specific hardware (e.g., GPU models, CPU types) used for running the experiments.
Software Dependencies	No	The paper mentions specific models and frameworks (e.g., mBERT, XLM-Roberta-Large, mBART, RAT-SQL) with citations but does not provide specific version numbers for software dependencies or libraries used in their implementation.
Experiment Setup	Yes	Training with Augmented Data: During the training phase, we first adopt the augmented data to warm up the model for three epochs to alleviate the noise in augmented data, then fine-tune the model with original high-quality training data.
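
The two-stage schedule quoted in the Experiment Setup row can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: `build_schedule`, `train`, and `train_one_epoch` are hypothetical names, and the number of fine-tuning epochs is an assumed placeholder, since the quoted passage states only the three warm-up epochs.

```python
from typing import Callable, List, Sequence, Tuple


def build_schedule(warmup_epochs: int = 3, finetune_epochs: int = 2) -> List[str]:
    """Return the data source to use in each epoch, in order.

    warmup_epochs=3 follows the quoted setup; finetune_epochs=2 is an
    assumed placeholder (the quoted passage does not give this number).
    """
    return ["augmented"] * warmup_epochs + ["original"] * finetune_epochs


def train(
    augmented: Sequence,
    original: Sequence,
    train_one_epoch: Callable[[Sequence], None],
    warmup_epochs: int = 3,
    finetune_epochs: int = 2,
) -> List[Tuple[int, str]]:
    """Warm up on augmented data, then fine-tune on the original data.

    train_one_epoch is a hypothetical callback that runs one pass of
    training over the examples it is given.
    """
    sources = {"augmented": augmented, "original": original}
    log = []
    schedule = build_schedule(warmup_epochs, finetune_epochs)
    for epoch, name in enumerate(schedule, start=1):
        train_one_epoch(sources[name])
        log.append((epoch, name))
    return log
```

A caller would pass its own per-epoch training function as `train_one_epoch`; the returned log records which data source each epoch used, which makes the switch from warm-up to fine-tuning easy to verify.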