Structured Case-Based Reasoning for Inference-Time Adaptation of Text-to-SQL Parsers

Authors: Abhijeet Awasthi, Soumen Chakrabarti, Sunita Sarawagi

AAAI 2023 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We evaluate StructCBR for adapting a Text-to-SQL model to five different target schemas without finetuning. The target schemas are chosen from varying domains. We compare StructCBR with prior inference-time adaptation methods discussed in §4, and present an ablation study. We also show that StructCBR enables much faster adaptation of Text-to-SQL models in comparison to finetuning.
Researcher Affiliation | Academia | Department of Computer Science and Engineering, Indian Institute of Technology Bombay, Mumbai, India. {awasthi,soumen,sunita}@cse.iitb.ac.in
Pseudocode | Yes | Algorithm 1 presents the high-level pseudocode of StructCBR with additions to the SmBoP model.
Open Source Code | Yes | Code: https://github.com/awasthiabhijeet/structcbr
Open Datasets | Yes | Datasets: We utilize Spider (Yu et al. 2018), which is a collection of Text-to-SQL examples covering 200 unique schemas. We use the train split of Spider as Dtrain for training all the models.
Dataset Splits | Yes | The remaining part of the dev set, containing 576 examples, is used for model selection while training on Dtrain.
Hardware Specification | No | The paper mentions "Due to limited computing resources" but does not provide specific hardware details such as GPU/CPU models or memory.
Software Dependencies | No | The paper mentions using the "AllenNLP (Gardner et al. 2018) and Transformers (Wolf et al. 2020) libraries" and initializing the text encoder with a "ROBERTA-BASE checkpoint", but it does not specify version numbers for these software components.
Experiment Setup | Yes | All other hyper-parameters are set to their default values. The SmBoP model is trained on Dtrain for 60K steps with a batch size of 80, using the default learning rate (LR) of 1.86 × 10−4. The GTM baseline utilizes the output of this model for memory look-ups. For the ConcatCBR baseline we train the SmBoP model further for 60K steps with an LR of 5 × 10−5, while concatenating the retrieved cases in the encoder’s input. StructCBR introduces 2.53% additional parameters (ϕ) over the SmBoP parameters (θ). We train the parameters ϕ on Dtrain using a batch size of 64 for 60K steps with the default LR of 1.86 × 10−4. Additional training details are provided in the Appendix.
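
For quick reference, the training settings quoted in the Experiment Setup row can be consolidated into a small configuration sketch. This is an illustrative summary only: the TRAINING_CONFIGS dictionary, its field names (steps, batch_size, lr), and the describe helper are hypothetical and are not taken from the StructCBR repository.

# Hypothetical consolidation of the training settings reported above.
# Field names and structure are illustrative; consult the official repository
# (https://github.com/awasthiabhijeet/structcbr) for the actual configuration files.

TRAINING_CONFIGS = {
    # Base SmBoP parser trained on the Spider train split (Dtrain).
    "smbop_base": {"steps": 60_000, "batch_size": 80, "lr": 1.86e-4},
    # ConcatCBR baseline: SmBoP trained further with retrieved cases
    # concatenated to the encoder input (batch size not stated in the excerpt).
    "concat_cbr": {"steps": 60_000, "batch_size": None, "lr": 5e-5},
    # StructCBR: the additional parameters phi (about 2.53% of the SmBoP
    # parameters theta) are trained on Dtrain.
    "struct_cbr": {"steps": 60_000, "batch_size": 64, "lr": 1.86e-4},
}

def describe(name: str) -> str:
    """Return a one-line summary of a named training configuration."""
    cfg = TRAINING_CONFIGS[name]
    return f"{name}: {cfg['steps']} steps, batch {cfg['batch_size']}, lr {cfg['lr']}"

if __name__ == "__main__":
    for name in TRAINING_CONFIGS:
        print(describe(name))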