Structured Case-Based Reasoning for Inference-Time Adaptation of Text-to-SQL Parsers
Authors: Abhijeet Awasthi, Soumen Chakrabarti, Sunita Sarawagi
AAAI 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate StructCBR for adapting a Text-to-SQL model to five different target schemas without finetuning. The target schemas are chosen from varying domains. We compare StructCBR with prior inference-time adaptation methods discussed in §4, and present an ablation study. We also show that StructCBR enables much faster adaptation of Text-to-SQL models in comparison to finetuning. |
| Researcher Affiliation | Academia | Department of Computer Science and Engineering, Indian Institute of Technology Bombay, Mumbai, India. {awasthi,soumen,sunita}@cse.iitb.ac.in |
| Pseudocode | Yes | Algorithm 1 presents the high-level pseudocode of StructCBR with additions to the SmBoP model. |
| Open Source Code | Yes | Code: https://github.com/awasthiabhijeet/structcbr |
| Open Datasets | Yes | Datasets: We utilize Spider (Yu et al. 2018), which is a collection of Text-to-SQL examples covering 200 unique schemas. We use the train split of Spider as D_train for training all the models. |
| Dataset Splits | Yes | The remaining part of the dev set, containing 576 examples, is used for model selection while training on D_train. |
| Hardware Specification | No | The paper mentions "Due to limited computing resources" but does not provide specific hardware details such as GPU/CPU models or memory. |
| Software Dependencies | No | The paper mentions using "AllenNLP (Gardner et al. 2018) and Transformers (Wolf et al. 2020) libraries" and initializing the text encoder with a "RoBERTa-base checkpoint," but it does not specify version numbers for these software components. |
| Experiment Setup | Yes | All other hyper-parameters are set to their default values. The SmBoP model is trained on D_train for 60K steps with a batch size of 80, using the default learning rate (LR) of 1.86 × 10⁻⁴. The GTM baseline utilizes the output of this model for memory look-ups. For the ConcatCBR baseline we train the SmBoP model further for 60K steps with an LR of 5 × 10⁻⁵, while concatenating the retrieved cases in the encoder's input. StructCBR introduces 2.53% additional parameters (ϕ) over the SmBoP parameters (θ). We train the parameters ϕ on D_train using a batch size of 64 for 60K steps with the default LR of 1.86 × 10⁻⁴. Additional training details are provided in the Appendix. (An illustrative summary of these settings appears after the table.) |
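The hyper-parameters quoted in the Experiment Setup row can be collected into a small configuration sketch. This is an illustrative reconstruction, not code from the StructCBR repository: the dictionary names and the `extra_param_fraction` helper are hypothetical, and only the numeric values (steps, batch sizes, learning rates, and the 2.53% parameter overhead) come from the paper.

```python
# Illustrative summary of the reported training settings.
# All identifiers are hypothetical; only the numeric values are quoted from the paper.

SMBOP_BASE = {   # base SmBoP model trained on the Spider train split (D_train)
    "steps": 60_000,
    "batch_size": 80,
    "learning_rate": 1.86e-4,         # default LR
}

CONCAT_CBR = {   # ConcatCBR baseline: SmBoP trained further with retrieved cases
    "steps": 60_000,                  # concatenated in the encoder's input
    "learning_rate": 5e-5,
}

STRUCT_CBR = {   # StructCBR: only the added parameters (phi) are trained on D_train
    "steps": 60_000,
    "batch_size": 64,
    "learning_rate": 1.86e-4,         # default LR
    "extra_params_vs_smbop": 0.0253,  # phi adds 2.53% parameters over theta
}

def extra_param_fraction(num_phi: int, num_theta: int) -> float:
    """Fraction of additional parameters that phi introduces over theta."""
    return num_phi / num_theta

# Worked example with made-up sizes: a hypothetical 100M-parameter base model
# with a 2.53% overhead would carry about 2.53M additional StructCBR parameters.
print(extra_param_fraction(num_phi=2_530_000, num_theta=100_000_000))  # ~0.0253
```

The sketch only restates the table's numbers in one place; it does not imply anything about the actual SmBoP parameter count, which the row does not report.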