Conditional Tree Matching for Inference-Time Adaptation of Tree Prediction Models
Authors: Harshit Varma, Abhijeet Awasthi, Sunita Sarawagi
ICML 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We now present an empirical evaluation of CTREEOT both in terms of the quality of our proposed conditional tree matching score CTS and running time. We evaluate the quality of CTS by deploying it for inference-time adaptation of a real-life task of converting text utterances to SQL represented as an abstract relational tree. |
| Researcher Affiliation | Academia | Department of Computer Science and Engineering, Indian Institute of Technology (IIT) Bombay. Correspondence to: Harshit Varma, Sunita Sarawagi <{harshitvarma, sunita}@cse.iitb.ac.in>. |
| Pseudocode | Yes | Algorithm 1 Tensorized CTREEOT |
| Open Source Code | Yes | The code for CTREEOT has been open-sourced: https://github.com/hrshtv/CTreeOT |
| Open Datasets | Yes | Datasets: We adapt a Text-to-SQL model to five different target schemas from the SPIDER dataset (Yu et al., 2018) without finetuning. |
| Dataset Splits | Yes | For training, we use SPIDER's train split containing 7000 Text-to-SQL examples from 140 schemas. For evaluation, we follow Awasthi et al. (2023) and use examples from the following five schemas from SPIDER's development set: {world_1, car_1, cre_Doc_Template_Mgt, dog_kennels, flight_2}. Examples from these schemas are excluded from the training and validation splits. The remaining 576 examples from SPIDER's development set are used for validation. (A split-construction sketch follows the table.) |
| Hardware Specification | Yes | These experiments were performed on a single NVIDIA RTX A6000 GPU and the algorithms were implemented in PyTorch. |
| Software Dependencies | No | These experiments were performed on a single NVIDIA RTX A6000 GPU and the algorithms were implemented in PyTorch. |
| Experiment Setup | Yes | We use ϵ = 10⁻³ and λ = 1. The beam is of size 30 and serves as our set of candidate trees Y_x. Our relevance transformer consists of four transformer blocks with a fully-connected layer at the end to predict the scores. A single block is a stack of self-attention (8 heads), feedforward, and layer normalization layers. We keep the batch size as a multiple of the number of cases, and design the batches such that for a candidate tree, the remaining \|C\| − 1 examples are from the same schema and act as cases. Our relevance transformer achieves an average F1 score of 77.1 on the validation split after being trained for 75 epochs. (An architecture sketch follows the table.) |
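The Dataset Splits row above specifies how the SPIDER train and development sets are partitioned around the five held-out target schemas. Below is a minimal sketch of that split construction, assuming the public SPIDER JSON release (the `db_id` field is standard in that release; the file paths and helper name are illustrative, not taken from the paper's code):

```python
# Sketch of the split construction described in the Dataset Splits row:
# train on SPIDER's 7000-example train split, evaluate on five held-out
# development-set schemas, and use the remaining dev examples (576 in the
# paper) for validation. Paths are hypothetical.
import json

TARGET_SCHEMAS = {"world_1", "car_1", "cre_Doc_Template_Mgt", "dog_kennels", "flight_2"}

def load_examples(path):
    with open(path) as f:
        return json.load(f)

train = load_examples("spider/train_spider.json")  # 7000 examples from 140 schemas
dev = load_examples("spider/dev.json")

# Target-schema examples are excluded from training and validation.
train_examples = [ex for ex in train if ex["db_id"] not in TARGET_SCHEMAS]
valid_examples = [ex for ex in dev if ex["db_id"] not in TARGET_SCHEMAS]

# Evaluation uses only the five held-out target schemas.
eval_examples = [ex for ex in dev if ex["db_id"] in TARGET_SCHEMAS]
```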
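The Experiment Setup row describes the relevance transformer as four blocks of 8-head self-attention, feedforward, and layer-normalization layers, followed by a fully-connected layer that predicts scores. The following is a hedged PyTorch sketch of that architecture only; the hidden width, feedforward dimension, and input/output shapes are assumptions for illustration, not values reported in the paper:

```python
# Minimal sketch of the relevance transformer described in the Experiment Setup row.
# The block count (4), head count (8), and final fully-connected scoring layer come
# from the paper; d_model and d_ff are assumed values.
import torch
import torch.nn as nn

class RelevanceTransformer(nn.Module):
    def __init__(self, d_model: int = 256, n_heads: int = 8,
                 n_blocks: int = 4, d_ff: int = 1024):
        super().__init__()
        # One block = self-attention (8 heads), feedforward, and layer normalization,
        # matching the internal structure of nn.TransformerEncoderLayer.
        block = nn.TransformerEncoderLayer(
            d_model=d_model, nhead=n_heads, dim_feedforward=d_ff, batch_first=True
        )
        self.encoder = nn.TransformerEncoder(block, num_layers=n_blocks)
        # Fully-connected layer at the end to predict per-node relevance scores.
        self.scorer = nn.Linear(d_model, 1)

    def forward(self, node_embeddings: torch.Tensor) -> torch.Tensor:
        # node_embeddings: (batch, num_nodes, d_model) embeddings of candidate-tree nodes.
        h = self.encoder(node_embeddings)
        return self.scorer(h).squeeze(-1)  # (batch, num_nodes) relevance scores

# Usage sketch with dummy inputs:
# model = RelevanceTransformer()
# scores = model(torch.randn(4, 30, 256))
```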