Training Naturalized Semantic Parsers with Very Little Data
Authors: Subendhu Rongali, Konstantine Arkoudas, Melanie Rubino, Wael Hamza
IJCAI 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We show that this method delivers new SOTA few-shot performance on the Overnight dataset, particularly in very low-resource settings, and very compelling few-shot results on a new semantic parsing dataset. |
| Researcher Affiliation | Collaboration | Subendhu Rongali (1,2), Konstantine Arkoudas (2), Melanie Rubino (2), Wael Hamza (2) — (1) University of Massachusetts Amherst, (2) Amazon Alexa AI, New York; srongali@cs.umass.edu, {arkoudk, rubinome, waelhamz}@amazon.com |
| Pseudocode | No | No pseudocode or algorithm block is explicitly provided. Figure 1 is a diagram illustrating the joint training process, not pseudocode. |
| Open Source Code | Yes | Additional details, including the pizza canonicalization scheme, are provided in the appendix on our project page [2], along with our data files. [2] https://github.com/amazon-research/resource-constrained-naturalized-semantic-parsing |
| Open Datasets | Yes | Pizza is a recently introduced dataset consisting of English utterances that represent orders of pizzas and drinks. [1] https://github.com/amazon-research/pizza-semantic-parsing-dataset |
| Dataset Splits | No | The paper mentions using a 'dev set to choose example for low-resource training' for the Pizza dataset, but does not provide specific train/validation/test splits (percentages or counts) for reproducibility across all datasets used. |
| Hardware Specification | No | No specific hardware details (like GPU/CPU models, memory, or specific cloud instances with specs) are mentioned for the experimental setup. |
| Software Dependencies | No | The paper mentions using 'BART-Large' and 'Adam optimizer', but does not provide specific version numbers for software libraries or frameworks (e.g., PyTorch, TensorFlow, specific Python version) used to implement or run the models. |
| Experiment Setup | Yes | We train all our models with sequence cross entropy loss using the Adam optimizer with β1 = 0.9, β2 = 0.98, ϵ = 1e−9 and the Noam LR scheduler with 500 warmup steps and a learning rate scale factor of 0.15. JT models are trained for 10 epochs, while base models are trained for 100 to 1000 epochs on the low-resource data. We fix the batch size to 512 tokens for all models. We use dropout of 0.1 and freeze the encoder token and position embeddings during training. |
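
The experiment-setup details quoted above can be expressed as a training configuration. The sketch below is not the authors' code: it assumes PyTorch, the HuggingFace `BartForConditionalGeneration` implementation of BART-Large, and a standard Noam schedule; the names `noam_lambda`, `d_model`, `warmup`, and `scale` are illustrative, and `d_model = 1024` is an assumption based on BART-Large rather than a value stated in the paper.

```python
# Minimal sketch of the reported setup: Adam (beta1=0.9, beta2=0.98, eps=1e-9),
# Noam LR schedule with 500 warmup steps and scale factor 0.15, and frozen
# encoder token/position embeddings. Assumes PyTorch + HuggingFace Transformers.
import torch
from transformers import BartForConditionalGeneration

model = BartForConditionalGeneration.from_pretrained("facebook/bart-large")

# Freeze the encoder token and position embeddings, as described in the paper.
for emb in (model.model.encoder.embed_tokens, model.model.encoder.embed_positions):
    for p in emb.parameters():
        p.requires_grad = False

# Adam with the quoted hyperparameters; base lr of 1.0 so the Noam lambda
# below fully determines the effective learning rate.
optimizer = torch.optim.Adam(
    (p for p in model.parameters() if p.requires_grad),
    lr=1.0,
    betas=(0.9, 0.98),
    eps=1e-9,
)

# Noam schedule: scale * d_model^-0.5 * min(step^-0.5, step * warmup^-1.5).
d_model, warmup, scale = 1024, 500, 0.15  # d_model assumed for BART-Large

def noam_lambda(step: int) -> float:
    step = max(step, 1)
    return scale * (d_model ** -0.5) * min(step ** -0.5, step * warmup ** -1.5)

scheduler = torch.optim.lr_scheduler.LambdaLR(optimizer, lr_lambda=noam_lambda)
# In the training loop, call scheduler.step() after each optimizer.step().
```

Batching to 512 tokens, dropout of 0.1, and the 10-epoch (JT) versus 100 to 1000-epoch (base) training regimes reported in the paper would sit in the surrounding training loop, which is not reproduced here.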