Training Naturalized Semantic Parsers with Very Little Data

Authors: Subendhu Rongali, Konstantine Arkoudas, Melanie Rubino, Wael Hamza

IJCAI 2022

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | We show that this method delivers new SOTA few-shot performance on the Overnight dataset, particularly in very low-resource settings, and very compelling few-shot results on a new semantic parsing dataset. |
| Researcher Affiliation | Collaboration | Subendhu Rongali (1, 2), Konstantine Arkoudas (2), Melanie Rubino (2), Wael Hamza (2); 1: University of Massachusetts Amherst, 2: Amazon Alexa AI, New York; srongali@cs.umass.edu, {arkoudk, rubinome, waelhamz}@amazon.com |
| Pseudocode | No | No pseudocode or algorithm block is explicitly provided. Figure 1 is a diagram illustrating the joint training process, not pseudocode. |
| Open Source Code | Yes | Additional details, including the pizza canonicalization scheme, are provided in the appendix on our project page, along with our data files: https://github.com/amazon-research/resource-constrained-naturalized-semantic-parsing |
| Open Datasets | Yes | Pizza is a recently introduced dataset consisting of English utterances that represent orders of pizzas and drinks: https://github.com/amazon-research/pizza-semantic-parsing-dataset |
| Dataset Splits | No | The paper mentions using a 'dev set to choose example for low-resource training' for the Pizza dataset, but does not give specific train/validation/test splits (percentages or counts) for all of the datasets used, which limits reproducibility. |
| Hardware Specification | No | No specific hardware details (such as GPU/CPU models, memory, or cloud instance types) are given for the experimental setup. |
| Software Dependencies | No | The paper mentions using BART-Large and the Adam optimizer, but does not provide version numbers for the software libraries or frameworks (e.g., PyTorch, TensorFlow, or the Python version) used to implement and run the models. |
| Experiment Setup | Yes | We train all our models with sequence cross-entropy loss using the Adam optimizer with β1 = 0.9, β2 = 0.98, ϵ = 1e-9 and the Noam LR scheduler with 500 warmup steps and a learning rate scale factor of 0.15. JT models are trained for 10 epochs, while base models are trained for 100 to 1000 epochs on the low-resource data. We fix the batch size to 512 tokens for all models. We use dropout of 0.1 and freeze the encoder token and position embeddings during training. (A configuration sketch follows the table.) |
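The Experiment Setup row is concrete enough to sketch in code. Below is a minimal, hypothetical reconstruction of that configuration, assuming PyTorch and the HuggingFace `facebook/bart-large` checkpoint; the paper does not name its framework, and the Noam schedule here is the standard formula (scale · d_model^-0.5 · min(step^-0.5, step · warmup^-1.5)) rather than code taken from the authors' release.

```python
# Hedged sketch of the reported training setup: Adam (β1=0.9, β2=0.98, ϵ=1e-9),
# Noam LR schedule (500 warmup steps, scale 0.15), frozen encoder token and
# position embeddings. Library and model choices are assumptions, not the
# authors' actual implementation.
import torch
from torch.optim import Adam
from torch.optim.lr_scheduler import LambdaLR
from transformers import BartForConditionalGeneration

model = BartForConditionalGeneration.from_pretrained("facebook/bart-large")

# Freeze the encoder token and position embeddings, as described in the paper.
for param in model.model.encoder.embed_tokens.parameters():
    param.requires_grad = False
for param in model.model.encoder.embed_positions.parameters():
    param.requires_grad = False

# Adam hyperparameters quoted in the Experiment Setup row.
optimizer = Adam(
    [p for p in model.parameters() if p.requires_grad],
    lr=1.0,  # the Noam schedule below supplies the effective learning rate
    betas=(0.9, 0.98),
    eps=1e-9,
)

D_MODEL = model.config.d_model  # 1024 for BART-Large
WARMUP = 500
SCALE = 0.15

def noam_lr(step: int) -> float:
    """Noam schedule: SCALE * d_model^-0.5 * min(step^-0.5, step * WARMUP^-1.5)."""
    step = max(step, 1)  # avoid division by zero on the initial call
    return SCALE * (D_MODEL ** -0.5) * min(step ** -0.5, step * WARMUP ** -1.5)

scheduler = LambdaLR(optimizer, lr_lambda=noam_lr)

# Per training step (sequence cross-entropy loss over target tokens):
#   loss = model(**batch).loss
#   loss.backward(); optimizer.step(); scheduler.step(); optimizer.zero_grad()
```

Batching by token count (512 tokens per batch), dropout of 0.1, and the 10-epoch (JT) versus 100 to 1000-epoch (base) training lengths would sit in the data loader and training loop, which the paper does not detail.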