Learning to Recombine and Resample Data for Compositional Generalization

Authors: Ekin Akyürek, Afra Feyza Akyürek, Jacob Andreas

ICLR 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We evaluate R&R on two tests of compositional generalization: the SCAN instruction following task (Lake & Baroni, 2018) and a few-shot morphology learning task derived from the SIGMORPHON 2018 dataset (Kirov et al., 2018; Cotterell et al., 2018). Our experiments are designed to explore the effectiveness of learned data recombination procedures in controlled and natural settings.
Researcher Affiliation | Academia | Ekin Akyürek, MIT CSAIL, akyurek@mit.edu; Afra Feyza Akyürek, Boston University, akyurek@bu.edu; Jacob Andreas, MIT CSAIL, jda@mit.edu
Pseudocode | No | The paper does not contain any explicitly labeled "Pseudocode" or "Algorithm" blocks.
Open Source Code | Yes | Code for all experiments in this paper is available at https://github.com/ekinakyurek/compgen.
Open Datasets | Yes | We evaluate R&R on two tests of compositional generalization: the SCAN instruction following task (Lake & Baroni, 2018) and a few-shot morphology learning task derived from the SIGMORPHON 2018 dataset (Kirov et al., 2018; Cotterell et al., 2018).
Dataset Splits | Yes | We construct splits of the data featuring a training set of 1000 examples and three test sets of 100 examples. ... we construct five different splits per language and use the Spanish past-tense data as a development set.
Hardware Specification | Yes | We use a single 32GB NVIDIA V100 Volta GPU for each experiment.
Software Dependencies | Yes | We implemented our experiments in Knet (Yuret, 2016) using Julia (Bezanson et al., 2017).
Experiment Setup | Yes | Morphology: The hidden and embedding sizes are 1024. No dropout is applied. ... SCAN: We choose the hidden size as 512, and embedding size as 64. We apply 0.5 dropout to the input.
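
The hyperparameters quoted under Experiment Setup can be collected into a small configuration record, which makes the two setups easier to compare. The sketch below is illustrative only and is not taken from the authors' repository, which is written in Julia with Knet; Python is used here purely for illustration. The class name `ExperimentConfig`, its field names, and the `CONFIGS` dictionary are assumptions. Settings elided by "..." in the quoted text are left out rather than guessed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ExperimentConfig:
    """Hypothetical container for the hyperparameters quoted above."""
    task: str
    hidden_size: int
    embedding_size: int
    input_dropout: float  # dropout probability on the input; 0.0 means none


# Morphology: hidden and embedding sizes are 1024, no dropout is applied.
MORPHOLOGY = ExperimentConfig(
    task="morphology", hidden_size=1024, embedding_size=1024, input_dropout=0.0
)

# SCAN: hidden size 512, embedding size 64, 0.5 dropout on the input.
SCAN = ExperimentConfig(
    task="scan", hidden_size=512, embedding_size=64, input_dropout=0.5
)

# Look up a configuration by task name.
CONFIGS = {cfg.task: cfg for cfg in (MORPHOLOGY, SCAN)}
```

For example, `CONFIGS["scan"].embedding_size` retrieves the SCAN embedding size. The record is frozen so that a quoted setting cannot be changed by accident once it has been defined.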