Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).

Learning to Recombine and Resample Data For Compositional Generalization

Authors: Ekin Akyürek, Afra Feyza Akyürek, Jacob Andreas

ICLR 2021

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	We evaluate R&R on two tests of compositional generalization: the SCAN instruction following task (Lake & Baroni, 2018) and a few-shot morphology learning task derived from the SIGMORPHON 2018 dataset (Kirov et al., 2018; Cotterell et al., 2018). Our experiments are designed to explore the effectiveness of learned data recombination procedures in controlled and natural settings.
Researcher Affiliation	Academia	Ekin Akyürek (MIT CSAIL), Afra Feyza Akyürek (Boston University), Jacob Andreas (MIT CSAIL)
Pseudocode	No	The paper does not contain any explicitly labeled "Pseudocode" or "Algorithm" blocks.
Open Source Code	Yes	Code for all experiments in this paper is available at https://github.com/ekinakyurek/compgen.
Open Datasets	Yes	We evaluate R&R on two tests of compositional generalization: the SCAN instruction following task (Lake & Baroni, 2018) and a few-shot morphology learning task derived from the SIGMORPHON 2018 dataset (Kirov et al., 2018; Cotterell et al., 2018).
Dataset Splits	Yes	We construct splits of the data featuring a training set of 1000 examples and three test sets of 100 examples. ... we construct five different splits per language and use the Spanish past-tense data as a development set.
Hardware Specification	Yes	We use a single 32GB NVIDIA V100 Volta GPU for each experiment.
Software Dependencies	Yes	We implemented our experiments in Knet (Yuret, 2016) using Julia (Bezanson et al., 2017).
Experiment Setup	Yes	Morphology: The hidden and embedding sizes are 1024. No dropout is applied. ... SCAN: We choose the hidden size as 512, and embedding size as 64. We apply 0.5 dropout to the input.