Synthetic Datasets for Neural Program Synthesis

Authors: Richard Shin, Neel Kant, Kavi Gupta, Chris Bender, Brandon Trabucco, Rishabh Singh, Dawn Song

Venue: ICLR 2019

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | 'We demonstrate, using the Karel DSL and a small Calculator DSL, that training deep networks on these distributions leads to improved cross-distribution generalization performance.'
Researcher Affiliation | Collaboration | Richard Shin (UC Berkeley); Neel Kant (UC Berkeley and ML@B); Kavi Gupta (UC Berkeley); Christopher Bender (UC Berkeley and ML@B); Brandon Trabucco (UC Berkeley and ML@B); Rishabh Singh (Google Brain); Dawn Song (UC Berkeley)
Pseudocode | Yes | For full pseudocode, see Section B.1 in the appendix.
Open Source Code | No | The paper mentions 'tensor2tensor (Vaswani et al., 2018), an open-source deep learning library' as a tool the authors used, but does not state that their own implementation is open source or provide a link to it.
Open Datasets | No | The paper mentions using a 'provided synthetic training set' from Bunel et al. (2018) and describes generating new synthetic datasets, but does not provide access information (link, DOI, repository, or formal citation with authors and year) that would make the generated datasets publicly available.
Dataset Splits | No | The paper mentions 'existing validation and test sets' and describes generating new training and test sets, but provides neither percentage splits nor sample counts for the original datasets, nor explicit numerical splits for the newly generated ones, so the data partitioning cannot be reproduced.
Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory) used for running its experiments.
Software Dependencies | No | The paper mentions 'tensor2tensor (Vaswani et al., 2018)' but does not give version numbers for it or any other software dependency, which would be needed for reproducibility.
Experiment Setup | No | The paper states 'We reproduced the encoder-decoder model of Bunel et al. (2018) and trained it using the provided synthetic training set with the teacher-forcing maximum likelihood objective,' but does not specify concrete hyperparameter values (e.g., learning rate, batch size, number of epochs) or detailed system-level training settings.
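
For context on the objective named in the last row: below is a minimal, purely illustrative sketch of teacher-forcing maximum likelihood training for a sequence-to-sequence model. The paper's own implementation used tensor2tensor; the PyTorch model, vocabulary sizes, and random data here are hypothetical placeholders, not the authors' code.

```python
# Illustrative sketch only: teacher-forcing maximum likelihood training for a
# tiny seq2seq model. All names and sizes are made up for this example.
import torch
import torch.nn as nn

class TinySeq2Seq(nn.Module):
    def __init__(self, src_vocab, tgt_vocab, hidden=128):
        super().__init__()
        self.src_emb = nn.Embedding(src_vocab, hidden)
        self.tgt_emb = nn.Embedding(tgt_vocab, hidden)
        self.encoder = nn.GRU(hidden, hidden, batch_first=True)
        self.decoder = nn.GRU(hidden, hidden, batch_first=True)
        self.out = nn.Linear(hidden, tgt_vocab)

    def forward(self, src, tgt_in):
        _, h = self.encoder(self.src_emb(src))           # encode the spec
        dec_out, _ = self.decoder(self.tgt_emb(tgt_in), h)
        return self.out(dec_out)                         # per-step logits

model = TinySeq2Seq(src_vocab=50, tgt_vocab=50)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

src = torch.randint(0, 50, (8, 12))   # batch of input specifications
tgt = torch.randint(0, 50, (8, 10))   # batch of gold target programs

# Teacher forcing: the decoder is conditioned on the gold prefix tgt[:, :-1]
# and trained to predict the next gold token tgt[:, 1:] at every position,
# i.e. the maximum likelihood objective over the target sequence.
logits = model(src, tgt[:, :-1])
loss = loss_fn(logits.reshape(-1, 50), tgt[:, 1:].reshape(-1))
opt.zero_grad()
loss.backward()
opt.step()
```

The point of the sketch is the loss computation: under teacher forcing, the decoder never consumes its own predictions during training, which is exactly the setting where the reported hyperparameters (learning rate, batch size, epochs) would be needed to reproduce the paper's numbers.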