Towards Synthesizing Complex Programs From Input-Output Examples

Authors: Xinyun Chen, Chang Liu, Dawn Song

ICLR 2018

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | In our evaluation, we show that using our novel approach, neural parsing programs can be learned to achieve 100% test accuracy on test inputs that are 500× longer than the training samples.
Researcher Affiliation | Academia | Xinyun Chen, Chang Liu, Dawn Song (University of California, Berkeley)
Pseudocode | Yes | Algorithm 1: A sketch of the two-phase reinforcement learning algorithm
Open Source Code | No | The paper does not provide a statement or link indicating the open-source availability of the code for its described methodology.
Open Datasets | No | The paper describes generating its own training datasets (e.g., 'Curriculum', 'Std-10', 'Std-50') but does not provide specific access information such as links, DOIs, or citations to publicly available versions of these datasets.
Dataset Splits | No | The paper describes training and test sets but does not explicitly define a validation set or provide details on its split for model tuning.
Hardware Specification | No | The paper does not specify any hardware details such as GPU models, CPU types, or memory used for experiments.
Software Dependencies | No | The paper mentions optimizers such as the 'Adam optimizer' and 'RMSProp optimizer' but does not specify version numbers for these or other software dependencies.
Experiment Setup | Yes | For the LL machines, F = 10. About the capacity of each stack frame K, K = 3 for the WHILE language, and K = 4 for the LAMBDA language. In the architecture of the neural parsing program, each LSTM has 1 layer, with its hidden state size D = 50, which is the same as the embedding size. As for the training, the learning rate is η = 0.01 with no decay. No dropout is used. Gradient weights for the three components Θ1, Θ2, and Θ3 are γ1 = 10.0, γ2 = 1.0, and γ3 = 0.01 respectively. Gradients with L2 norm larger than 5.0 are scaled down to have a norm of 5.0. The model is trained using the Adam optimizer. All weights are initialized uniformly at random in [-0.1, 0.1]. The mini-batch size is 1. For candidate trace search, σ = 0.1, M1 = 20, M2 = 10,000, and M3 = 2,000.
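As a rough illustration of how the quoted hyperparameters fit together, the sketch below wires them into a minimal PyTorch training setup. This is not the authors' implementation (no code is released, per the table above): the ParserSketch module, its vocabulary size, and the loss-component plumbing are assumptions made purely for illustration; only the numeric values come from the quoted setup.

```python
import torch
import torch.nn as nn

# Values quoted in the paper's experiment setup; names are our own.
EMBED_SIZE = 50        # embedding size, equal to the LSTM hidden size D = 50
HIDDEN_SIZE = 50       # D = 50
LEARNING_RATE = 0.01   # eta = 0.01, no decay
GRAD_CLIP_NORM = 5.0   # gradients with L2 norm > 5.0 are scaled down to norm 5.0
INIT_RANGE = 0.1       # uniform initialization in [-0.1, 0.1]
BATCH_SIZE = 1         # mini-batch size of 1
GAMMA = (10.0, 1.0, 0.01)  # gradient weights gamma_1..gamma_3 for Theta_1..Theta_3


class ParserSketch(nn.Module):
    """Hypothetical stand-in for the paper's neural parsing program."""

    def __init__(self, vocab_size: int):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, EMBED_SIZE)
        self.lstm = nn.LSTM(EMBED_SIZE, HIDDEN_SIZE, num_layers=1)  # 1-layer LSTM
        # Initialize all weights uniformly in [-0.1, 0.1], as stated in the setup.
        for p in self.parameters():
            nn.init.uniform_(p, -INIT_RANGE, INIT_RANGE)

    def forward(self, tokens: torch.Tensor) -> torch.Tensor:
        out, _ = self.lstm(self.embed(tokens))
        return out


model = ParserSketch(vocab_size=100)  # vocabulary size is an assumption
optimizer = torch.optim.Adam(model.parameters(), lr=LEARNING_RATE)


def training_step(loss_components):
    """One update combining the three weighted loss terms for Theta_1..Theta_3."""
    optimizer.zero_grad()
    loss = sum(g * l for g, l in zip(GAMMA, loss_components))
    loss.backward()
    # Scale gradients whose global L2 norm exceeds 5.0 down to norm 5.0.
    nn.utils.clip_grad_norm_(model.parameters(), GRAD_CLIP_NORM)
    optimizer.step()
    return loss.item()
```

The sketch only covers the optimization-related details the table quotes (optimizer, learning rate, clipping, initialization, loss weighting); the stack frames, the F and K machine parameters, and the candidate trace search constants (σ, M1, M2, M3) belong to the paper's program-synthesis machinery and are not modeled here.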