Towards Synthesizing Complex Programs From Input-Output Examples
Authors: Xinyun Chen, Chang Liu, Dawn Song
ICLR 2018
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In our evaluation, we show that using our novel approach, neural parsing programs can be learned to achieve 100% test accuracy on test inputs that are 500x longer than the training samples. |
| Researcher Affiliation | Academia | Xinyun Chen, Chang Liu, Dawn Song (University of California, Berkeley) |
| Pseudocode | Yes | Algorithm 1 A sketch of the two-phase reinforcement learning algorithm |
| Open Source Code | No | The paper does not provide a statement or link indicating that source code for the described methodology is openly available. |
| Open Datasets | No | The paper describes generating its own training datasets (e.g., 'Curriculum', 'Std-10', 'Std-50') but does not provide access information such as links, DOIs, or citations to publicly available versions of these datasets. |
| Dataset Splits | No | The paper describes training and test sets but does not explicitly define a validation set or provide details on its split for model tuning. |
| Hardware Specification | No | The paper does not specify any hardware details such as GPU models, CPU types, or memory used for experiments. |
| Software Dependencies | No | The paper mentions the Adam and RMSProp optimizers but does not specify version numbers for these or any other software dependencies. |
| Experiment Setup | Yes | For the LL machines, F = 10. The capacity of each stack frame is K = 3 for the WHILE language and K = 4 for the LAMBDA language. In the architecture of the neural parsing program, each LSTM has 1 layer, with its hidden state size D = 50, which is the same as the embedding size. For training, the learning rate is η = 0.01 with no decay. No dropout is used. Gradient weights for the three components Θ1, Θ2 and Θ3 are γ1 = 10.0, γ2 = 1.0, and γ3 = 0.01 respectively. Gradients with L2 norm larger than 5.0 are scaled down to have norm 5.0. The model is trained using the Adam optimizer. All weights are initialized uniformly at random in [-0.1, 0.1]. The mini-batch size is 1. For candidate trace search, σ = 0.1, M1 = 20, M2 = 10,000, and M3 = 2,000. |
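
The hyperparameters in the Experiment Setup row can be expressed as a minimal training-loop sketch. The snippet below is an illustration only, written against PyTorch: the actual neural parsing-program architecture is not reproduced (the stand-in model, `VOCAB_SIZE`, and `NUM_CLASSES` are hypothetical), and the component gradient weights γ1, γ2, γ3 and the candidate trace search parameters are omitted. Only the reported settings (single-layer LSTM with D = 50, Adam at η = 0.01, no dropout, uniform initialization in [-0.1, 0.1], gradient clipping at L2 norm 5.0, mini-batch size 1) are taken from the row above.

```python
# Sketch of the reported training configuration; the model is a stand-in,
# not the paper's parsing-program architecture.
import torch
import torch.nn as nn

VOCAB_SIZE = 128      # hypothetical; not specified in the row above
NUM_CLASSES = 16      # hypothetical; not specified in the row above
EMBED_SIZE = 50       # embedding size equals the hidden state size D = 50
HIDDEN_SIZE = 50      # D = 50, single-layer LSTM

class ParserStandIn(nn.Module):
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(VOCAB_SIZE, EMBED_SIZE)
        self.lstm = nn.LSTM(EMBED_SIZE, HIDDEN_SIZE, num_layers=1, batch_first=True)
        self.out = nn.Linear(HIDDEN_SIZE, NUM_CLASSES)

    def forward(self, tokens):
        h, _ = self.lstm(self.embed(tokens))
        return self.out(h)

model = ParserStandIn()

# All weights initialized uniformly at random in [-0.1, 0.1].
for p in model.parameters():
    nn.init.uniform_(p, -0.1, 0.1)

# Adam optimizer with learning rate 0.01 and no decay.
optimizer = torch.optim.Adam(model.parameters(), lr=0.01)

def train_step(tokens, targets):
    """One update with mini-batch size 1, as reported."""
    optimizer.zero_grad()
    logits = model(tokens)
    loss = nn.functional.cross_entropy(
        logits.view(-1, NUM_CLASSES), targets.view(-1))
    loss.backward()
    # Gradients with L2 norm larger than 5.0 are scaled down to norm 5.0.
    torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=5.0)
    optimizer.step()
    return loss.item()
```

A call such as `train_step(torch.randint(0, VOCAB_SIZE, (1, 20)), torch.randint(0, NUM_CLASSES, (1, 20)))` exercises a single update at the reported mini-batch size of 1.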