Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Latent Execution for Neural Program Synthesis Beyond Domain-Specific Languages
Authors: Xinyun Chen, Dawn Song, Yuandong Tian
NeurIPS 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | When evaluating whether the program execution outputs match the IO pairs, LaSynth achieves 55.2% accuracy on generating simple C code with tens of tokens, including loops and branches, outperforming existing approaches without executors by around 20%. |
| Researcher Affiliation | Collaboration | Xinyun Chen (UC Berkeley), Dawn Song (UC Berkeley), Yuandong Tian (Facebook AI Research) |
| Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | The code is available at https://github.com/Jungyhuk/latent-execution. |
| Open Datasets | Yes | We train and evaluate all models on the Karel dataset introduced in [9]. The dataset contains randomly sampled programs from the Karel DSL (1.1M training samples, 2.5K samples in the validation set and 2.5K samples in the test set). ... Our full dataset includes 500K samples in the training set, 1K samples in the validation set, and 1K samples in the test set. |
| Dataset Splits | Yes | Our full dataset includes 500K samples in the training set, 1K samples in the validation set, and 1K samples in the test set. |
| Hardware Specification | No | The paper does not explicitly describe the hardware used for experiments (e.g., specific GPU/CPU models, memory, cloud instances). |
| Software Dependencies | No | The paper mentions using Csmith, a random C code generation tool, but does not provide version numbers for Csmith or for any other software dependency used in the experiments or model implementation. |
| Experiment Setup | No | The paper describes model components and training losses but does not provide specific hyperparameter values (e.g., learning rate, batch size, number of epochs) or detailed training configurations needed for reproduction. |