Latent Execution for Neural Program Synthesis Beyond Domain-Specific Languages

Authors: Xinyun Chen, Dawn Song, Yuandong Tian

NeurIPS 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "When evaluating whether the program execution outputs match the IO pairs, LaSynth achieves 55.2% accuracy on generating simple C code with tens of tokens, including loops and branches, outperforming existing approaches without executors by around 20%."
Researcher Affiliation | Collaboration | Xinyun Chen (UC Berkeley, xinyun.chen@berkeley.edu); Dawn Song (UC Berkeley, dawnsong@cs.berkeley.edu); Yuandong Tian (Facebook AI Research, yuandong@fb.com)
Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks.
Open Source Code | Yes | "The code is available at https://github.com/Jungyhuk/latent-execution."
Open Datasets | Yes | "We train and evaluate all models on the Karel dataset introduced in [9]. The dataset contains randomly sampled programs from the Karel DSL (1.1M training samples, 2.5K samples in the validation set and 2.5K samples in the test set). ... Our full dataset includes 500K samples in the training set, 1K samples in the validation set, and 1K samples in the test set."
Dataset Splits | Yes | "Our full dataset includes 500K samples in the training set, 1K samples in the validation set, and 1K samples in the test set."
Hardware Specification | No | The paper does not explicitly describe the hardware used for the experiments (e.g., specific GPU/CPU models, memory, or cloud instances).
Software Dependencies | No | The paper mentions Csmith, a random C code generation tool, but provides no version numbers for Csmith or for any other software used in the experiments or model implementation.
Experiment Setup | No | The paper does not provide specific hyperparameter values (e.g., learning rate, batch size, number of epochs) or detailed training configurations; it describes the model components and training losses but lacks these granular setup details.
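The accuracy metric quoted in the Research Type row counts a predicted program as correct only if its execution output matches the expected output for every IO pair in the task's specification. A minimal sketch of that metric, assuming a hypothetical `execute(program, input)` callable standing in for a real program executor (it is not part of the paper's released code):

```python
# Sketch of IO-match accuracy: a candidate program is correct only if it
# reproduces the expected output for every input in its IO specification.
# `execute` is a hypothetical stand-in for a real executor/interpreter.
from typing import Callable, List, Tuple

IOPair = Tuple[str, str]  # (input, expected output), simplified to strings


def io_match_accuracy(
    predictions: List[str],              # one candidate program per task
    io_specs: List[List[IOPair]],        # IO pairs for each task
    execute: Callable[[str, str], str],  # execute(program, input) -> output
) -> float:
    """Fraction of tasks whose predicted program satisfies all its IO pairs."""
    if not predictions:
        return 0.0
    correct = 0
    for program, pairs in zip(predictions, io_specs):
        if all(execute(program, inp) == out for inp, out in pairs):
            correct += 1
    return correct / len(predictions)
```

With a toy executor (e.g. one that interprets the program string `"upper"` as uppercasing its input), a prediction set where one of two tasks is satisfied yields an accuracy of 0.5.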