Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Latent Execution for Neural Program Synthesis Beyond Domain-Specific Languages
Authors: Xinyun Chen, Dawn Song, Yuandong Tian
NeurIPS 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | When evaluating whether the program execution outputs match the IO pairs, LaSynth achieves 55.2% accuracy on generating simple C code with tens of tokens, including loops and branches, outperforming existing approaches without executors by around 20%. |
| Researcher Affiliation | Collaboration | Xinyun Chen (UC Berkeley), Dawn Song (UC Berkeley), Yuandong Tian (Facebook AI Research) |
| Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | The code is available at https://github.com/Jungyhuk/latent-execution. |
| Open Datasets | Yes | We train and evaluate all models on the Karel dataset introduced in [9]. The dataset contains randomly sampled programs from the Karel DSL (1.1M training samples, 2.5K samples in the validation set and 2.5K samples in the test set). ... Our full dataset includes 500K samples in the training set, 1K samples in the validation set, and 1K samples in the test set. |
| Dataset Splits | Yes | Our full dataset includes 500K samples in the training set, 1K samples in the validation set, and 1K samples in the test set. |
| Hardware Specification | No | The paper does not explicitly describe the hardware used for experiments (e.g., specific GPU/CPU models, memory, cloud instances). |
| Software Dependencies | No | The paper mentions using Csmith, a random C code generation tool, but does not provide version numbers for Csmith or for any other software dependency used in the experiments or model implementation. |
| Experiment Setup | No | The paper describes model components and training losses but does not provide specific hyperparameter values (e.g., learning rate, batch size, number of epochs) or detailed training configurations needed for reproduction. |