Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
Latent Programmer: Discrete Latent Codes for Program Synthesis
Authors: Joey Hong, David Dohan, Rishabh Singh, Charles Sutton, Manzil Zaheer
ICML 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate the LP on two domains, demonstrating that it yields an improvement in accuracy, especially on longer programs for which search is most difficult. |
| Researcher Affiliation | Industry | Google Research, Mountain View, CA, USA. |
| Pseudocode | Yes | Algorithm 1 Program synthesis using two-level search |
| Open Source Code | No | The paper does not provide a direct statement about open-sourcing the code for the described methodology, nor does it provide a link to a code repository. |
| Open Datasets | Yes | The dataset used consists of 111K python examples, which consist of a docstring and corresponding code snippet, collected from Github (Wan et al., 2018). |
| Dataset Splits | No | The paper mentions training on 'roughly 25M tasks' and evaluating on '1K held-out ones' but does not explicitly specify a validation set or its size/percentage for reproducibility. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., exact GPU/CPU models, memory amounts) used for running the experiments. |
| Software Dependencies | No | The paper mentions using Python for code generation and refers to Transformers, but does not provide specific version numbers for software dependencies or libraries. |
| Experiment Setup | Yes | All models have an embedding size of 128 and hidden size of 512, and the attention layers consist of 3 stacked layers with 4 heads each. For the LP model, we used a latent compression factor ℓ = 2 and vocabulary size K = 40. |
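
For readers attempting a reimplementation, the hyperparameters quoted in the Experiment Setup row can be collected in a small configuration object. The following is a minimal sketch only: the class name `LatentProgrammerConfig` and all field names are hypothetical and do not come from the paper or any released code; only the numeric values are taken from the quoted text.

```python
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class LatentProgrammerConfig:
    """Hypothetical container for the reported hyperparameters.

    Field names are illustrative assumptions; only the default values
    come from the Experiment Setup row above.
    """

    embedding_size: int = 128           # embedding size, all models
    hidden_size: int = 512              # hidden size, all models
    num_attention_layers: int = 3       # stacked attention layers
    num_attention_heads: int = 4        # heads per attention layer
    latent_compression_factor: int = 2  # the paper's ℓ (LP model only)
    latent_vocab_size: int = 40         # the paper's K (LP model only)


config = LatentProgrammerConfig()
print(asdict(config))
```

Settings the paper does not report in the quoted text (for example hardware or library versions, both marked "No" above) are deliberately left out rather than guessed.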