Latent Programmer: Discrete Latent Codes for Program Synthesis
Authors: Joey Hong, David Dohan, Rishabh Singh, Charles Sutton, Manzil Zaheer
ICML 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate the LP on two domains, demonstrating that it yields an improvement in accuracy, especially on longer programs for which search is most difficult. |
| Researcher Affiliation | Industry | 1Google Research, Mountain View, CA, USA. |
| Pseudocode | Yes | Algorithm 1: Program synthesis using two-level search (a sketch of this two-level search appears below the table). |
| Open Source Code | No | The paper does not provide a direct statement about open-sourcing the code for the described methodology, nor does it provide a link to a code repository. |
| Open Datasets | Yes | The dataset used consists of 111K Python examples, each consisting of a docstring and a corresponding code snippet, collected from GitHub (Wan et al., 2018). |
| Dataset Splits | No | The paper mentions training on 'roughly 25M tasks' and evaluating on '1K held-out ones' but does not explicitly specify a validation set or its size/percentage for reproducibility. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., exact GPU/CPU models, memory amounts) used for running the experiments. |
| Software Dependencies | No | The paper mentions using Python for code generation and refers to Transformers, but does not provide specific version numbers for software dependencies or libraries. |
| Experiment Setup | Yes | All models have an embedding size of 128 and hidden size of 512, and the attention layers consist of 3 stacked layers with 4 heads each. For the LP model, we used a latent compression factor ℓ = 2 and vocabulary size K = 40. (These hyperparameters are collected in the configuration sketch below the table.) |
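
The pseudocode row above refers to Algorithm 1's two-level search. The following is a minimal Python sketch of that structure, assuming a first beam search over discrete latent codes followed by a second beam search over program tokens conditioned on each latent candidate. The object names and `beam_search` signatures are hypothetical stand-ins, not the authors' API; only the two-level organization is taken from the paper.

```python
# Hedged sketch of a two-level search in the style of Algorithm 1.
# `latent_predictor` and `program_decoder` are hypothetical models whose
# `beam_search` methods return (log_score, token_sequence) pairs.

from typing import List, Sequence, Tuple


def two_level_search(
    spec,                       # task specification, e.g. I/O examples
    latent_predictor,           # hypothetical: spec -> latent-code beams
    program_decoder,            # hypothetical: (spec, latent) -> program beams
    latent_beam_size: int = 10,
    program_beam_size: int = 10,
) -> List[Sequence[int]]:
    """Search first over discrete latent codes, then over programs."""
    # Level 1: beam search over short sequences of discrete latent tokens.
    latent_beams = latent_predictor.beam_search(spec, beam_size=latent_beam_size)

    # Level 2: for each candidate latent code, beam search over program
    # tokens conditioned on both the spec and that latent code.
    candidates: List[Tuple[float, Sequence[int]]] = []
    for latent_score, latent_code in latent_beams:
        program_beams = program_decoder.beam_search(
            spec, latent_code, beam_size=program_beam_size
        )
        for program_score, program in program_beams:
            candidates.append((latent_score + program_score, program))

    # Keep the overall highest-scoring programs; a synthesizer would then
    # check each candidate against the specification.
    candidates.sort(key=lambda c: c[0], reverse=True)
    return [program for _, program in candidates[:program_beam_size]]
```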
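
The experiment-setup row lists the reported hyperparameters. The dataclass below is an illustrative way to collect them in one place; the class and field names are assumptions, while the values are those stated in the paper.

```python
# Minimal configuration sketch, assuming illustrative field names.
from dataclasses import dataclass


@dataclass
class LatentProgrammerConfig:
    embedding_size: int = 128    # token embedding dimension
    hidden_size: int = 512       # Transformer hidden dimension
    num_layers: int = 3          # stacked attention layers
    num_heads: int = 4           # attention heads per layer
    latent_compression: int = 2  # latent length = program length / 2 (ℓ = 2)
    latent_vocab_size: int = 40  # number of discrete latent codes (K = 40)


config = LatentProgrammerConfig()
print(config)
```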