A Retrieve-and-Edit Framework for Predicting Structured Outputs
Authors: Tatsunori B. Hashimoto, Kelvin Guu, Yonatan Oren, Percy S. Liang
NeurIPS 2018
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We show that on a new autocomplete task for GitHub Python code and the Hearthstone cards benchmark, retrieve-and-edit significantly boosts the performance of a vanilla sequence-to-sequence model on both tasks. |
| Researcher Affiliation | Academia | Tatsunori B. Hashimoto, Department of Computer Science, Stanford University, thashim@stanford.edu; Kelvin Guu, Department of Statistics, Stanford University, kguu@stanford.edu; Yonatan Oren, Department of Computer Science, Stanford University, yonatano@stanford.edu; Percy Liang, Department of Computer Science, Stanford University, pliang@cs.stanford.edu |
| Pseudocode | No | The paper describes the overall procedure in text (Section 3.1.4) but does not include any explicitly labeled 'Pseudocode' or 'Algorithm' blocks. |
| Open Source Code | Yes | Reproducibility. Data and code used to generate the results of this paper are available on the CodaLab Worksheets platform at https://worksheets.codalab.org/worksheets/0x1ad3f387005c492ea913cf0f20c9bb89/. |
| Open Datasets | Yes | Our Python autocomplete dataset is a representative sample of Python code from GitHub, obtained from Google BigQuery by retrieving Python code containing at least one block comment with restructured text (reST) formatting (See Appendix C for details). ... The Hearthstone cards benchmark consists of 533 cards in a computer card game, where each card is associated with a code snippet. ... The Hearthstone cards benchmark [22] |
| Dataset Splits | Yes | We also removed any duplicate function/docstring pairs and split the train and test set at the repository level. ... obtained by evaluating BLEU scores on the development set of both datasets. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., CPU/GPU models, memory, or cloud instance types) used for running the experiments. |
| Software Dependencies | No | The paper does not provide specific software dependencies with version numbers (e.g., Python version, library versions like PyTorch or TensorFlow). |
| Experiment Setup | Yes | Both the retriever and editor were trained for 1000 iterations on Hearthstone and 3000 on GitHub via ADAM minibatch gradient descent, with batch size 16 and a learning rate of 0.001. |
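
The Dataset Splits and Experiment Setup rows can be illustrated with a minimal sketch. It is not the authors' code: the record keys (`repo`, `docstring`, `function`), the `test_fraction` and `seed` values, and the names `TrainConfig`, `CONFIGS`, and `dedupe_and_split_by_repo` are all hypothetical. Only the iteration counts, batch size, learning rate, and optimizer come from the quoted text.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainConfig:
    """Hyperparameters quoted in the Experiment Setup row."""
    iterations: int
    batch_size: int = 16
    learning_rate: float = 0.001
    optimizer: str = "adam"


# Iteration counts as reported; the dictionary keys are illustrative names.
CONFIGS = {
    "hearthstone": TrainConfig(iterations=1000),
    "github": TrainConfig(iterations=3000),
}


def dedupe_and_split_by_repo(examples, test_fraction=0.2, seed=0):
    """Drop duplicate (docstring, function) pairs, then split by repository.

    `examples` is a list of dicts with hypothetical keys "repo",
    "docstring", and "function". `test_fraction` and `seed` are assumed
    values that the quoted text does not report.
    """
    seen = set()
    unique = []
    for ex in examples:
        key = (ex["docstring"], ex["function"])
        if key not in seen:
            seen.add(key)
            unique.append(ex)

    # Assign whole repositories to one side so no repository spans both sets.
    repos = sorted({ex["repo"] for ex in unique})
    random.Random(seed).shuffle(repos)
    n_test = max(1, round(len(repos) * test_fraction))
    test_repos = set(repos[:n_test])

    train = [ex for ex in unique if ex["repo"] not in test_repos]
    test = [ex for ex in unique if ex["repo"] in test_repos]
    return train, test
```

Splitting on the repository rather than on individual functions keeps near-identical code from the same project out of both sets, which is the usual motivation for a repository-level split.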