A Retrieve-and-Edit Framework for Predicting Structured Outputs

Authors: Tatsunori B. Hashimoto, Kelvin Guu, Yonatan Oren, Percy S. Liang

NeurIPS 2018

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | We show that on a new autocomplete task for GitHub Python code and the Hearthstone cards benchmark, retrieve-and-edit significantly boosts the performance of a vanilla sequence-to-sequence model on both tasks. |
| Researcher Affiliation | Academia | Tatsunori B. Hashimoto, Department of Computer Science, Stanford University, thashim@stanford.edu; Kelvin Guu, Department of Statistics, Stanford University, kguu@stanford.edu; Yonatan Oren, Department of Computer Science, Stanford University, yonatano@stanford.edu; Percy Liang, Department of Computer Science, Stanford University, pliang@cs.stanford.edu |
| Pseudocode | No | The paper describes the overall procedure in text (Section 3.1.4) but does not include any explicitly labeled 'Pseudocode' or 'Algorithm' blocks. |
| Open Source Code | Yes | Reproducibility. Data and code used to generate the results of this paper are available on the CodaLab Worksheets platform at https://worksheets.codalab.org/worksheets/0x1ad3f387005c492ea913cf0f20c9bb89/. |
| Open Datasets | Yes | Our Python autocomplete dataset is a representative sample of Python code from GitHub, obtained from Google BigQuery by retrieving Python code containing at least one block comment with reStructuredText (reST) formatting (See Appendix C for details). ... The Hearthstone cards benchmark consists of 533 cards in a computer card game, where each card is associated with a code snippet. The Hearthstone cards benchmark [22] |
| Dataset Splits | Yes | We also removed any duplicate function/docstring pairs and split the train and test set at the repository level. ... obtained by evaluating BLEU scores on the development set of both datasets. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., CPU/GPU models, memory, or cloud instance types) used for running the experiments. |
| Software Dependencies | No | The paper does not provide specific software dependencies with version numbers (e.g., Python version, library versions like PyTorch or TensorFlow). |
| Experiment Setup | Yes | Both the retriever and editor were trained for 1000 iterations on Hearthstone and 3000 on GitHub via ADAM minibatch gradient descent, with batch size 16 and a learning rate of 0.001. |
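
The Dataset Splits and Experiment Setup entries can be made concrete with a short sketch. The code below is not the authors' implementation. It shows one plausible way to drop duplicate function/docstring pairs and keep every example from a repository on the same side of the train/test split, and it records the quoted hyperparameters in a small config object. The field names `repo`, `function`, and `docstring`, the hash-based assignment, and the `test_fraction` value are all assumptions made for illustration; the summary does not state how the split was drawn or how large the test set is.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainingSetup:
    """Hyperparameters as quoted in the Experiment Setup entry."""
    iterations: int
    batch_size: int = 16
    learning_rate: float = 0.001
    optimizer: str = "adam"


HEARTHSTONE = TrainingSetup(iterations=1000)
GITHUB = TrainingSetup(iterations=3000)


def dedupe_pairs(examples):
    """Drop repeated (function, docstring) pairs, keeping the first occurrence."""
    seen = set()
    unique = []
    for ex in examples:
        key = (ex["function"], ex["docstring"])
        if key not in seen:
            seen.add(key)
            unique.append(ex)
    return unique


def split_by_repository(examples, test_fraction=0.1):
    """Send all examples from one repository to the same side of the split.

    The repository name is hashed to a number in [0, 1), so the assignment
    is deterministic. test_fraction is an illustrative value (assumption).
    """
    train, test = [], []
    for ex in examples:
        digest = hashlib.sha256(ex["repo"].encode("utf-8")).digest()
        bucket = int.from_bytes(digest[:8], "big") / 2**64
        (test if bucket < test_fraction else train).append(ex)
    return train, test
```

Splitting on the repository name rather than on individual examples keeps code from one project out of both sets, which is the kind of leakage a repository-level split is meant to prevent.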