Towards Neural Programming Interfaces

Authors: Zachary Brown, Nathaniel Robinson, David Wingate, Nancy Fulda

NeurIPS 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | In experiments against other state-of-the-art approaches, we demonstrate the efficacy of our methods using OpenAI's GPT-2 model, successfully controlling noun selection, topic aversion, offensive speech filtering, and other aspects of language while largely maintaining the controlled model's fluency under deterministic settings.
Researcher Affiliation | Academia | Zachary C. Brown, Electrical and Computer Engineering, Duke University, Durham, NC 27708, zac.brown@duke.edu; Nathaniel Robinson, Department of Computer Science, Brigham Young University, Provo, UT 84602, nrobinson@byu.edu; David Wingate, Department of Computer Science, Brigham Young University, Provo, UT 84602, wingated@cs.byu.edu; Nancy Fulda, Department of Computer Science, Brigham Young University, Provo, UT 84602, nfulda@cs.byu.edu
Pseudocode | Yes | Please see Algorithm 1 of Appendix 1, as well as Figure 1 of this paper, for further illustration.
Open Source Code | Yes | Code available at https://github.com/DRAGNLabs/towards-neural-programming-interfaces
Open Datasets | Yes | We generated our data sets by performing hundreds of thousands of GPT-2 forward passes using input text extracted from a Wikipedia corpus [27], Reddit corpus [28], and Toronto Book Corpus [29].
Dataset Splits | No | The paper mentions training data sizes and evaluation datasets but does not explicitly describe a separate validation set or the specific split used for hyperparameter tuning or early stopping.
Hardware Specification | No | The paper mentions using a "small-scale GPT-2 model" and notes "limited computational capacity" for its experiments, but it does not specify exact hardware such as GPU or CPU models or cloud computing instance types.
Software Dependencies | No | The paper mentions using OpenAI's GPT-2 model and that "Much of our code was adapted from the Hugging Face Transformers GitHub repository [23]", but it does not provide specific version numbers for any software libraries or dependencies used in the experiments.
Experiment Setup | Yes | See Sections 5-6 of Appendix 1 for further details as to experiments and hyperparameters... Due to limited computational capacity, our experiments were performed using a small GPT-2 model and a short context length of w ∈ {10, 15} characters. If this assumption can indeed be relaxed, we may see further improvements in the fluency of controlled outputs than those shown in our experiments in Section 4, which use a baseline GPT-2 model that has the same filter settings as those used to generate our NPI data sets.
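
The "Open Datasets" and "Experiment Setup" rows above describe generating training data by running many GPT-2 forward passes over short text windows with a small GPT-2 model. The sketch below illustrates that kind of activation collection; it assumes the Hugging Face Transformers API and the smallest public "gpt2" checkpoint, and the window length, toy corpus, and storage format are illustrative rather than taken from the released code.

```python
# Hedged sketch of the data-generation step: collect GPT-2 hidden
# activations from forward passes over short text windows.
# Assumptions: Hugging Face Transformers, the smallest "gpt2" checkpoint,
# and a window of 10 tokens (the paper reports w in {10, 15}).
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2", output_hidden_states=True)
model.eval()

def collect_activations(texts, window=10):
    """Run one forward pass per text window and keep all layer activations."""
    records = []
    for text in texts:
        input_ids = tokenizer.encode(text, return_tensors="pt")[:, :window]
        with torch.no_grad():
            outputs = model(input_ids)
        # outputs.hidden_states: tuple of (batch, seq_len, hidden_size)
        # tensors, one per transformer layer plus the embedding output.
        hidden = torch.stack(outputs.hidden_states)
        records.append({"input_ids": input_ids, "hidden_states": hidden})
    return records

# Example usage with a toy sentence in place of the Wikipedia, Reddit,
# and Toronto Book Corpus text the paper describes.
data = collect_activations(["The quick brown fox jumps over the lazy dog."])
print(data[0]["hidden_states"].shape)  # e.g. torch.Size([13, 1, 10, 768])
```

The repository linked under "Open Source Code" contains the authors' actual data-generation scripts; this sketch only illustrates the forward-pass collection that the quoted text describes.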