Towards Neural Programming Interfaces
Authors: Zachary Brown, Nathaniel Robinson, David Wingate, Nancy Fulda
NeurIPS 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In experiments against other state-of-the-art approaches, we demonstrate the efficacy of our methods using OpenAI's GPT-2 model, successfully controlling noun selection, topic aversion, offensive speech filtering, and other aspects of language while largely maintaining the controlled model's fluency under deterministic settings. (A hedged sketch of this activation-editing idea follows the table.) |
| Researcher Affiliation | Academia | Zachary C. Brown, Electrical and Computer Engineering, Duke University, Durham, NC 27708, zac.brown@duke.edu; Nathaniel Robinson, Department of Computer Science, Brigham Young University, Provo, UT 84602, nrobinson@byu.edu; David Wingate, Department of Computer Science, Brigham Young University, Provo, UT 84602, wingated@cs.byu.edu; Nancy Fulda, Department of Computer Science, Brigham Young University, Provo, UT 84602, nfulda@cs.byu.edu |
| Pseudocode | Yes | Please see Algorithm 1 of Appendix 1, as well as Figure 1 of this paper, for further illustration. |
| Open Source Code | Yes | Code available at https://github.com/DRAGNLabs/towards-neural-programming-interfaces |
| Open Datasets | Yes | We generated our data sets by performing hundreds of thousands of GPT-2 forward passes using input text extracted from a Wikipedia corpus [27], Reddit corpus [28], and Toronto Book Corpus [29]. (A hedged sketch of this kind of data collection appears after the table.) |
| Dataset Splits | No | The paper mentions training data sizes and evaluation datasets but does not explicitly describe a separate validation set or its specific split for hyperparameter tuning or early stopping. |
| Hardware Specification | No | The paper mentions using a "small-scale GPT-2 model" and notes "limited computational capacity" for experiments, but it does not specify any exact hardware components like GPU or CPU models, or cloud computing instance types. |
| Software Dependencies | No | The paper mentions using OpenAI's GPT-2 model and that "Much of our code was adapted from the Hugging Face Transformers GitHub repository [23]", but it does not provide specific version numbers for any software libraries or dependencies used in the experiments. |
| Experiment Setup | Yes | See Sections 5-6 of Appendix 1 for further details as to experiments and hyperparameters... Due to limited computational capacity, our experiments were performed using a small GPT-2 model and a short context length of w ∈ {10, 15} characters. If this assumption can indeed be relaxed, we may see further improvements in the fluency of controlled outputs beyond those shown in our experiments in Section 4, which use a baseline GPT-2 model that has the same filter settings as those used to generate our NPI data sets. |
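The "Research Type" row quotes the paper's core claim: language is controlled by editing GPT-2's hidden activations rather than by fine-tuning its weights. Below is a minimal sketch of one way to splice a perturbation network into a GPT-2 layer using a PyTorch forward hook. The layer index, the `npi_net` architecture, and the hook-based plumbing are illustrative assumptions, not the authors' trained NPI or their exact intervention mechanism.

```python
import torch
from transformers import GPT2Tokenizer, GPT2LMHeadModel

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")  # small GPT-2, as in the paper
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

HIDDEN = model.config.n_embd  # 768 for small GPT-2
LAYER = 5                     # hypothetical choice of layer to intervene on

# Placeholder perturbation network standing in for a trained NPI.
npi_net = torch.nn.Sequential(
    torch.nn.Linear(HIDDEN, HIDDEN),
    torch.nn.Tanh(),
    torch.nn.Linear(HIDDEN, HIDDEN),
)

def npi_hook(module, inputs, output):
    # A GPT2Block returns a tuple whose first element is the hidden state;
    # add a learned offset to it and pass the rest of the tuple through.
    hidden = output[0]
    return (hidden + npi_net(hidden),) + output[1:]

handle = model.transformer.h[LAYER].register_forward_hook(npi_hook)

ids = tokenizer("The weather today is", return_tensors="pt").input_ids
with torch.no_grad():
    out = model.generate(ids, max_length=20, do_sample=False)  # deterministic decoding
print(tokenizer.decode(out[0]))

handle.remove()  # detach the intervention when done
```

In the paper's formulation the perturbation network is trained against an objective that steers generation toward a target behavior (e.g., inducing or avoiding a noun) while preserving fluency; the untrained placeholder above only shows where such a network plugs into the forward pass.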
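The "Open Datasets" and "Experiment Setup" rows describe building training data from hundreds of thousands of GPT-2 forward passes over short text windows. The following is a minimal sketch of that kind of activation caching via the Hugging Face Transformers API; the window length, corpus file name, and storage format are illustrative assumptions, not the authors' exact pipeline (their repository contains the real one).

```python
# Sketch: cache per-layer GPT-2 hidden activations over short text windows.
import torch
from transformers import GPT2Tokenizer, GPT2LMHeadModel

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

WINDOW = 10  # short context window, in the spirit of the paper's w ∈ {10, 15}

records = []
with open("corpus.txt") as f:  # hypothetical extract from Wikipedia/Reddit/books
    for line in f:
        ids = tokenizer(line.strip(), return_tensors="pt").input_ids[0]
        for start in range(0, len(ids) - WINDOW + 1, WINDOW):
            window = ids[start : start + WINDOW].unsqueeze(0)
            with torch.no_grad():
                out = model(window, output_hidden_states=True)
            # out.hidden_states: tuple of per-layer [1, WINDOW, 768] tensors
            acts = torch.stack(out.hidden_states)  # [n_layers + 1, 1, WINDOW, 768]
            records.append(acts.squeeze(1))

torch.save(records, "npi_activations.pt")  # assumed output format
```

A dataset of such cached activations, paired with labels for the behavior of interest (e.g., whether a target noun appears in the continuation), is what an NPI-style controller would then be trained on.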