Just Add Functions: A Neural-Symbolic Language Model
Authors: David Demeter, Doug Downey
AAAI 2020, pp. 7634-7642
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We explore the effectiveness of this approach on numbers and geographic locations, and show that NSLMs significantly reduce perplexity in small-corpus language modeling, and that the performance improvement persists for rare tokens even on much larger corpora. ... Our primary experimental results are shown in Tables 2 and 3. |
| Researcher Affiliation | Collaboration | David Demeter, Northwestern University, Evanston, IL, USA (ddemeter@u.northwestern.edu); Doug Downey, Allen Institute for Artificial Intelligence, Seattle, WA, USA (dougd@allenai.org) |
| Pseudocode | No | Table 1 is titled 'NSLM Construction Algorithm' and lists general steps, but it is not presented in structured pseudocode with programming constructs such as variables, loops, or conditional logic. (A hedged sketch of the construction idea appears below the table.) |
| Open Source Code | No | The paper does not provide any explicit statement about releasing its source code, nor does it include links to a code repository or mention code availability in supplementary materials. |
| Open Datasets | Yes | To evaluate numbers on the Wikitext corpora... To evaluate geographic locations on the Wikitext corpora, multi-word named entities appearing in the Geonames data set (GeoNames 2019) are chunked together to form single tokens. |
| Dataset Splits | Yes | Step size, ensembling factor λ_Cache and temperature θ_Cache were set to 500, 0.25 and 0.75, respectively, after tuning on the validation set. |
| Hardware Specification | No | The paper does not provide specific details about the hardware used for experiments, such as GPU models, CPU types, or cloud computing instance specifications, beyond general mentions of training models. |
| Software Dependencies | No | The paper does not specify version numbers for any software dependencies, libraries, or frameworks used in the implementation or experimentation. |
| Experiment Setup | Yes | Thus, we adopt a standard language model architecture as our primary baseline, an RNN with LSTM cells and hyper-parameters corresponding to medium 650 dimensional models (Zaremba, Sutskever, and Vinyals 2014). ... During training, the softmax is computed using the full vocabulary, except for the Wikitext-103 model which uses a sampled-softmax (Jean et al. 2015) with a sampling rate of 2,500. ... Step size, ensembling factor λ_Cache and temperature θ_Cache were set to 500, 0.25 and 0.75, respectively, after tuning on the validation set. (A sketch of this standard cache ensemble appears below the table.) |
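
For context on the Pseudocode row: the paper's Table 1 describes constructing an NSLM by routing the language model's probability mass for a token class (e.g., numbers or geographic locations) through a symbolic function. The sketch below is a minimal, hypothetical illustration of that idea; the function names, the discrete-Gaussian choice, and the toy values are our assumptions, not the paper's implementation.

```python
import math

def discrete_gaussian(mu, sigma, support):
    """Symbolic within-class distribution over numeric tokens.
    (Illustrative choice; the paper's actual functions may differ.)"""
    weights = {v: math.exp(-0.5 * ((v - mu) / sigma) ** 2) for v in support}
    z = sum(weights.values())
    return {v: w / z for v, w in weights.items()}

def nslm_next_token_probs(neural_probs, class_prob, within_class_probs):
    """Spread the neural LM's mass for one token class (class_prob) over
    that class's tokens via a symbolic distribution; ordinary tokens keep
    their neural probabilities."""
    probs = dict(neural_probs)
    probs.update({tok: class_prob * p for tok, p in within_class_probs.items()})
    return probs

# Toy usage: the LM assigns 0.2 to a NUMBER class; a Gaussian centered on
# 1905 distributes that mass over nearby year tokens.
neural = {"the": 0.5, "in": 0.3}  # mass for ordinary tokens (sums to 0.8)
within = discrete_gaussian(mu=1905, sigma=3.0, support=range(1890, 1911))
full = nslm_next_token_probs(neural, class_prob=0.2, within_class_probs=within)
assert abs(sum(full.values()) - 1.0) < 1e-9
```

Because the symbolic distribution is normalized over its class, the combined distribution stays a proper probability distribution, which is what lets such a model be evaluated by perplexity alongside a purely neural baseline.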
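For the cache hyper-parameters quoted in the Dataset Splits and Experiment Setup rows (λ_Cache = 0.25, θ_Cache = 0.75): these names match the standard continuous-cache ensemble of Grave, Joulin, and Usunier (2017). The paper does not spell out its cache equations, so the sketch below assumes that formulation.

```python
import numpy as np

def cache_ensemble(p_lm, hidden, cache_states, cache_tokens, vocab_size,
                   lam=0.25, theta=0.75):
    """Mix a base LM distribution with a neural-cache distribution.

    Assumed continuous-cache form (Grave et al. 2017):
        p_cache(w) ∝ sum_i 1{w_i = w} * exp(theta * h · h_i)
        p(w) = (1 - lam) * p_lm(w) + lam * p_cache(w)

    p_lm:         base LM distribution over the vocabulary, shape (V,)
    hidden:       current hidden state, shape (d,)
    cache_states: hidden states from the last N steps, shape (N, d)
    cache_tokens: token ids emitted at those steps, shape (N,)
    lam, theta:   ensembling factor and temperature (0.25 / 0.75 per the paper)
    """
    scores = cache_states @ hidden                   # similarity to cached states
    weights = np.exp(theta * (scores - scores.max()))  # shift for stability
    weights /= weights.sum()                         # normalize over cache entries
    p_cache = np.zeros(vocab_size)
    np.add.at(p_cache, cache_tokens, weights)        # sum mass for repeated token ids
    return (1.0 - lam) * p_lm + lam * p_cache
```

With λ_Cache = 0.25, the cache contributes a quarter of the final distribution; the "step size" of 500 quoted above would, under this reading, bound how much recent history the cache retains.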