Compressed Nonparametric Language Modelling
Authors: Ehsan Shareghi, Gholamreza Haffari, Trevor Cohn
IJCAI 2017
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | The experimental results illustrate that our model can be built on significantly larger datasets compared to previous HPYP models, while being several orders of magnitude smaller, fast for training and inference, and outperforming the perplexity of the state-of-the-art Modified Kneser-Ney count-based LM smoothing by up to 15%. |
| Researcher Affiliation | Academia | Faculty of Information Technology, Monash University; Computing and Information Systems, The University of Melbourne; first.last@{monash.edu, unimelb.edu.au} |
| Pseudocode | Yes | Algorithm 1: Gibbs Sampler for η^{γ+} |
| Open Source Code | No | The paper does not provide concrete access to the source code for the methodology described in this paper. A link to a third-party baseline (SM) is provided, but not to the authors' own implementation. |
| Open Datasets | Yes | We report the perplexity of KN, MKN, SM, and our approach CN using the Finnish (FI), Spanish (ES), German (DE), English (EN), and French (FR) portions of the Europarl v7 [Koehn, 2005] corpus, as well as 250MiB, 500MiB, and 1, 2, 4, and 8GiB chunks of the English Common Crawl corpus [Buck et al., 2014]. |
| Dataset Splits | No | The paper uses "newstest-2014" and "newstest-2013" as test sets, and provides train and test set sizes in Table 4, but it does not specify explicit validation dataset splits (e.g., percentages or counts for a validation set). |
| Hardware Specification | Yes | All experiments are done on a single core of an Intel Xeon E5-2667 3.2GHz with 180GiB of RAM. |
| Software Dependencies | No | The paper mentions using the SRILM toolkit for Kneser-Ney and Modified Kneser-Ney perplexity measurements, but it does not specify a version for SRILM or for any other software dependency. |
| Experiment Setup | Yes | In our model, the discount parameters are set to Kneser-Ney discounts and tied based on the context size \|u\|, while each distribution uses its own separate concentration parameter. ... Instead, we follow a non-uniform sampling by shrinking the range to 1 ≤ t_{uw} ≤ min{M, n_{uw}} (here M = 10). A sketch of this truncated sampling range follows the table. |
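
To make the quoted sampling restriction concrete, below is a minimal Python sketch of one Gibbs step that resamples a table count t_{uw} over the truncated range 1 ≤ t_{uw} ≤ min{M, n_{uw}}. The function name `resample_table_count` and the `weight_fn` callback are illustrative assumptions, not the paper's API; the paper's actual sampler (Algorithm 1) derives the posterior weights from the HPYP discount and concentration parameters, which are not reproduced here.

```python
import random

M = 10  # truncation constant reported in the paper's experiment setup

def resample_table_count(n_uw, weight_fn, rng=random):
    """One Gibbs step for a table count t_uw (illustrative sketch).

    Candidates are restricted to 1 <= t_uw <= min(M, n_uw), mirroring the
    non-uniform sampling range quoted in the Experiment Setup row above.

    weight_fn(t) is a hypothetical stand-in for the unnormalised posterior
    weight of candidate t; in an HPYP it would depend on the discount and
    concentration parameters of the restaurant hierarchy.
    """
    candidates = list(range(1, min(M, n_uw) + 1))
    weights = [weight_fn(t) for t in candidates]
    return rng.choices(candidates, weights=weights, k=1)[0]

# Usage with a toy, hypothetical weight function:
# t_new = resample_table_count(n_uw=37, weight_fn=lambda t: 1.0 / t)
```

The point of the truncation is cost: a full sweep over 1..n_{uw} grows with the token count, whereas the bounded range caps each step at M candidates.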