Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
A Fast Variational Approach for Learning Markov Random Field Language Models
Authors: Yacine Jernite, Alexander Rush, David Sontag
ICML 2015 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimentally, we demonstrate the quality of the models learned by our algorithm by applying it to a language modelling task. Additionally we show that this same estimation algorithm can be effectively applied to other common sequence modelling tasks such as part-of-speech tagging. |
| Researcher Affiliation | Collaboration | Yacine Jernite EMAIL CIMS, New York University, 251 Mercer Street, New York, NY 10012, USA Alexander M. Rush EMAIL Facebook AI Research, 770 Broadway, New York, NY 10003, USA David Sontag EMAIL CIMS, New York University, 251 Mercer Street, New York, NY 10012, USA |
| Pseudocode | Yes | Algorithm 1 Tightening the bound; Algorithm 2 Gradient ascent |
| Open Source Code | No | The paper states: 'Our implementation of the algorithm uses the Torch numerical framework (http://torch.ch/) and runs on the GPU for efficiency.' This refers to a third-party framework the authors used, not to a release of their own source code for the method. |
| Open Datasets | Yes | For language modelling we ran experiments on the Penn Treebank (PTB) corpus with the standard language modelling setup: sections 0-20 for training (N = 930k), sections 21-22 for validation (N = 74k) and sections 23-24 (N = 82k) for test. |
| Dataset Splits | Yes | For language modelling we ran experiments on the Penn Treebank (PTB) corpus with the standard language modelling setup: sections 0-20 for training (N = 930k), sections 21-22 for validation (N = 74k) and sections 23-24 (N = 82k) for test. |
| Hardware Specification | No | The paper states: 'Our implementation of the algorithm uses the Torch numerical framework (http://torch.ch/) and runs on the GPU for efficiency.' This mention of 'the GPU' is too general and does not specify any model or hardware details. |
| Software Dependencies | No | The paper mentions 'Our implementation of the algorithm uses the Torch numerical framework (http://torch.ch/)'. However, it does not specify a version number for Torch or any other software dependencies. |
| Experiment Setup | Yes | For model parameter optimization (the gradient step in Algorithm 2) we use L-BFGS (Liu & Nocedal, 1989) with backtracking line-search. For tightening the bound (Algorithm 1), we used 200 sub-gradient iterations, each requiring a round of belief propagation. Our sub-gradient rate parameter α was set as α = 10³/2ᵗ where t is the number of preceding iterations where the dual objective did not decrease. |
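
The Open Datasets and Dataset Splits rows quote the standard Penn Treebank language-modelling setup. The snippet below is only an illustrative restatement of that split as a configuration; the section ranges and approximate token counts come from the quoted text, not from any released preprocessing code of the authors.

```python
# Illustrative restatement of the standard PTB language-modelling split
# described in the paper (sections and approximate token counts only).
PTB_SPLIT = {
    "train":      {"sections": range(0, 21),  "approx_tokens": 930_000},  # sections 0-20
    "validation": {"sections": range(21, 23), "approx_tokens": 74_000},   # sections 21-22
    "test":       {"sections": range(23, 25), "approx_tokens": 82_000},   # sections 23-24
}
```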
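The Experiment Setup row describes a sub-gradient loop for tightening the bound (Algorithm 1) whose step size halves whenever the dual objective fails to decrease, with 200 iterations and one round of belief propagation per iteration. The sketch below illustrates only that step-size schedule under those assumptions; the `dual_objective` and `subgradient` callbacks and the list representation of the dual variables are hypothetical placeholders, not the authors' implementation (which uses Torch and runs on the GPU).

```python
def tighten_bound(dual_objective, subgradient, lam, num_iters=200, alpha0=1e3):
    """Minimal sketch of the sub-gradient schedule quoted above:
    alpha = 10^3 / 2^t, where t counts the preceding iterations in which
    the dual objective did not decrease.

    `dual_objective(lam)` and `subgradient(lam)` are hypothetical callbacks;
    per the paper, each sub-gradient evaluation requires a round of belief
    propagation.
    """
    best = float("inf")
    t = 0  # preceding iterations with no decrease in the dual objective
    for _ in range(num_iters):
        g = subgradient(lam)                 # one round of belief propagation
        alpha = alpha0 / (2 ** t)            # halving step-size schedule
        lam = [l - alpha * gi for l, gi in zip(lam, g)]  # sub-gradient step on the dual
        val = dual_objective(lam)
        if val < best:
            best = val
        else:
            t += 1                           # dual did not decrease: shrink the step
    return lam
```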