A Fast Variational Approach for Learning Markov Random Field Language Models

Authors: Yacine Jernite, Alexander Rush, David Sontag

Venue: ICML 2015

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Experimentally, we demonstrate the quality of the models learned by our algorithm by applying it to a language modelling task. Additionally we show that this same estimation algorithm can be effectively applied to other common sequence modelling tasks such as part-of-speech tagging."
Researcher Affiliation | Collaboration | Yacine Jernite (jernite@cs.nyu.edu), CIMS, New York University, 251 Mercer Street, New York, NY 10012, USA; Alexander M. Rush (srush@seas.harvard.edu), Facebook AI Research, 770 Broadway, New York, NY 10003, USA; David Sontag (dsontag@cs.nyu.edu), CIMS, New York University, 251 Mercer Street, New York, NY 10012, USA
Pseudocode | Yes | Algorithm 1: Tightening the bound; Algorithm 2: Gradient ascent
Open Source Code | No | The paper states: "Our implementation of the algorithm uses the Torch numerical framework (http://torch.ch/) and runs on the GPU for efficiency." This refers to a third-party framework the authors used, not to a release of their own source code for the method.
Open Datasets | Yes | "For language modelling we ran experiments on the Penn Treebank (PTB) corpus with the standard language modelling setup: sections 0-20 for training (N = 930k), sections 21-22 for validation (N = 74k) and sections 23-24 (N = 82k) for test."
Dataset Splits | Yes | "For language modelling we ran experiments on the Penn Treebank (PTB) corpus with the standard language modelling setup: sections 0-20 for training (N = 930k), sections 21-22 for validation (N = 74k) and sections 23-24 (N = 82k) for test." (The splits are also summarised in the first sketch after this table.)
Hardware Specification | No | The paper states: "Our implementation of the algorithm uses the Torch numerical framework (http://torch.ch/) and runs on the GPU for efficiency." The reference to "the GPU" is too general and does not identify a specific model or any other hardware details.
Software Dependencies | No | The paper mentions "Our implementation of the algorithm uses the Torch numerical framework (http://torch.ch/)", but it does not give a version number for Torch or list any other software dependencies.
Experiment Setup | Yes | "For model parameter optimization (the gradient step in Algorithm 2) we use L-BFGS (Liu & Nocedal, 1989) with backtracking line-search. For tightening the bound (Algorithm 1), we used 200 sub-gradient iterations, each requiring a round of belief propagation. Our sub-gradient rate parameter α was set as α = 10^3/2^t where t is the number of preceding iterations where the dual objective did not decrease." (A schematic sketch of this alternating procedure is given after the table.)
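
For convenience, the split reported in the Dataset Splits row can be written as a small configuration dict. This is only an illustrative summary of the section ranges and approximate token counts quoted above; the dict name and layout are assumptions, not the authors' code.

```python
# Hypothetical summary of the PTB language-modelling split quoted above;
# names and layout are illustrative only.
PTB_SPLITS = {
    "train":      {"sections": "0-20",  "approx_tokens": 930_000},
    "validation": {"sections": "21-22", "approx_tokens": 74_000},
    "test":       {"sections": "23-24", "approx_tokens": 82_000},
}
```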
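
The Pseudocode and Experiment Setup rows describe a two-level procedure: an outer gradient step on the model parameters (L-BFGS with backtracking line search in the paper) and an inner sub-gradient loop that tightens the variational bound, with the step size halved whenever the dual objective fails to decrease. The sketch below is a minimal, runnable illustration of that loop structure only, assuming the step-size rule α = 10^3/2^t quoted above; the toy quadratic dual, the single plain gradient step, and all function names are stand-ins for the paper's belief-propagation-based objective and L-BFGS update, not the authors' implementation.

```python
# A minimal sketch of the alternating procedure reported above
# (cf. "Algorithm 1: Tightening the bound" and "Algorithm 2: Gradient ascent").
# NOT the authors' code: the dual, sub-gradient, and likelihood gradient are
# toy quadratic stand-ins, and a plain gradient step replaces L-BFGS.

import numpy as np


def tighten_bound(dual_fn, subgrad_fn, lam, n_iters=200, alpha0=1e3):
    """Inner loop (Algorithm 1): sub-gradient descent on the dual variables.

    Step size follows the rule quoted above, alpha = 10^3 / 2^t, where t is
    the number of preceding iterations in which the dual did not decrease.
    In the paper each dual evaluation requires a round of belief propagation;
    here it is just a cheap toy function.
    """
    best = np.inf
    t = 0  # iterations so far without a decrease of the dual objective
    for _ in range(n_iters):
        val = dual_fn(lam)
        if val < best:
            best = val
        else:
            t += 1  # the dual did not decrease: halve the step size
        alpha = alpha0 / (2.0 ** t)
        lam = lam - alpha * subgrad_fn(lam)
    return lam, best


def train(theta, lam, like_grad, dual_fn, subgrad_fn, n_outer=5, lr=0.1):
    """Outer loop (Algorithm 2): gradient ascent on the model parameters,
    re-tightening the variational bound before each parameter update."""
    for _ in range(n_outer):
        lam, bound = tighten_bound(dual_fn, subgrad_fn, lam)
        theta = theta + lr * like_grad(theta, lam)  # plain ascent step (paper: L-BFGS)
    return theta, lam


if __name__ == "__main__":
    # Toy quadratic dual, scaled so the first sub-gradient step overshoots
    # and the step-halving rule visibly kicks in.
    c = 3e-3
    dual_fn = lambda lam: 0.5 * c * lam @ lam
    subgrad_fn = lambda lam: c * lam
    like_grad = lambda theta, lam: -(theta - lam)  # pulls theta toward lam
    theta, lam = train(np.ones(3), 10.0 * np.ones(3), like_grad, dual_fn, subgrad_fn)
    print("theta:", theta, "lam:", lam)
```

The step-halving rule is what makes the quoted schedule adaptive: the rate stays at 10^3 while the dual keeps improving and shrinks geometrically once progress stalls, which is why the paper tracks the count of non-decreasing iterations rather than the iteration index itself.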