Maximum Reconstruction Estimation for Generative Latent-Variable Models
Authors: Yong Cheng, Yang Liu, Wei Xu
AAAI 2017 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments on unsupervised part-of-speech induction and unsupervised word alignment show that our approach enables generative latent-variable models to better discover intended correlations in data and outperforms maximum likelihood estimators significantly. |
| Researcher Affiliation | Academia | Yong Cheng, Yang Liu, Wei Xu. Institute for Interdisciplinary Information Sciences, Tsinghua University, Beijing, China; State Key Laboratory of Intelligent Technology and Systems, Tsinghua National Laboratory for Information Science and Technology, Department of Computer Science and Technology, Tsinghua University, Beijing, China. chengyong3001@gmail.com, {liuyang2011, weixu}@tsinghua.edu.cn |
| Pseudocode | No | The paper describes dynamic programming algorithms in text but does not include any structured pseudocode blocks or algorithms. |
| Open Source Code | No | The paper does not provide any links to open-source code or explicitly state that the code for the described methodology is publicly available. |
| Open Datasets | Yes | We split the English Penn Treebank into two parts: 46K sentences for training and test and 1K sentences for optimizing hyper-parameters of the exponentiated gradient (EG) algorithm with adaptive learning rate. ... We used the FBIS corpus as the training corpus, which contains 240K Chinese-English parallel sentences with 6.9M Chinese words and 8.9M English words. |
| Dataset Splits | Yes | We split the English Penn Treebank into two parts: 46K sentences for training and test and 1K sentences for optimizing hyper-parameters of the exponentiated gradient (EG) algorithm with adaptive learning rate. ... We used the Tsinghua Aligner development and test sets (Liu and Sun 2015), which both contain 450 sentence pairs with gold-standard annotations. |
| Hardware Specification | No | The paper does not provide any specific hardware details used for running the experiments. |
| Software Dependencies | No | The paper does not provide specific version numbers for any software dependencies. |
| Experiment Setup | Yes | The EM algorithm for maximum likelihood estimation runs for 100 iterations and the EG algorithm with adaptive learning rate runs for 50 iterations with initialization of a basic HMM (Ammar, Dyer, and Smith 2014). The number of hidden states in HMMs is set to 50, which is close to the size of the POS tag set. ... Both MLE and MRE use the following training scheme: 5 iterations for IBM Model 1 and 5 iterations for IBM Model 2. |
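
The experiment setup quoted above can be illustrated with a small sketch. The following is a minimal, hypothetical rendering (not the authors' implementation, which is not publicly released) of the MLE baseline it describes: standard Baum-Welch EM training of a discrete HMM with 50 hidden states for 100 iterations, as used for the unsupervised POS induction experiments. Corpus loading from the Penn Treebank, the exponentiated-gradient MRE training, and the IBM Model 1/2 alignment experiments are omitted, and all function and variable names here are illustrative.

```python
"""Minimal sketch of the quoted MLE baseline: Baum-Welch EM for a discrete HMM
(50 hidden states, 100 EM iterations). Not the authors' code."""
import numpy as np

NUM_STATES = 50      # "The number of hidden states in HMMs is set to 50"
EM_ITERATIONS = 100  # "The EM algorithm for maximum likelihood estimation runs for 100 iterations"


def forward_backward(pi, A, B, obs):
    """Scaled forward-backward pass for one observation sequence (word-index array)."""
    T, K = len(obs), len(pi)
    alpha = np.zeros((T, K))
    beta = np.zeros((T, K))
    scale = np.zeros(T)

    alpha[0] = pi * B[:, obs[0]]
    scale[0] = alpha[0].sum()
    alpha[0] /= scale[0]
    for t in range(1, T):
        alpha[t] = (alpha[t - 1] @ A) * B[:, obs[t]]
        scale[t] = alpha[t].sum()
        alpha[t] /= scale[t]

    beta[T - 1] = 1.0
    for t in range(T - 2, -1, -1):
        beta[t] = (A @ (B[:, obs[t + 1]] * beta[t + 1])) / scale[t + 1]
    return alpha, beta, scale


def baum_welch(corpus, vocab_size, num_states=NUM_STATES, iterations=EM_ITERATIONS, seed=0):
    """Plain EM (MLE) training of a discrete HMM; a stand-in for the baseline only."""
    rng = np.random.default_rng(seed)
    pi = rng.dirichlet(np.ones(num_states))
    A = rng.dirichlet(np.ones(num_states), size=num_states)
    B = rng.dirichlet(np.ones(vocab_size), size=num_states)

    for _ in range(iterations):
        pi_acc = np.zeros(num_states)
        A_acc = np.zeros((num_states, num_states))
        B_acc = np.zeros((num_states, vocab_size))
        for obs in corpus:
            alpha, beta, scale = forward_backward(pi, A, B, obs)
            gamma = alpha * beta
            gamma /= gamma.sum(axis=1, keepdims=True)
            pi_acc += gamma[0]
            for t in range(len(obs) - 1):
                # Expected transition counts for position t -> t + 1.
                xi = (alpha[t][:, None] * A * B[:, obs[t + 1]] * beta[t + 1]) / scale[t + 1]
                A_acc += xi
            for t, w in enumerate(obs):
                B_acc[:, w] += gamma[t]
        # M-step: renormalize the expected counts.
        pi = pi_acc / pi_acc.sum()
        A = A_acc / A_acc.sum(axis=1, keepdims=True)
        B = B_acc / B_acc.sum(axis=1, keepdims=True)
    return pi, A, B


if __name__ == "__main__":
    # Tiny synthetic corpus just to show the interface; the real experiments used the Penn Treebank.
    toy_corpus = [np.array([0, 1, 2, 1]), np.array([2, 0, 1])]
    pi, A, B = baum_welch(toy_corpus, vocab_size=3, num_states=5, iterations=10)
    print(B.shape)
```

The paper's MRE training with the exponentiated-gradient algorithm (50 iterations, adaptive learning rate, initialized from a basic HMM) would replace the plain M-step updates above with reconstruction-based updates; this sketch only fixes the baseline configuration quoted in the table.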