Maximum Reconstruction Estimation for Generative Latent-Variable Models
Authors: Yong Cheng, Yang Liu, Wei Xu
AAAI 2017 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments on unsupervised part-of-speech induction and unsupervised word alignment show that our approach enables generative latent-variable models to better discover intended correlations in data and outperforms maximum likelihood estimators significantly. |
| Researcher Affiliation | Academia | Yong Cheng, Yang Liu, Wei Xu. Institute for Interdisciplinary Information Sciences, Tsinghua University, Beijing, China; State Key Laboratory of Intelligent Technology and Systems, Tsinghua National Laboratory for Information Science and Technology, Department of Computer Science and Technology, Tsinghua University, Beijing, China. chengyong3001@gmail.com, {liuyang2011, weixu}@tsinghua.edu.cn |
| Pseudocode | No | The paper describes dynamic programming algorithms in text but does not include any structured pseudocode blocks or algorithms. |
| Open Source Code | No | The paper does not provide any links to open-source code or explicitly state that the code for the described methodology is publicly available. |
| Open Datasets | Yes | We split the English Penn Treebank into two parts: 46K sentences for training and test and 1K sentences for optimizing hyper-parameters of the exponentiated gradient (EG) algorithm with adaptive learning rate. ... We used the FBIS corpus as the training corpus, which contains 240K Chinese-English parallel sentences with 6.9M Chinese words and 8.9M English words. |
| Dataset Splits | Yes | We split the English Penn Treebank into two parts: 46K sentences for training and test and 1K sentences for optimizing hyper-parameters of the exponentiated gradient (EG) algorithm with adaptive learning rate. ... We used the Tsinghua Aligner development and test sets (Liu and Sun 2015), which both contain 450 sentence pairs with gold-standard annotations. |
| Hardware Specification | No | The paper does not provide any specific hardware details used for running the experiments. |
| Software Dependencies | No | The paper does not provide specific version numbers for any software dependencies. |
| Experiment Setup | Yes | The EM algorithm for maximum likelihood estimation runs for 100 iterations and the EG algorithm with adaptive learning rate runs for 50 iterations with initialization of a basic HMM (Ammar, Dyer, and Smith 2014). The number of hidden states in HMMs is set to 50, which is close to the size of the POS tag set. ... Both MLE and MRE use the following training scheme: 5 iterations for IBM Model 1 and 5 iterations for IBM Model 2. |
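
The experiment setup quoted above can be illustrated with a small sketch. The following is a minimal, hypothetical rendering (not the authors' implementation, which is not publicly released) of the MLE baseline it describes: standard Baum-Welch EM training of a discrete HMM with 50 hidden states for 100 iterations, as used for the unsupervised POS induction experiments. Corpus loading from the Penn Treebank, the exponentiated-gradient MRE training, and the IBM Model 1/2 alignment experiments are omitted, and all function and variable names here are illustrative.

```python
"""Minimal sketch of the quoted MLE baseline: Baum-Welch EM for a discrete HMM
(50 hidden states, 100 EM iterations). Not the authors' code."""
import numpy as np

NUM_STATES = 50      # "The number of hidden states in HMMs is set to 50"
EM_ITERATIONS = 100  # "The EM algorithm for maximum likelihood estimation runs for 100 iterations"


def forward_backward(pi, A, B, obs):
    """Scaled forward-backward pass for one observation sequence (word-index array)."""
    T, K = len(obs), len(pi)
    alpha = np.zeros((T, K))
    beta = np.zeros((T, K))
    scale = np.zeros(T)

    alpha[0] = pi * B[:, obs[0]]
    scale[0] = alpha[0].sum()
    alpha[0] /= scale[0]
    for t in range(1, T):
        alpha[t] = (alpha[t - 1] @ A) * B[:, obs[t]]
        scale[t] = alpha[t].sum()
        alpha[t] /= scale[t]

    beta[T - 1] = 1.0
    for t in range(T - 2, -1, -1):
        beta[t] = (A @ (B[:, obs[t + 1]] * beta[t + 1])) / scale[t + 1]
    return alpha, beta, scale


def baum_welch(corpus, vocab_size, num_states=NUM_STATES, iterations=EM_ITERATIONS, seed=0):
    """Plain EM (MLE) training of a discrete HMM; a stand-in for the baseline only."""
    rng = np.random.default_rng(seed)
    pi = rng.dirichlet(np.ones(num_states))
    A = rng.dirichlet(np.ones(num_states), size=num_states)
    B = rng.dirichlet(np.ones(vocab_size), size=num_states)

    for _ in range(iterations):
        pi_acc = np.zeros(num_states)
        A_acc = np.zeros((num_states, num_states))
        B_acc = np.zeros((num_states, vocab_size))
        for obs in corpus:
            alpha, beta, scale = forward_backward(pi, A, B, obs)
            gamma = alpha * beta
            gamma /= gamma.sum(axis=1, keepdims=True)
            pi_acc += gamma[0]
            for t in range(len(obs) - 1):
                # Expected transition counts for position t -> t + 1.
                xi = (alpha[t][:, None] * A * B[:, obs[t + 1]] * beta[t + 1]) / scale[t + 1]
                A_acc += xi
            for t, w in enumerate(obs):
                B_acc[:, w] += gamma[t]
        # M-step: renormalize the expected counts.
        pi = pi_acc / pi_acc.sum()
        A = A_acc / A_acc.sum(axis=1, keepdims=True)
        B = B_acc / B_acc.sum(axis=1, keepdims=True)
    return pi, A, B


if __name__ == "__main__":
    # Tiny synthetic corpus just to show the interface; the real experiments used the Penn Treebank.
    toy_corpus = [np.array([0, 1, 2, 1]), np.array([2, 0, 1])]
    pi, A, B = baum_welch(toy_corpus, vocab_size=3, num_states=5, iterations=10)
    print(B.shape)
```

The paper's MRE training with the exponentiated-gradient algorithm (50 iterations, adaptive learning rate, initialized from a basic HMM) would replace the plain M-step updates above with reconstruction-based updates; this sketch only fixes the baseline configuration quoted in the table.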