Discriminative Reordering Model Adaptation via Structural Learning
Authors: Biao Zhang, Jinsong Su, Deyi Xiong, Hong Duan, Junfeng Yao
IJCAI 2015
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We conduct experiments on the NIST Chinese-to-English translation task to evaluate our model. |
| Researcher Affiliation | Academia | Xiamen University, Xiamen, China 361005; Soochow University, Suzhou, China 215006 |
| Pseudocode | No | The paper describes its model and learning steps in text but does not include any structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide an explicit statement or link for the open-source code of the described methodology. |
| Open Datasets | Yes | The out-of-domain (newswire) training corpus comes from the FBIS corpus and Hansards part of LDC2004T07 corpus. We used the Chinese Sohu weblog in 2009 and the English Blog Authorship corpus as the in-domain (weblog) monolingual corpora in the source language and target language, respectively. (Sohu weblog: http://blog.sohu.com/; Blog Authorship corpus: http://u.cs.biu.ac.il/~koppel/BlogCorpus.html) |
| Dataset Splits | Yes | we used the web part of NIST 06 MT evaluation test data as our development set to obtain the optimal parameters |
| Hardware Specification | No | The paper does not provide specific details regarding the hardware used for experiments. |
| Software Dependencies | No | The paper mentions several software tools like GIZA++, mkcls tool, Classias toolkit, SRILM toolkit, and SVDLIBC, but does not provide specific version numbers for these dependencies. |
| Experiment Setup | Yes | Following Prettenhofer and Stein [2010], we empirically set the number of iterations and the regularization parameter as 10^6 and 10^-5 for SGD, respectively. We empirically set ϵ1 = ϵ2 = 1000, sl = sr = 3, and used the Lanczos algorithm implemented by SVDLIBC to compute the SVD of the dense parameter matrix W, similar to Blitzer et al. [2006]. In this process, negative values in W were set as 0 to yield a sparse representation. Considering the tradeoff between performance and efficiency, we conducted SMT experiments setting the reduced dimension to 80 and the pivot feature number to 200. |
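The SVD step quoted above (zero out negative entries of the pivot-predictor weight matrix W, then take a rank-80 truncation) can be sketched as follows. This is a minimal illustration, not the authors' code: NumPy's dense SVD stands in for SVDLIBC's Lanczos solver, and the matrix shapes and the function name `project_features` are assumptions for the example.

```python
import numpy as np

def project_features(W, k=80):
    """Hedged sketch of the SCL-style SVD step described in the paper:
    negative values in the dense parameter matrix W are set to 0,
    then the top-k left singular vectors form the low-dimensional
    projection (k = 80, the reduced dimension used in the paper)."""
    W = np.maximum(W, 0.0)  # negatives -> 0, yielding a sparser matrix
    # NumPy's dense SVD stands in for SVDLIBC's Lanczos algorithm here
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    theta = U[:, :k].T  # k x d projection matrix
    return theta

# Toy usage: 200 pivot predictors (the paper's pivot feature number)
# over a hypothetical 1000-dimensional feature space.
rng = np.random.default_rng(0)
W = rng.standard_normal((1000, 200))
theta = project_features(W, k=80)
print(theta.shape)  # (80, 1000)
```

New features are then obtained by multiplying the original feature vectors with `theta`, which reduces them to the 80-dimensional space used in the SMT experiments.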