Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Topic-Based Dissimilarity and Sensitivity Models for Translation Rule Selection

Authors: M. Zhang, X. Xiao, D. Xiong, Q. Liu

JAIR 2014 | Venue PDF | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experiments on Chinese-English translation tasks (Section 7) show that our topic-based translation rule selection model can substantially improve translation quality.
Researcher Affiliation | Academia | Min Zhang (Provincial Key Laboratory for Computer Information Processing Technology, Soochow University, Suzhou, China); Xinyan Xiao (IIP Key Lab, Institute of Computing Technology, Chinese Academy of Sciences, China); Deyi Xiong (Provincial Key Laboratory for Computer Information Processing Technology, Soochow University, Suzhou, China); Qun Liu (CNGL, School of Computing, Dublin City University, Ireland; IIP Key Lab, Institute of Computing Technology, Chinese Academy of Sciences, China)
Pseudocode | No | The paper describes methods using mathematical formulations and descriptive text, but no explicit pseudocode or algorithm blocks are provided.
Open Source Code | No | We used the open source LDA tool GibbsLDA++. GibbsLDA++ is an implementation of LDA using Gibbs sampling for parameter estimation and inference.
Open Datasets | Yes | We carried out our experiments on NIST Chinese-to-English translation. We used the NIST evaluation set of 2005 (MT05) as our development set, and sets of MT06/MT08 as the test sets. ... In our medium-scale experiments, we used the FBIS corpus as our bilingual training data... In our large-scale experiments, the bilingual training data consists of LDC2003E14, LDC2004T07, LDC2005T06, LDC2005T10 and LDC2004T08 (Hong Kong Hansards/Laws/News). ... We used the SRILM toolkit (Stolcke, 2002) to train language models on the Xinhua portion of the GIGAWORD corpus.
Dataset Splits | Yes | We used the NIST evaluation set of 2005 (MT05) as our development set, and sets of MT06/MT08 as the test sets. ... In our medium-scale experiments, we used the FBIS corpus as our bilingual training data... In our large-scale experiments, the bilingual training data consists of LDC2003E14, LDC2004T07, LDC2005T06, LDC2005T10 and LDC2004T08 (Hong Kong Hansards/Laws/News).
Hardware Specification | No | The paper does not provide specific details about the hardware used, such as GPU or CPU models.
Software Dependencies | No | We used the SRILM toolkit (Stolcke, 2002) to train language models... We obtained symmetric word alignments of training data by first running GIZA++ (Och & Ney, 2003)... We used the open source LDA tool GibbsLDA++.
Experiment Setup | Yes | We used minimum error rate training (MERT) (Och, 2003) to optimize the feature weights. ... We set the number of topics K = 30 for both the source- and target-side topic models, and used the default settings of the tool for training and inference. ... We trained a 4-gram language model for our medium-scale experiments and a 5-gram language model for our large-scale experiments. ... we ran the tuning process three times for all our large-scale experiments and presented the average BLEU scores over the three runs.
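The GibbsLDA++ tool cited under Open Source Code and Software Dependencies implements LDA via collapsed Gibbs sampling, the same family of inference the paper relies on for its K = 30 topic models. As a minimal, self-contained sketch of that technique (this is illustrative Python, not the tool's actual C++ code; `docs` as lists of integer word ids, symmetric priors `alpha`/`beta` are assumptions):

```python
import random

def lda_gibbs(docs, K, iters=200, alpha=0.5, beta=0.1, seed=0):
    """Collapsed Gibbs sampling for LDA over docs given as lists of word ids."""
    rng = random.Random(seed)
    V = max(w for d in docs for w in d) + 1
    ndk = [[0] * K for _ in docs]        # document-topic counts
    nkw = [[0] * V for _ in range(K)]    # topic-word counts
    nk = [0] * K                         # per-topic totals
    z = []                               # z[d][i]: topic of token i in doc d
    for d, doc in enumerate(docs):       # random initialization
        zd = []
        for w in doc:
            k = rng.randrange(K)
            zd.append(k)
            ndk[d][k] += 1; nkw[k][w] += 1; nk[k] += 1
        z.append(zd)
    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                k = z[d][i]              # remove token, then resample its topic
                ndk[d][k] -= 1; nkw[k][w] -= 1; nk[k] -= 1
                # unnormalized full conditional p(z = j | rest)
                probs = [(ndk[d][j] + alpha) * (nkw[j][w] + beta) / (nk[j] + V * beta)
                         for j in range(K)]
                r = rng.random() * sum(probs)
                k = K - 1                # fallback against float residue
                for j, p in enumerate(probs):
                    r -= p
                    if r <= 0:
                        k = j
                        break
                z[d][i] = k
                ndk[d][k] += 1; nkw[k][w] += 1; nk[k] += 1
    # posterior mean document-topic distributions
    return [[(ndk[d][j] + alpha) / (len(doc) + K * alpha) for j in range(K)]
            for d, doc in enumerate(docs)]
```

Each returned row is a document's topic distribution and sums to one by construction; GibbsLDA++ additionally estimates the topic-word distributions and exposes burn-in and sampling-lag options.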
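The Software Dependencies row notes that symmetric word alignments were obtained by first running GIZA++ in each direction. The quote does not say which symmetrization heuristic the authors applied, so the following is a hedged sketch of one common family (intersection plus a simplified grow-style pass over union-only points), with alignments as sets of `(src, tgt)` index pairs:

```python
def symmetrize(fwd, bwd):
    """Symmetrize two directional alignments given as sets of (src, tgt) pairs.

    Keeps the high-precision intersection, then adds union-only points that
    connect an otherwise unaligned word (a simplified grow-style heuristic,
    not necessarily the one used in the paper).
    """
    inter = fwd & bwd
    union = fwd | bwd
    aligned_src = {s for s, _ in inter}
    aligned_tgt = {t for _, t in inter}
    out = set(inter)
    for s, t in union - inter:
        if s not in aligned_src or t not in aligned_tgt:
            out.add((s, t))
            aligned_src.add(s)
            aligned_tgt.add(t)
    return out
```

For example, `symmetrize({(0, 0), (1, 1)}, {(0, 0), (2, 1)})` keeps the agreed point `(0, 0)` and adds both directional points because each covers a word the intersection left unaligned.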
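The Experiment Setup row reports average BLEU over three MERT tuning runs, a common guard against tuning variance. As a rough illustration of what is being averaged, here is a simplified single-reference, sentence-level BLEU (uniform n-gram weights, add-one smoothing on higher-order precisions) plus the run averaging; this is a stand-in sketch, not the NIST/multi-reference BLEU scorer the paper would have used:

```python
import math
from collections import Counter

def bleu(candidate, reference, max_n=4):
    """Simplified single-reference BLEU with brevity penalty."""
    def ngrams(toks, n):
        return Counter(tuple(toks[i:i + n]) for i in range(len(toks) - n + 1))
    log_p = 0.0
    for n in range(1, max_n + 1):
        cand, ref = ngrams(candidate, n), ngrams(reference, n)
        match = sum(min(c, ref[g]) for g, c in cand.items())
        total = max(sum(cand.values()), 1)
        # clipped n-gram precision; smooth orders > 1 to avoid zeros
        p = match / total if n == 1 else (match + 1) / (total + 1)
        if p == 0:
            return 0.0
        log_p += math.log(p) / max_n
    bp = min(1.0, math.exp(1 - len(reference) / max(len(candidate), 1)))
    return bp * math.exp(log_p)

def average_bleu(run_scores):
    """Mean BLEU over independent tuning runs, as reported for the large-scale results."""
    return sum(run_scores) / len(run_scores)
```

A perfect hypothesis scores 1.0 under this sketch, and `average_bleu([0.30, 0.40, 0.50])` returns 0.40; in practice each run's score would come from a full test-set evaluation.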