Dual Inference for Machine Learning
Authors: Yingce Xia, Jiang Bian, Tao Qin, Nenghai Yu, Tie-Yan Liu
IJCAI 2017
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Empirical studies on three pairs of dual tasks (machine translation, sentiment analysis, and image processing) have shown that dual inference can significantly improve the inference performance of each individual task. Table 1 compares the BLEU scores of various inference methods... Figure 1 shows the BLEU scores of dual inference... Table 3 compares the accuracy of sentiment classification... Table 5 shows the error rate of two image classifiers... |
| Researcher Affiliation | Collaboration | ¹University of Science and Technology of China, Hefei, Anhui, China; ²Microsoft Research Asia, Beijing, China |
| Pseudocode | Yes | The dual inference for the primal task of neural machine translation is shown as follows: 1. Translate source x with beam search by model f and get K candidates ŷ_i, i ∈ [K] (K is the beam size); 2. i* = arg min_{i ∈ [K]} α ℓ_f(x, ŷ_i) + (1 − α) ℓ_g(x, ŷ_i), where ℓ_f and ℓ_g are defined in Eqn. (4); 3. Return ŷ_{i*} as the translation of x. |
| Open Source Code | No | The paper does not provide any specific repository links, explicit code release statements, or mention code in supplementary materials for the methodology described. |
| Open Datasets | Yes | For NMT: 'the bilingual training data are part of WMT 14, consisting of 4.5M sentence pairs for En→De and 12M for En→Fr, respectively. Data from http://www.statmt.org/wmt14/translation-task.html'. For Sentiment Analysis: 'we use the IMDB movie review dataset [Maas et al., 2011]... http://ai.stanford.edu/~amaas/data/sentiment/'. For Image Processing: 'we use CIFAR-10 dataset'. |
| Dataset Splits | Yes | For NMT: 'We concatenate newstest2012 and newstest2013 as the validation sets and use newstest2014 as the test sets'. For Sentiment Analysis: 'We split 3750 sentences from the training as the validation set'. For Image Processing: 'We split 5k images away from the training data as the validation set'. |
| Hardware Specification | No | The paper does not provide specific hardware details such as GPU/CPU models, processor types, or memory amounts used for running its experiments. |
| Software Dependencies | No | The paper mentions various models and tools such as 'RNNSearch', 'dual-NMT', 'LSTM based RNN', 'Pixel CNN++', and 'multi-bleu.pl', but does not provide specific version numbers for any software dependencies. |
| Experiment Setup | Yes | For dual inference: 'α and β are hyperparameters to balance the tradeoff between two losses, and they will be tuned based on performance on a validation set'. For NMT: 'K candidates ŷ_i, i ∈ [K] (K is the beam size)'. For Sentiment Analysis: 'we set 500 dimension embedding size and 1024 dimension hidden node size'. |
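The reranking step in the pseudocode row above can be sketched in a few lines. This is a hedged illustration, not the authors' code: `dual_inference`, `loss_f`, and `loss_g` are hypothetical names standing in for the primal loss ℓ_f(x, ŷ) and dual loss ℓ_g(x, ŷ) of the paper, and the toy losses below are made-up numbers.

```python
# Sketch of dual-inference reranking over beam-search candidates,
# assuming hypothetical per-candidate loss functions loss_f (primal
# model f) and loss_g (dual model g). Not the authors' implementation.

def dual_inference(candidates, loss_f, loss_g, alpha=0.5):
    """Return the candidate minimizing the combined dual loss.

    candidates: list of K candidate outputs y_hat for a fixed source x
    loss_f(y):  loss under the primal model f (e.g. -log P(y|x))
    loss_g(y):  loss under the dual model g  (e.g. -log P(x|y))
    alpha:      tradeoff hyperparameter, tuned on a validation set
    """
    return min(candidates,
               key=lambda y: alpha * loss_f(y) + (1 - alpha) * loss_g(y))

# Toy usage with stand-in losses for three beam candidates.
cands = ["a", "b", "c"]
f_loss = {"a": 1.0, "b": 0.8, "c": 1.2}
g_loss = {"a": 0.9, "b": 0.5, "c": 0.4}
best = dual_inference(cands, f_loss.get, g_loss.get, alpha=0.5)
# combined losses: a -> 0.95, b -> 0.65, c -> 0.80, so "b" is selected
```

With α = 1 the selection reduces to standard beam-search decoding under f alone; smaller α gives the dual model g more say in reranking, which matches the paper's description of tuning α on the validation set.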