reproducibilityindex.ai

Meta Back-Translation

Authors: Hieu Pham, Xinyi Wang, Yiming Yang, Graham Neubig

ICLR 2021

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	In our evaluations in both the standard datasets WMT En-De 14 and WMT En-Fr 14, as well as a multilingual translation setting, our method leads to significant improvements over strong baselines.
Researcher Affiliation	Academia	Anonymous authors. Paper under double-blind review.
Pseudocode	No	The paper does not contain any clearly labeled pseudocode or algorithm blocks. Figure 1 is an illustrative example, not pseudocode.
Open Source Code	No	The paper states it uses existing architectures and frameworks ('Transformer-Base architecture (Vaswani et al., 2017)' and 'fairseq (Ott et al., 2019)') but does not provide a link or explicit statement for its own source code for Meta BT.
Open Datasets	Yes	For the standard setting, we consider two large datasets: WMT En-De 2014 and WMT En-Fr 2014, tokenized with SentencePiece (Kudo & Richardson, 2018) using a joint vocabulary size of 32K for each dataset. ... The multilingual setting uses the multilingual TED talk dataset (Qi et al., 2018).
Dataset Splits	Yes	For the standard setting, we consider two large datasets: WMT En-De 2014 and WMT En-Fr 2014 [footnote: http://www.statmt.org/wmt14/]. ... we also have a separate validation set for hyper-parameter tuning and model selection.
Hardware Specification	Yes	All experiments are conducted on 8 NVIDIA V100 GPUs.
Software Dependencies	No	The paper mentions 'fairseq (Ott et al., 2019)' and 'Adam (Kingma & Ba, 2015)' but does not provide specific version numbers for these software components.
Experiment Setup	Yes	Optimizer: Adam (Kingma & Ba, 2015) with β1 = 0.9 and β2 = 0.98. The initial learning rate is 5e-4 and is warmed up for 4000 steps, then decayed using inverse square root. Label smoothing: 0.1. Dropout: 0.3. Min-max batching for parallel data, with 4096 tokens per batch. For monolingual data, the batch size is 64 sentences for WMT En-De, 16 for WMT En-Fr, and 8 for the multilingual data.
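
The learning-rate schedule in the Experiment Setup row (warm up for 4000 steps, then decay by inverse square root) can be sketched as follows. This is a minimal illustration, not the authors' code. It assumes that 5e-4 is the peak rate reached at the end of warmup and that warmup is linear from zero; the paper excerpt states neither. The function name and constants are hypothetical.

```python
import math

# Values taken from the Experiment Setup row above.
PEAK_LR = 5e-4        # assumption: treated as the rate reached after warmup
WARMUP_STEPS = 4000


def inverse_sqrt_lr(step, peak_lr=PEAK_LR, warmup_steps=WARMUP_STEPS):
    """Return the learning rate at a 1-indexed training step.

    Assumption: linear warmup from 0 to peak_lr, then decay
    proportional to 1 / sqrt(step).
    """
    if step < 1:
        raise ValueError("step must be >= 1")
    if step <= warmup_steps:
        # Linear warmup phase.
        return peak_lr * step / warmup_steps
    # Inverse-square-root decay, continuous with the warmup at step == warmup_steps.
    return peak_lr * math.sqrt(warmup_steps / step)


if __name__ == "__main__":
    for s in (1, 2000, 4000, 16000):
        print(s, inverse_sqrt_lr(s))
```

Under these assumptions the rate peaks at step 4000 and falls to half the peak at step 16000, since sqrt(4000 / 16000) = 0.5.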