Order-Agnostic Cross Entropy for Non-Autoregressive Machine Translation

Authors: Cunxiao Du, Zhaopeng Tu, Jing Jiang

ICML 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments on major WMT benchmarks show that OAXE substantially improves translation performance, setting new state of the art for fully NAT models.
Researcher Affiliation | Collaboration | 1 School of Computing and Information Systems, Singapore Management University, Singapore. Work was done when Cunxiao Du was under the Rhino-Bird Elite Training Program of Tencent AI Lab. 2 Tencent AI Lab, China.
Pseudocode | Yes | We use the Hungarian algorithm to efficiently implement OAXE (e.g., 7 lines of core code; see Appendix A.1). A sketch of such a matching-based loss is given after this table.
Open Source Code | Yes | Our code, data, and trained models are available at https://github.com/tencent-ailab/ICML21_OAXE.
Open Datasets | Yes | We conducted experiments on major benchmarking datasets that are widely used in previous NAT studies (Gu et al., 2018; Shao et al., 2020; Ma et al., 2019; Saharia et al., 2020): WMT14 English↔German (En↔De, 4.5M sentence pairs) and WMT16 English↔Romanian (En↔Ro, 0.6M sentence pairs). ... We use the dataset released by Ott et al. (2018) for evaluating translation uncertainty, which consists of ten human translations for 500 sentences taken from the WMT14 En-De test set.
Dataset Splits | Yes | The training set consists of 300K instances, in which the target is an ordering sampled from a given set of ordering modes according to a categorical distribution. Both the validation and test sets consist of 3K instances, and all the ordering modes serve as references for the test sets. A sketch of this data construction is given after this table.
Hardware Specification | No | The paper mentions that the Hungarian matching was implemented with a CPU-only Python package and that training is 1.36 times slower, but it does not provide specific details about the CPU or any other hardware, such as GPU models, memory, or cloud instance types, used for the experiments.
Software Dependencies | No | The paper mentions software such as the Python package scipy, PyTorch, Adam, and Fairseq, but it does not provide specific version numbers for these dependencies.
Experiment Setup | Yes | We trained batches of approximately 128K tokens using Adam (Kingma & Ba, 2015). The learning rate warmed up to 5 × 10⁻⁴ in the first 10K steps and then decayed with the inverse square-root schedule. We trained all models for 300K steps, measured the validation BLEU at the end of each epoch, and averaged the 5 best checkpoints. A sketch of this learning-rate schedule is given after this table.
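
The Pseudocode row reports that OAXE is implemented with the Hungarian algorithm in about 7 lines of core code (Appendix A.1 of the paper), using the CPU version of the scipy package. The sketch below is not the authors' code; it is a minimal, per-sentence illustration of how such a loss could be assembled with scipy.optimize.linear_sum_assignment, where the helper name oaxe_loss and the tensor shapes are assumptions.

import torch
import torch.nn.functional as F
from scipy.optimize import linear_sum_assignment


def oaxe_loss(logits: torch.Tensor, targets: torch.Tensor) -> torch.Tensor:
    """Order-agnostic cross entropy for a single sentence (hypothetical helper).

    logits:  (T, V) unnormalized scores for T output positions over a vocabulary of size V
    targets: (T,)   reference token ids; their order is ignored by the loss
    """
    log_probs = F.log_softmax(logits, dim=-1)              # (T, V)
    # cost[i, j] = -log p(output position i emits reference token j)
    cost = -log_probs[:, targets].detach().cpu().numpy()   # (T, T)
    # Hungarian matching: lowest-cost one-to-one alignment between
    # output positions and reference tokens (runs on CPU via scipy).
    row_ind, col_ind = linear_sum_assignment(cost)
    row_ind = torch.as_tensor(row_ind, device=logits.device)
    col_ind = torch.as_tensor(col_ind, device=logits.device)
    # Cross entropy of the reference tokens under their best ordering.
    return -log_probs[row_ind, targets[col_ind]].mean()

For batched training one would additionally mask padding positions and either loop over sentences or vectorize the cost construction; the authors' actual implementation is the one referenced in Appendix A.1 of the paper.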
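The Dataset Splits row describes synthetic training targets that are orderings sampled from a fixed set of ordering modes under a categorical distribution. The snippet below is only a hedged illustration of that kind of construction; ORDERING_MODES, MODE_PROBS, and make_instance are hypothetical and are not taken from the paper's data pipeline.

import random

# Hypothetical ordering modes: each mode is a permutation of target positions.
ORDERING_MODES = [
    [0, 1, 2, 3],   # left-to-right
    [3, 2, 1, 0],   # right-to-left
    [1, 0, 3, 2],   # swapped pairs
]
MODE_PROBS = [0.5, 0.3, 0.2]  # assumed categorical distribution over the modes


def make_instance(reference_tokens):
    """Sample one ordering mode and reorder the reference accordingly."""
    mode = random.choices(ORDERING_MODES, weights=MODE_PROBS, k=1)[0]
    return [reference_tokens[i] for i in mode]


# One synthetic training target for the reference ["A", "B", "C", "D"];
# at test time, all orderings produced by ORDERING_MODES count as references.
print(make_instance(["A", "B", "C", "D"]))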
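The Experiment Setup row specifies a warm-up to 5 × 10⁻⁴ over the first 10K steps followed by inverse square-root decay. The sketch below shows one way to express that schedule; the helper name inverse_sqrt_lr and the commented LambdaLR wiring with a base learning rate of 1.0 are assumptions, not the authors' Fairseq training script.

PEAK_LR = 5e-4          # learning rate reached at the end of warm-up (from the paper)
WARMUP_STEPS = 10_000   # warm-up duration reported in the paper


def inverse_sqrt_lr(step: int) -> float:
    """Linear warm-up to PEAK_LR, then inverse square-root decay."""
    step = max(step, 1)
    if step <= WARMUP_STEPS:
        return PEAK_LR * step / WARMUP_STEPS
    return PEAK_LR * (WARMUP_STEPS ** 0.5) / (step ** 0.5)


print(inverse_sqrt_lr(10_000))  # 0.0005 at the end of warm-up
print(inverse_sqrt_lr(40_000))  # 0.00025 after inverse square-root decay

# Hypothetical wiring into an optimizer (assumes PyTorch and a model are available):
# with a base lr of 1.0, LambdaLR treats the returned value as the absolute learning rate.
# optimizer = torch.optim.Adam(model.parameters(), lr=1.0)
# scheduler = torch.optim.lr_scheduler.LambdaLR(optimizer, lr_lambda=inverse_sqrt_lr)

At step 10,000 the schedule returns the peak rate of 5e-4, and by step 40,000 it has decayed to 2.5e-4, matching the inverse square-root shape described in the setup.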