Improving Non-Autoregressive Translation Models Without Distillation

Authors: Xiao Shi Huang, Felipe Perez, Maksims Volkovs

ICLR 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We evaluate our approach on multiple public NMT datasets: IWSLT 14 De-En/En-De, WMT 14 De-En/En-De, and WMT 16 Ro-En/En-Ro. We use the same training/validation/test sets as in previous work (Ghazvininejad et al., 2019) and report test set performance in BLEU for direct comparison. For each dataset we compute performance on both raw and distilled settings, resulting in 12 datasets in total. (See the BLEU-scoring sketch after the table.)
Researcher Affiliation | Industry | Xiao Shi Huang, Felipe Pérez, Maksims Volkovs, Layer 6 AI, {gary,felipe,maks}@layer6.ai
Pseudocode | Yes | Algorithm 1: CMLMC Training. (See the training-objective sketch after the table.)
Open Source Code | Yes | Code for this work is available here: https://github.com/layer6ai-labs/CMLMC.
Open Datasets | Yes | We evaluate our approach on multiple public NMT datasets: IWSLT 14 De-En/En-De, WMT 14 De-En/En-De, and WMT 16 Ro-En/En-Ro. We use the same training/validation/test sets as in previous work (Ghazvininejad et al., 2019).
Dataset Splits | Yes | We use the same training/validation/test sets as in previous work (Ghazvininejad et al., 2019).
Hardware Specification | Yes | We train the models on the IBM servers with 160 POWER9 CPUs, 600GB RAM and 4 Tesla V100 GPUs (32G).
Software Dependencies | No | The paper mentions using the Fairseq library and the Adam optimizer but does not provide specific version numbers for these software components.
Experiment Setup | Yes | Hyper-parameters for each dataset are selected through grid search and are listed in Table B.1 in the Appendix.
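The Research Type row quotes the paper's evaluation protocol: test-set BLEU on each raw and distilled dataset. As an illustration only, the snippet below scores detokenized hypotheses against references with sacrebleu; the file names are hypothetical, and the paper's actual scoring pipeline (for example, tokenized BLEU through fairseq's generation scripts) may differ.

```python
import sacrebleu

# Hypothetical file names; the paper's own evaluation pipeline is not shown here.
with open("test.hyp.detok") as f:
    hypotheses = [line.strip() for line in f]
with open("test.ref.detok") as f:
    references = [line.strip() for line in f]

# corpus_bleu takes a list of hypothesis strings and a list of reference streams.
bleu = sacrebleu.corpus_bleu(hypotheses, [references])
print(f"BLEU = {bleu.score:.2f}")
```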
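The Pseudocode row refers to Algorithm 1 (CMLMC Training) in the paper. The sketch below is not a reproduction of that algorithm; it is a minimal PyTorch illustration, under the assumption that CMLMC combines the standard CMLM masked-prediction loss with an additional correction loss on observed target tokens that have been corrupted. Position sampling, the source of the corrupted tokens, loss weighting, and padding handling in the actual algorithm may differ; `model`, `mask_id`, and `corrupt_prob` are placeholders.

```python
import torch
import torch.nn.functional as F

def cmlmc_style_loss(model, src_tokens, tgt_tokens, mask_id, vocab_size, corrupt_prob=0.15):
    """Illustrative CMLM-style loss with an added correction term (not the paper's Algorithm 1).

    A random subset of target positions is masked and predicted (standard CMLM); a further
    subset of the observed positions is replaced with random tokens, and the model is trained
    to recover the original tokens (correction term). Padding handling is omitted for brevity.
    """
    bsz, tgt_len = tgt_tokens.shape
    device = tgt_tokens.device

    # Sample how many positions to mask per sentence (uniform over 1..tgt_len, as in CMLM).
    num_to_mask = torch.randint(1, tgt_len + 1, (bsz, 1), device=device)
    ranks = torch.rand(bsz, tgt_len, device=device).argsort(dim=1).argsort(dim=1)
    masked = ranks < num_to_mask                      # boolean mask of positions to predict

    # Corrupt a fraction of the remaining observed tokens with random vocabulary items.
    corrupted = (~masked) & (torch.rand(bsz, tgt_len, device=device) < corrupt_prob)
    noise = torch.randint(0, vocab_size, (bsz, tgt_len), device=device)

    decoder_input = tgt_tokens.clone()
    decoder_input[masked] = mask_id
    decoder_input[corrupted] = noise[corrupted]

    logits = model(src_tokens, decoder_input)         # (bsz, tgt_len, vocab_size)

    # Masked-prediction loss plus correction loss on the corrupted observed positions.
    prediction_loss = F.cross_entropy(logits[masked], tgt_tokens[masked])
    correction_loss = (F.cross_entropy(logits[corrupted], tgt_tokens[corrupted])
                       if corrupted.any() else logits.new_zeros(()))
    return prediction_loss + correction_loss
```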