Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
Task-Level Curriculum Learning for Non-Autoregressive Neural Machine Translation
Authors: Jinglin Liu, Yi Ren, Xu Tan, Chen Zhang, Tao Qin, Zhou Zhao, Tie-Yan Liu
IJCAI 2020 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments on IWSLT14 De-En, IWSLT16 En-De, WMT14 En-De and De-En datasets show that TCL-NAT achieves significant accuracy improvements over previous NAT baselines and reduces the performance gap between NAT and AT models to 1-2 BLEU points, demonstrating the effectiveness of our proposed method. |
| Researcher Affiliation | Collaboration | ¹Zhejiang University, ²Microsoft Research Asia; EMAIL, EMAIL |
| Pseudocode | No | The paper describes its model architecture and process with textual descriptions and a figure (Figure 1), but does not provide structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide any explicit statements about releasing source code or links to a code repository for their method. |
| Open Datasets | Yes | We evaluate our method on three standard translation datasets: IWSLT14 German-to-English (De-En) dataset, IWSLT16 English-to-German (En-De) dataset and WMT14 English-to-German (En-De) dataset. Following Li et al. [2019], we reverse WMT14 English-to-German to get WMT14 German-to-English (De-En) dataset. |
| Dataset Splits | Yes | IWSLT14 dataset contains 153k/7k/7k parallel bilingual sentences for training/dev/test set respectively; IWSLT16 dataset contains 195k/1k/1k parallel bilingual sentences for training/dev/test set and WMT14 dataset contains 4.5M parallel sentence pairs for training sets, where newstest2014 and newstest2013 are used as test and validation set respectively, following previous works [Gu et al., 2018; Guo et al., 2019b]. |
| Hardware Specification | Yes | We run the training procedure on 8 NVIDIA Tesla P100 GPUs for WMT and 2 NVIDIA 2080Ti GPUs for IWSLT datasets respectively. |
| Software Dependencies | No | The paper states, 'We implement our model on Tensor2Tensor [Vaswani et al., 2018].' However, it does not specify the version number of Tensor2Tensor or any other software dependencies with version numbers. |
| Experiment Setup | Yes | We follow Guo et al. [2019b] for configuration hyperparameters: For WMT14 datasets, we use the hyperparameters of a base Transformer (d_model = d_hidden = 512, n_layer = 6, n_head = 8). For IWSLT14 and IWSLT16 datasets, we utilize a small Transformer (d_model = d_hidden = 256, n_layer = 6, n_head = 4). ... We train all models using Adam following the optimizer settings and learning rate schedule in Transformer [Vaswani et al., 2017]. ... The training steps of each phase are listed in Table 2. |
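
The Experiment Setup row can be read as two model configurations plus the learning rate schedule from Vaswani et al. [2017]. The following is a minimal sketch of that reading, not the authors' Tensor2Tensor code: the names `TransformerConfig`, `WMT14_BASE`, `IWSLT_SMALL` and `noam_learning_rate` are hypothetical, and the default of 4000 warmup steps is an assumption taken from the original Transformer paper, since the quoted text does not state the value used here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TransformerConfig:
    """Model sizes as quoted in the Experiment Setup row."""
    d_model: int
    d_hidden: int
    n_layer: int
    n_head: int


# Base Transformer, reported for the WMT14 datasets.
WMT14_BASE = TransformerConfig(d_model=512, d_hidden=512, n_layer=6, n_head=8)
# Small Transformer, reported for IWSLT14 and IWSLT16.
IWSLT_SMALL = TransformerConfig(d_model=256, d_hidden=256, n_layer=6, n_head=4)


def noam_learning_rate(step: int, d_model: int, warmup_steps: int = 4000) -> float:
    """Schedule from Vaswani et al. [2017]: linear warmup, then inverse-sqrt decay.

    warmup_steps=4000 is an assumed default; the scorecard does not report it.
    """
    step = max(step, 1)  # avoid 0 ** -0.5 at the first step
    return d_model ** -0.5 * min(step ** -0.5, step * warmup_steps ** -1.5)
```

Under these assumptions the rate rises linearly until `warmup_steps` and decays afterwards, so `noam_learning_rate(4000, WMT14_BASE.d_model)` is the peak value for the base model. Both configurations give 64 dimensions per attention head (`d_model // n_head`).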