Meta-Curriculum Learning for Domain Adaptation in Neural Machine Translation

Authors: Runzhe Zhan, Xuebo Liu, Derek F. Wong, Lidia S. Chao

AAAI 2021, pp. 14310-14318 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experimental results on 10 different low-resource domains show that meta-curriculum learning can improve the translation performance of both familiar and unfamiliar domains.
Researcher Affiliation | Academia | Runzhe Zhan*, Xuebo Liu*, Derek F. Wong, Lidia S. Chao. NLP2CT Lab, Department of Computer and Information Science, University of Macau. nlp2ct.{runzhe,xuebo}@gmail.com, {derekfw,lidiasc}@um.edu.mo
Pseudocode | Yes | Algorithm 1: Meta-Curriculum Learning Policy
Open Source Code | Yes | All the codes and data are freely available at https://github.com/NLP2CT/Meta-Curriculum.
Open Datasets | Yes | The dataset for domain adaptation is made up of ten parallel En-De corpora (Bible-uedin (Christodouloupoulos and Steedman 2015), Books, ECB, EMEA, Global Voices, JRC-Acquis, KDE4, TED2013, WMT-News.v2019) which are publicly available at OPUS (Tiedemann 2012).
Dataset Splits | Yes | For each task T, the token amount of the support set S and query set Q would be approximately limited to 8k and 16k, respectively. Table 1 shows the detailed statistics.
Hardware Specification | No | The paper discusses the software toolkit (fairseq) and optimizers used, but does not provide specific hardware details such as GPU or CPU models, or memory specifications for running the experiments.
Software Dependencies | No | The paper mentions the use of the 'fairseq' toolkit, the 'Moses' tokenizer, and 'SentencePiece', but does not provide specific version numbers for these software components.
Experiment Setup | Yes | Both of them were trained using the Adam optimizer (Kingma and Ba 2015) (β1 = 0.9, β2 = 0.98), but with different learning rates (lr_nlm = 5e-4, lr_finetune = 5e-5, lr_translation = 7e-4, lr_meta = 1e-5). The learning rate scheduler and warm-up policy (n_warmup = 4000) for training the vanilla Transformer are the same as in Vaswani et al. (2017). Furthermore, the number of updating epochs during the adaptation period is strictly limited to 20 to simulate quick adaptation and verify robustness under limited settings.
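
The "Pseudocode" row above only names Algorithm 1 (Meta-Curriculum Learning Policy) without reproducing it. As a rough, non-authoritative illustration of what such a curriculum policy can look like, the sketch below orders candidate sentences from general-looking to domain-specific and gradually unlocks harder data as meta-training progresses; the function names, scoring functions, and schedule are hypothetical stand-ins, not the authors' Algorithm 1.

```python
"""Illustrative sketch only: the real Algorithm 1 is given in the paper.
The scoring functions and linear schedule here are hypothetical."""
import random
from typing import Callable, List, Sequence


def curriculum_order(examples: Sequence[str],
                     general_score: Callable[[str], float],
                     domain_score: Callable[[str], float]) -> List[str]:
    """Order examples from 'general-looking' to 'domain-specific-looking'.

    A smaller (domain - general) score gap is treated as easier/earlier,
    mimicking a curriculum that starts from knowledge shared with the
    general domain and moves toward domain-specific knowledge.
    """
    return sorted(examples, key=lambda s: domain_score(s) - general_score(s))


def sample_task(ordered: Sequence[str], step: int, total_steps: int,
                task_size: int = 4) -> List[str]:
    """Sample one meta-learning task from the currently 'unlocked' prefix.

    The unlocked fraction grows linearly with progress, so early tasks draw
    from easy (general-like) data and later tasks may include harder,
    domain-specific sentences.
    """
    competence = min(1.0, (step + 1) / total_steps)
    unlocked = ordered[: max(task_size, int(competence * len(ordered)))]
    return random.sample(unlocked, min(task_size, len(unlocked)))


if __name__ == "__main__":
    corpus = [f"sentence {i}" for i in range(20)]
    # Toy scores; in practice these would come from language models.
    gen = lambda s: 0.1 * len(s)
    dom = lambda s: 0.1 * (hash(s) % 10)
    ordered = curriculum_order(corpus, gen, dom)
    for step in (0, 5, 9):
        print(step, sample_task(ordered, step, total_steps=10))
```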
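The "Dataset Splits" row quotes per-task token budgets of roughly 8k (support) and 16k (query). A minimal sketch of how such a budgeted split could be produced is shown below; the greedy filling strategy and whitespace token counting are assumptions, not taken from the released code.

```python
"""Hypothetical helper for the ~8k/16k-token support/query budgets quoted
in the 'Dataset Splits' row; not the authors' implementation."""
from typing import List, Sequence, Tuple

Pair = Tuple[str, str]  # (source sentence, target sentence)


def split_task(pairs: Sequence[Pair],
               support_budget: int = 8_000,
               query_budget: int = 16_000) -> Tuple[List[Pair], List[Pair]]:
    """Greedily fill the support set S up to ~8k target tokens, then the
    query set Q up to ~16k target tokens; leftover pairs are discarded."""
    support: List[Pair] = []
    query: List[Pair] = []
    s_tokens = q_tokens = 0
    for src, tgt in pairs:
        n = len(tgt.split())  # crude whitespace token count
        if s_tokens + n <= support_budget:
            support.append((src, tgt))
            s_tokens += n
        elif q_tokens + n <= query_budget:
            query.append((src, tgt))
            q_tokens += n
    return support, query
```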
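The "Experiment Setup" row quotes Adam with β1 = 0.9, β2 = 0.98, per-stage learning rates, a 4000-step warm-up, and a 20-epoch adaptation limit. Below is a hedged PyTorch sketch of those optimizer settings, assuming the standard Transformer-style inverse-square-root schedule relative to a peak learning rate; the placeholder model and the choice of 7e-4 (the quoted lr_translation) are illustrative only.

```python
"""Hedged sketch of the quoted optimizer settings; the model and peak
learning rate are placeholders, not the paper's actual training script."""
import math
import torch


def make_optimizer(model: torch.nn.Module, peak_lr: float = 7e-4,
                   warmup: int = 4000):
    """Adam with the quoted betas; the LambdaLR scheduler reproduces the
    usual inverse-square-root warm-up relative to peak_lr."""
    opt = torch.optim.Adam(model.parameters(), lr=peak_lr, betas=(0.9, 0.98))

    def inv_sqrt(step: int) -> float:  # multiplicative factor on peak_lr
        step = max(step, 1)
        return min(step / warmup, math.sqrt(warmup / step))

    sched = torch.optim.lr_scheduler.LambdaLR(opt, lr_lambda=inv_sqrt)
    return opt, sched


if __name__ == "__main__":
    model = torch.nn.Linear(8, 8)  # stand-in for the Transformer model
    opt, sched = make_optimizer(model, peak_lr=7e-4)  # lr_translation = 7e-4
    for epoch in range(20):  # adaptation limited to 20 epochs in the paper
        opt.step()
        sched.step()
```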