Unsupervised Domain Adaptation of a Pretrained Cross-Lingual Language Model

Authors: Juntao Li, Ruidan He, Hai Ye, Hwee Tou Ng, Lidong Bing, Rui Yan

IJCAI 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experimental results show that our proposed method achieves significant performance improvements over the state-of-the-art pretrained cross-lingual language model in the CLCD setting.
Researcher Affiliation | Collaboration | Juntao Li (1,2), Ruidan He (3), Hai Ye (2), Hwee Tou Ng (2), Lidong Bing (3), and Rui Yan (1). 1: Center for Data Science, Academy for Advanced Interdisciplinary Studies, Peking University; 2: Department of Computer Science, National University of Singapore; 3: DAMO Academy, Alibaba Group.
Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks.
Open Source Code | No | The paper does not provide access to its source code, such as a repository link or an explicit code release statement for the described method.
Open Datasets | Yes | We conduct experiments on the multi-lingual and multi-domain Amazon review dataset [Prettenhofer and Stein, 2010].
Dataset Splits | Yes | There are a training set and a test set for each domain in each language and both consist of 1,000 positive reviews and 1,000 negative reviews. ... We utilize 100 labeled data in the target language and target domain as the validation set, which is used for hyperparameter tuning and model selection during training.
Hardware Specification | No | The paper does not provide specific hardware details (e.g., exact GPU/CPU models, processor types, or memory amounts) used for running its experiments.
Software Dependencies | No | The paper mentions specific models such as XLM and the Adam optimizer, but it does not provide version numbers for the software dependencies or libraries used in its implementation.
Experiment Setup | Yes | The hidden dimension of XLM is 1024. The input and output dimensions of the feedforward layers in both Fs and Fp are 1024. ... All trainable parameters are initialized from a uniform distribution [-0.1, 0.1]. ... both UFD and the task-specific module are optimized by Adam [Kingma and Ba, 2014] with a learning rate of 1 × 10^-4. The batch sizes for training UFD and the task-specific module are set to 16 and 8, respectively. The weights α, β, γ in Equation (6) are set to 1, 0.3, and 1, respectively.
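
For quick reference, the following minimal PyTorch sketch collects the numeric settings quoted in the Experiment Setup row (hidden/feed-forward dimension, uniform initialization range, Adam learning rate, batch sizes, and Equation (6) loss weights). Only those values come from the paper; the `FeedForward` class, the ReLU activation, and the shared optimizer are illustrative assumptions, and the XLM encoder, UFD objectives, and training loop are omitted.

```python
# Illustrative configuration sketch; not the authors' released code.
import torch
import torch.nn as nn

HIDDEN_DIM = 1024                    # hidden dimension of XLM
INIT_RANGE = (-0.1, 0.1)             # uniform init range for trainable parameters
LR = 1e-4                            # Adam learning rate for UFD and task-specific module
BATCH_UFD = 16                       # batch size for training UFD
BATCH_TASK = 8                       # batch size for training the task-specific module
ALPHA, BETA, GAMMA = 1.0, 0.3, 1.0   # loss weights in Equation (6)

class FeedForward(nn.Module):
    """Feed-forward block standing in for Fs / Fp (input and output dims are both 1024).

    The single linear layer and ReLU are assumptions; the paper only reports the dimensions.
    """
    def __init__(self, dim=HIDDEN_DIM):
        super().__init__()
        self.fc = nn.Linear(dim, dim)
        # Initialize all trainable parameters uniformly in [-0.1, 0.1], as reported.
        for p in self.fc.parameters():
            nn.init.uniform_(p, *INIT_RANGE)

    def forward(self, x):
        return torch.relu(self.fc(x))

Fs, Fp = FeedForward(), FeedForward()
optimizer = torch.optim.Adam(list(Fs.parameters()) + list(Fp.parameters()), lr=LR)
```

This sketch is only meant to make the reported hyperparameters concrete; reproducing the method would additionally require the pretrained XLM encoder, the UFD losses combined with the weights above, and the dataset splits described earlier.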