Unsupervised Domain Adaptation of a Pretrained Cross-Lingual Language Model
Authors: Juntao Li, Ruidan He, Hai Ye, Hwee Tou Ng, Lidong Bing, Rui Yan
IJCAI 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental results show that our proposed method achieves significant performance improvements over the state-of-the-art pretrained cross-lingual language model in the CLCD setting. |
| Researcher Affiliation | Collaboration | Juntao Li (1,2), Ruidan He (3), Hai Ye (2), Hwee Tou Ng (2), Lidong Bing (3), and Rui Yan (1); (1) Center for Data Science, Academy for Advanced Interdisciplinary Studies, Peking University; (2) Department of Computer Science, National University of Singapore; (3) DAMO Academy, Alibaba Group |
| Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide concrete access to its source code, nor does it include a specific repository link or an explicit code release statement for the methodology described. |
| Open Datasets | Yes | We conduct experiments on the multi-lingual and multi-domain Amazon review dataset [Prettenhofer and Stein, 2010] |
| Dataset Splits | Yes | There are a training set and a test set for each domain in each language and both consist of 1,000 positive reviews and 1,000 negative reviews. ... We utilize 100 labeled data in the target language and target domain as the validation set, which is used for hyperparameter tuning and model selection during training. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., exact GPU/CPU models, processor types, or memory amounts) used for running its experiments. |
| Software Dependencies | No | The paper mentions using specific models like XLM and optimizers like Adam, but it does not provide specific version numbers for software dependencies or libraries used in their implementation. |
| Experiment Setup | Yes | The hidden dimension of XLM is 1024. The input and output dimensions of the feedforward layers in both Fs and Fp are 1024. ... All trainable parameters are initialized from a uniform distribution [-0.1, 0.1]. ... both UFD and the task-specific module are optimized by Adam [Kingma and Ba, 2014] with a learning rate of 1 × 10⁻⁴. The batch size of training UFD and the task-specific module are set to 16 and 8, respectively. The weights α, β, γ in Equation (6) are set to 1, 0.3, and 1, respectively. |
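
The hyperparameters quoted in the Experiment Setup row can be collected into a minimal configuration sketch. This is not the authors' code: only the dimensions, initialization range, optimizer, learning rate, batch sizes, and loss weights come from the quoted text. The internal structure of Fs and Fp, the two-class task head, and the placeholder loss terms are assumptions added for illustration.

```python
import torch
import torch.nn as nn

# Hyperparameters quoted in the Experiment Setup row above.
HIDDEN_DIM = 1024                    # XLM hidden dimension
LEARNING_RATE = 1e-4                 # Adam learning rate (1 × 10⁻⁴)
BATCH_SIZE_UFD = 16                  # batch size when training the UFD module
BATCH_SIZE_TASK = 8                  # batch size when training the task-specific module
ALPHA, BETA, GAMMA = 1.0, 0.3, 1.0   # loss weights in Equation (6)
INIT_RANGE = 0.1                     # uniform initialization range [-0.1, 0.1]


def init_uniform(module: nn.Module) -> None:
    """Initialize all trainable parameters from U(-0.1, 0.1), as stated in the paper."""
    for p in module.parameters():
        nn.init.uniform_(p, -INIT_RANGE, INIT_RANGE)


class FeedForward(nn.Module):
    """Hypothetical stand-in for Fs / Fp: feedforward layers with 1024-dim input and output.
    The paper excerpt only fixes the dimensions; depth and activation here are assumptions."""

    def __init__(self, dim: int = HIDDEN_DIM):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(dim, dim), nn.ReLU(), nn.Linear(dim, dim))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.net(x)


# Fs and Fp as named in the quoted setup; their exact roles within UFD are not spelled
# out in the excerpts, so they are treated here simply as 1024 -> 1024 feedforward blocks.
f_s, f_p = FeedForward(), FeedForward()
task_head = nn.Linear(HIDDEN_DIM, 2)  # hypothetical binary sentiment classification head

for m in (f_s, f_p, task_head):
    init_uniform(m)

# Both UFD and the task-specific module are optimized by Adam with the quoted learning rate.
ufd_optimizer = torch.optim.Adam(
    list(f_s.parameters()) + list(f_p.parameters()), lr=LEARNING_RATE
)
task_optimizer = torch.optim.Adam(task_head.parameters(), lr=LEARNING_RATE)


def combined_loss(l1: torch.Tensor, l2: torch.Tensor, l3: torch.Tensor) -> torch.Tensor:
    """Weighted objective in the form of Equation (6): α·L1 + β·L2 + γ·L3.
    The individual loss terms are defined in the paper; they are placeholders here."""
    return ALPHA * l1 + BETA * l2 + GAMMA * l3
```

The batch size constants are kept separate because the quoted setup trains the UFD module and the task-specific module with different batch sizes (16 and 8, respectively).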