Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).
Enhancing Cross-lingual Transfer by Manifold Mixup
Authors: Huiyun Yang, Huadong Chen, Hao Zhou, Lei Li
ICLR 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments on the XTREME benchmark show X-MIXUP achieves 1.8% performance gains on multiple text understanding tasks, compared with strong baselines, and reduces the cross-lingual representation discrepancy significantly. |
| Researcher Affiliation | Collaboration | Huiyun Yang¹, Huadong Chen¹, Hao Zhou¹, Lei Li²; ¹ByteDance AI Lab, ²University of California, Santa Barbara |
| Pseudocode | No | The paper does not contain any clearly labeled pseudocode or algorithm blocks. Method details are described in paragraph text and mathematical equations. |
| Open Source Code | Yes | Code is available at https://github.com/yhy1117/X-Mixup. |
| Open Datasets | Yes | We utilize the translate-train and translate-test data from the XTREME repo (https://github.com/google-research/xtreme), which also provide the pseudo-label of translate-train data for classification tasks and question answering tasks. The rest translation data are from Google Translate. |
| Dataset Splits | Yes | We select XNLI, POS, and MLQA as representative tasks to search for the best hyper-parameters. The final model is selected based on the averaged performance of all languages on the dev set. |
| Hardware Specification | Yes | For all tasks, we fine-tune on 8 Nvidia V100-32GB GPU cards with the batch size 64. |
| Software Dependencies | No | The paper mentions using 'Huggingface Transformers' as the backbone model but does not provide specific version numbers for it or any other software dependencies. |
| Experiment Setup | Yes | We perform grid search over the balance training parameter α and learning rate from [0.2, 0.4, 0.6, 0.8] and [3e-6, 5e-6, 2e-5, 3e-5]. We also search for the best manifold mixup layer from [1, 4, 8, 12, 16, 20, 24]. |
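
The search space quoted in the Experiment Setup row can be enumerated as in the minimal sketch below. It is not the authors' code: the names `grid_configs`, `select_best`, and the `evaluate` callable are hypothetical, and it assumes α and the learning rate are searched jointly while the mixup layer is searched separately, which the quoted text does not confirm.

```python
from itertools import product

# Candidate values quoted in the Experiment Setup row.
ALPHAS = [0.2, 0.4, 0.6, 0.8]
LEARNING_RATES = [3e-6, 5e-6, 2e-5, 3e-5]
MIXUP_LAYERS = [1, 4, 8, 12, 16, 20, 24]


def grid_configs():
    """Enumerate every (alpha, learning rate) pair in the grid.

    Assumption: the two parameters are searched jointly.
    """
    return [
        {"alpha": alpha, "learning_rate": lr}
        for alpha, lr in product(ALPHAS, LEARNING_RATES)
    ]


def select_best(configs, evaluate):
    """Return the configuration with the highest score.

    `evaluate` is a hypothetical callable that maps a configuration to a
    score, for example the dev-set performance averaged over all languages.
    """
    return max(configs, key=evaluate)
```

Under these assumptions the joint grid contains 16 configurations, and `MIXUP_LAYERS` adds 7 candidate layers that could be searched the same way by passing `[{"mixup_layer": k} for k in MIXUP_LAYERS]` to `select_best`.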