Robust Optimization for Multilingual Translation with Imbalanced Data
Authors: Xian Li, Hongyu Gong
NeurIPS 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We ran experiments on common benchmarks (TED, WMT and OPUS-100) with varying degrees of data imbalance. CATS effectively improved multilingual optimization and as a result demonstrated consistent gains on low resources (+0.8 to +2.2 BLEU) without hurting high resources. |
| Researcher Affiliation | Industry | Facebook AI {xianl, hygong}@fb.com |
| Pseudocode | Yes | Algorithm 1 Curvature Aware Task Scaling (CATS); an illustrative sketch of the idea is given below the table. |
| Open Source Code | No | The paper does not explicitly state that its source code is released, nor does it link to a code repository for its method. |
| Open Datasets | Yes | Datasets. We experiment on three public benchmarks of multilingual machine translation with varying characteristics of imbalanced data as is shown in Table 1. ...TED [53], WMT [34], OPUS-100 [62]... |
| Dataset Splits | Yes | We choose the best checkpoint by validation perplexity and only use the single best model without ensembling. We use the same preprocessed data provided by the MultiDDS baseline authors [53], and follow the same procedure to preprocess the OPUS-100 data released by the baseline [62]. |
| Hardware Specification | No | The paper mentions training 'with the same compute budget' but does not provide specific hardware details such as GPU/CPU models or memory. |
| Software Dependencies | No | The paper mentions using the 'Transformer architecture' and cites 'fairseq' as a toolkit, but does not provide specific version numbers for any software dependencies. |
| Experiment Setup | Yes | We provide detailed training hyperparameters in Appendix B. |
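Since the paper presents CATS only as pseudocode (Algorithm 1) and no official implementation is linked, the following is a minimal, hypothetical PyTorch sketch of the general curvature-aware task-scaling idea: per-language losses are re-weighted by a cheap curvature proxy (a Rayleigh quotient computed from a Hessian-vector product), so languages whose gradients point into sharper regions of the loss landscape are down-weighted. The function names, the specific curvature proxy, and the inverse-curvature weighting rule are illustrative assumptions, not the authors' Algorithm 1.

```python
import torch

def curvature_proxy(loss, params):
    """Rayleigh quotient g^T H g / (g^T g): a cheap sharpness estimate
    along the task's own gradient direction (an illustrative proxy; the
    paper's exact curvature estimate may differ)."""
    grads = torch.autograd.grad(loss, params, create_graph=True)
    flat_g = torch.cat([g.reshape(-1) for g in grads])
    # Hessian-vector product H v with v = g treated as a constant.
    hv = torch.autograd.grad(flat_g @ flat_g.detach(), params, retain_graph=True)
    flat_hv = torch.cat([h.reshape(-1) for h in hv]).detach()
    g = flat_g.detach()
    return (g @ flat_hv) / (g @ g + 1e-12)

def curvature_scaled_step(model, optimizer, per_language_batches, compute_loss):
    """One multilingual update: scale each language's loss by the inverse
    of its curvature proxy, normalized so the weights sum to one."""
    params = [p for p in model.parameters() if p.requires_grad]
    losses, curvatures = [], []
    for batch in per_language_batches:  # one mini-batch per language pair
        loss = compute_loss(model, batch)
        losses.append(loss)
        curvatures.append(curvature_proxy(loss, params).clamp(min=1e-8))
    inv = torch.stack([1.0 / c for c in curvatures])
    weights = inv / inv.sum()  # down-weight languages in sharper regions
    total = sum(w * l for w, l in zip(weights, losses))
    optimizer.zero_grad()
    total.backward()
    optimizer.step()
    return weights.tolist()
```

This sketch only conveys why curvature information can help rebalance imbalanced multilingual training, consistent with the paper's reported outcome of gains on low-resource languages without hurting high-resource ones; the paper's actual weighting rule and curvature estimation are specified in its Algorithm 1 and Appendix B.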