Multilingual Transfer Learning for QA using Translation as Data Augmentation

Authors: Mihaela Bornea, Lin Pan, Sara Rosenthal, Radu Florian, Avirup Sil (pp. 12583–12591)

AAAI 2021

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | "Empirically, we show that the proposed models outperform the previous zero-shot baseline on the recently introduced multilingual MLQA and TyDi QA datasets." |
| Researcher Affiliation | Industry | Mihaela Bornea, Lin Pan, Sara Rosenthal, Radu Florian, Avirup Sil; IBM Research AI, Thomas J. Watson Research Center, Yorktown Heights, NY 10598; {mabornea,panl,sjrosenthal,raduf,avi}@us.ibm.com |
| Pseudocode | Yes | "Algorithm 1: Pseudo-code for adversarial training on the multilingual QA task. ... Algorithm 2: Pseudo-code for our language arbitration framework for the multilingual QA task." (A hedged sketch of the adversarial training step follows this table.) |
| Open Source Code | No | The paper does not provide an explicit statement or link for the source code of the described methodology. It mentions using the IBM Watson Language Translator, but this is a service, not a code release by the authors. |
| Open Datasets | Yes | "We train our models on the SQuAD v1.1 dataset (details in Table 1). ... TyDi QA: ... train our models on SQuAD v1.1 ..." |
| Dataset Splits | Yes | "We perform hyper-parameter selection on the SQuAD and MLQA dev split." |
| Hardware Specification | No | The paper does not explicitly describe the hardware used to run its experiments, such as specific GPU or CPU models. |
| Software Dependencies | No | The paper mentions 'MBERTQA' and 'MBERT' but does not provide version numbers for any underlying software libraries or dependencies (e.g., Python, PyTorch, or TensorFlow versions). |
| Experiment Setup | Yes | "We use 3 × 10⁻⁵ as the learning rate, 384 as maximum sequence length, and a doc stride of 128. Everything except ZS (zero-shot) was trained for 1 epoch. ... The discriminator is implemented as a multilayer perceptron with 2 hidden layers and a hidden size of 768." (The reported settings are collected into a config sketch after this table.) |
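
Since the authors' code is not released, the following is only a minimal PyTorch sketch of the general technique that Algorithm 1 names: adversarial training that pushes a multilingual QA encoder toward language-invariant representations via a gradient-reversed language discriminator. Everything here is an assumption for illustration, not the paper's implementation: the `encoder` and `qa_head` callables, the `[CLS]`-style pooling, and the reversal weight `lam` are all hypothetical; only the discriminator shape (an MLP with 2 hidden layers, hidden size 768) follows the paper's description.

```python
# Hypothetical sketch of adversarial language-invariant training for
# multilingual QA, in the spirit of the paper's Algorithm 1.
import torch
import torch.nn as nn


class GradReverse(torch.autograd.Function):
    """Identity on the forward pass; flips the gradient sign on the
    backward pass, so the encoder learns to *fool* the discriminator."""

    @staticmethod
    def forward(ctx, x, lam):
        ctx.lam = lam
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        return -ctx.lam * grad_output, None


class LanguageDiscriminator(nn.Module):
    """MLP with 2 hidden layers and hidden size 768, matching the shape
    the paper reports; the number of languages is an assumption."""

    def __init__(self, hidden_size=768, num_languages=7):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(hidden_size, hidden_size), nn.ReLU(),
            nn.Linear(hidden_size, hidden_size), nn.ReLU(),
            nn.Linear(hidden_size, num_languages),
        )

    def forward(self, pooled):
        return self.net(pooled)


def adversarial_step(encoder, qa_head, disc, batch, lam=0.1):
    """One combined update: the usual span-extraction QA loss plus an
    adversarial language-classification loss through gradient reversal.
    `encoder` and `qa_head` are assumed callables standing in for an
    mBERT-style model and its answer-span head."""
    hidden = encoder(batch["input_ids"])                 # (B, T, 768)
    qa_loss = qa_head(hidden, batch["start"], batch["end"])
    pooled = hidden[:, 0]                                # [CLS]-style pooling
    lang_logits = disc(GradReverse.apply(pooled, lam))
    adv_loss = nn.functional.cross_entropy(lang_logits, batch["lang"])
    return qa_loss + adv_loss
```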
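The hyper-parameters quoted in the Experiment Setup row are compact enough to collect into a single reference config. The sketch below records only values the paper states; anything it leaves unstated (optimizer, batch size, random seeds) is deliberately omitted rather than guessed.

```python
# Reported experiment settings, gathered for reference; field names are
# our own labels, values are as quoted from the paper.
config = {
    "learning_rate": 3e-5,       # "3 x 10^-5 as the learning rate"
    "max_seq_length": 384,
    "doc_stride": 128,
    "num_train_epochs": 1,       # "everything except ZS was trained for 1 epoch"
    "discriminator": {
        "type": "MLP",
        "num_hidden_layers": 2,
        "hidden_size": 768,
    },
}
```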