Unsupervised Domain Adaptation on Reading Comprehension
Authors: Yu Cao, Meng Fang, Baosheng Yu, Joey Tianyi Zhou
AAAI 2020, pp. 7480-7487
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments show our approach achieves comparable performance to supervised models on multiple large-scale benchmark datasets. |
| Researcher Affiliation | Collaboration | Yu Cao (1), Meng Fang (2), Baosheng Yu (1), Joey Tianyi Zhou (3). (1) UBTECH Sydney AI Center, School of Computer Science, FEIT, The University of Sydney, Australia; (2) Department of Computer Science, University of Waikato, New Zealand; (3) Institute of High Performance Computing, A*STAR, Singapore |
| Pseudocode | Yes | Algorithm 1: CASe. Given a BERT feature network F, an output network G, and a discriminator D. (A hedged control-flow sketch follows the table.) |
| Open Source Code | Yes | Code available at: https://github.com/caoyu1991/CASe |
| Open Datasets | Yes | SQUAD (Rajpurkar et al. 2016) contains 87k training samples and 11k validation (dev) samples, with questions in natural language given by workers based on paragraphs from Wikipedia. ... CNN and DAILYMAIL (Hermann et al. 2015) contain 374k training and 4k dev samples... |
| Dataset Splits | Yes | SQUAD (Rajpurkar et al. 2016) contains 87k training samples and 11k validation (dev) samples... Table 1: Characterizations of datasets after processing. (lists Train and Dev sample counts) |
| Hardware Specification | No | No specific hardware details (e.g., GPU/CPU models, memory amounts) were found in the paper. It only mentions using a 'BERT implementation in PyTorch' and a 'base-uncased pretrained model'. |
| Software Dependencies | No | The paper mentions 'PyTorch' and the 'Adam optimizer' but does not specify version numbers for any software dependencies. |
| Experiment Setup | Yes | Adam optimizer (Kingma and Ba 2014) is employed with learning rate 3×10⁻⁵ in the source domain training, 2×10⁻⁵ in the self-training and 10⁻⁵ in the adversarial learning, with batch size 12. A dropout with rate 0.2 is applied on both the BERT feature network and the discriminator. We set the epoch number N_pre = 3 in pre-training and N_da = 4 in domain adaptation. ... Generating probability threshold T_prob is set as 0.4 and n_best = 20. (A hedged configuration sketch follows the table.) |
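
The Pseudocode row above names the main components of Algorithm 1 (CASe): a BERT feature network F, an output network G, and a discriminator D, trained by supervised pre-training on the source domain followed by alternating self-training and adversarial adaptation on the target domain. The following is a minimal Python sketch of that control flow only; `qa_step`, `pseudo_label`, and `adv_step` are hypothetical callables standing in for the paper's loss computations, and the schedule is reconstructed from the reported settings rather than taken from the released code.

```python
# Hedged sketch of a CASe-style training schedule (control flow only).
# qa_step, pseudo_label and adv_step are placeholder callables, not the
# authors' implementation.

def case_schedule(qa_step, pseudo_label, adv_step,
                  source_loader, target_loader, n_pre=3, n_da=4):
    """Pre-train on source, then alternate self-training and adversarial epochs."""
    # 1) Supervised pre-training on the labelled source domain
    for _ in range(n_pre):
        for batch in source_loader:
            qa_step(batch)

    # 2) Unsupervised domain adaptation on the target domain
    for _ in range(n_da):
        # Generate pseudo labels on target data, keeping only confident spans
        pseudo_batches = pseudo_label(target_loader)
        for batch in pseudo_batches:
            qa_step(batch)                  # self-training on pseudo labels
        for src, tgt in zip(source_loader, target_loader):
            adv_step(src, tgt)              # align source/target features via D
```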
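The Experiment Setup row lists concrete hyperparameters, but the paper only says the backbone is a "BERT implementation in PyTorch" with a base-uncased pretrained model. The snippet below collects those quoted settings in one place, assuming the HuggingFace `transformers` BERT as the backbone; it is an illustrative configuration, not the authors' released code.

```python
# Illustrative configuration using the hyperparameters quoted above.
# The choice of HuggingFace `transformers` is an assumption; the paper only
# states that a PyTorch BERT base-uncased model was used.

import torch
from transformers import BertModel

# base-uncased BERT with 0.2 dropout on the feature network
feature_net = BertModel.from_pretrained(
    "bert-base-uncased",
    hidden_dropout_prob=0.2,
    attention_probs_dropout_prob=0.2,
)

# Adam optimizers with the stage-specific learning rates from the paper
opt_pretrain = torch.optim.Adam(feature_net.parameters(), lr=3e-5)  # source training
opt_self     = torch.optim.Adam(feature_net.parameters(), lr=2e-5)  # self-training
opt_adv      = torch.optim.Adam(feature_net.parameters(), lr=1e-5)  # adversarial learning

BATCH_SIZE = 12    # batch size for all stages
N_PRE      = 3     # pre-training epochs
N_DA       = 4     # domain-adaptation epochs
T_PROB     = 0.4   # pseudo-label probability threshold
N_BEST     = 20    # n-best spans considered when generating pseudo labels
```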