KnowDA: All-in-One Knowledge Mixture Model for Data Augmentation in Low-Resource NLP
Authors: Yufei Wang, Jiayi Zheng, Can Xu, Xiubo Geng, Tao Shen, Chongyang Tao, Daxin Jiang
ICLR 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments show that i) the synthetic data produced by KnowDA successfully improves the performance of strong pre-trained language models (i.e., BERT, ALBERT and DeBERTa) by a large margin on the low-resource NLP benchmarks FewGLUE, CoNLL-03 and WikiAnn; ii) KnowDA successfully transfers task knowledge to NLP tasks whose types are seen and unseen in KoMT. |
| Researcher Affiliation | Collaboration | Yufei Wang¹, Jiayi Zheng², Can Xu³, Xiubo Geng³, Tao Shen³, Chongyang Tao³, Daxin Jiang³. ¹Macquarie University, Sydney, Australia; ²Peking University, Beijing, China; ³Microsoft Corporation, Beijing, China |
| Pseudocode | No | The paper describes procedures in natural language and illustrates them with figures, but it does not include formal pseudocode blocks or algorithm listings. |
| Open Source Code | Yes | The source code is released at https://github.com/GaryYufei/ICLR2023_KnowDA. |
| Open Datasets | Yes | We conduct low-resource experiments on the FewGLUE (Schick & Schütze, 2020), CoNLL-03 (Sang & De Meulder, 2003), and WikiAnn (Pan et al., 2017) benchmarks. ... Similar to Ye et al. (2021), we select English monolingual datasets with open access in the Huggingface Datasets (Lhoest et al., 2021). (A loading sketch is shown after this table.) |
| Dataset Splits | No | The paper mentions running experiments multiple times with different random seeds and data splits, and specifies the number of training examples (e.g., 32 for FewGLUE, 40 for CoNLL-03, 30 for WikiAnn), but does not provide specific percentages or counts for a validation split. |
| Hardware Specification | Yes | We train KnowDA for 100k steps with a maximum sequence length of 512 and batch size 2048 in a Linux environment with 16 A100 GPUs (40G). Fine-tuning KnowDA is carried out only using a single A100 GPU (40G). |
| Software Dependencies | No | The paper mentions the T5-1.1-Large model and the Adam optimizer, but does not specify version numbers for software dependencies or libraries such as Python or PyTorch. |
| Experiment Setup | Yes | We train KnowDA for 100k steps with a maximum sequence length of 512 and batch size 2048 in a Linux environment with 16 A100 GPUs (40G). Fine-tuning KnowDA is carried out only using a single A100 GPU (40G). We use Adam as the optimizer to train KnowDA. ... we simply fine-tune KnowDA (i.e., updating all parameters) with batch size 12 and a learning rate of 5e-6 for 500 steps. (A fine-tuning sketch is shown after this table.) |
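
As a companion to the Open Datasets row, here is a minimal sketch of how the three benchmarks could be pulled from the Hugging Face Datasets Hub. The dataset identifiers (`super_glue`, `conll2003`, `wikiann`) and the BoolQ subset are assumed standard Hub names, not paths given in the paper; FewGLUE is a 32-example subsample of SuperGLUE, so the full SuperGLUE splits shown here would still need to be subsampled.

```python
from datasets import load_dataset

# FewGLUE is built by subsampling SuperGLUE; "super_glue"/"boolq" is one of
# its tasks on the Hugging Face Hub (identifier assumed, not from the paper).
boolq_train = load_dataset("super_glue", "boolq", split="train")

# CoNLL-03 English NER (Sang & De Meulder, 2003).
conll03_train = load_dataset("conll2003", split="train")

# WikiAnn NER (Pan et al., 2017); the English portion is used here.
wikiann_train = load_dataset("wikiann", "en", split="train")

print(len(boolq_train), len(conll03_train), len(wikiann_train))
```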
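
The Experiment Setup row reports the low-resource fine-tuning recipe: full-parameter updates with Adam, batch size 12, learning rate 5e-6, 500 steps on a single A100. The sketch below mirrors those numbers under stated assumptions: `google/t5-v1_1-large` stands in for the actual KnowDA checkpoint (the paper builds KnowDA on T5-1.1-Large), and the toy (source, target) pairs are placeholders for real task-specific examples.

```python
import torch
from transformers import T5ForConditionalGeneration, T5TokenizerFast

# Stand-in backbone; the paper fine-tunes the pre-trained KnowDA checkpoint,
# which is built on T5-1.1-Large.
model_name = "google/t5-v1_1-large"
model = T5ForConditionalGeneration.from_pretrained(model_name)
tokenizer = T5TokenizerFast.from_pretrained(model_name)

# Reported setup: Adam, learning rate 5e-6, batch size 12, 500 steps,
# updating all parameters.
optimizer = torch.optim.Adam(model.parameters(), lr=5e-6)
toy_pairs = [("generate text for label: positive", "a great movie")] * 12

model.train()
for step in range(500):
    sources, targets = zip(*toy_pairs)  # one batch of 12 examples per step
    inputs = tokenizer(list(sources), return_tensors="pt",
                       padding=True, truncation=True, max_length=512)
    labels = tokenizer(list(targets), return_tensors="pt",
                       padding=True, truncation=True, max_length=512).input_ids
    labels[labels == tokenizer.pad_token_id] = -100  # ignore padding in the loss
    loss = model(**inputs, labels=labels).loss
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
```

In an actual run, the toy batch would be replaced by a dataloader over the few labelled examples per task (32 for FewGLUE, 40 for CoNLL-03, 30 for WikiAnn); the optimizer, step count, and batch size follow the quoted setup.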