KnowDA: All-in-One Knowledge Mixture Model for Data Augmentation in Low-Resource NLP

Authors: Yufei Wang, Jiayi Zheng, Can Xu, Xiubo Geng, Tao Shen, Chongyang Tao, Daxin Jiang

ICLR 2023 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments show that i) the synthetic data produced by KnowDA successfully improves the performance of strong pre-trained language models (i.e., BERT, ALBERT and DeBERTa) by a large margin on the low-resource NLP benchmarks FewGLUE, CoNLL'03 and WikiAnn; ii) KnowDA successfully transfers the task knowledge to NLP tasks whose types are seen and unseen in KoMT.
Researcher Affiliation | Collaboration | Yufei Wang (1), Jiayi Zheng (2), Can Xu (3), Xiubo Geng (3), Tao Shen (3), Chongyang Tao (3), Daxin Jiang (3); (1) Macquarie University, Sydney, Australia; (2) Peking University, Beijing, China; (3) Microsoft Corporation, Beijing, China
Pseudocode | No | The paper describes procedures in natural language and illustrates them with figures, but it does not include formal pseudocode blocks or algorithm listings.
Open Source Code | Yes | The source code is released at https://github.com/GaryYufei/ICLR2023_KnowDA.
Open Datasets | Yes | We conduct low-resource experiments on the FewGLUE (Schick & Schütze, 2020), CoNLL'03 (Sang & De Meulder, 2003), and WikiAnn (Pan et al., 2017) benchmarks. ... Similar to Ye et al. (2021), we select English monolingual datasets with open access in the Huggingface Datasets (Lhoest et al., 2021). See the dataset-loading sketch after the table.
Dataset Splits | No | The paper mentions running experiments multiple times with different random seeds and data splits, and specifies the number of training examples (e.g., 32 for FewGLUE, 40 for CoNLL'03, 30 for WikiAnn), but does not provide specific percentages or counts for a validation dataset split.
Hardware Specification | Yes | We train KnowDA for 100k steps with a maximum sequence length of 512 and batch size 2048 in a Linux environment with 16 A100 GPUs (40G). Fine-tuning KnowDA is carried out using only a single A100 GPU (40G).
Software Dependencies | No | The paper mentions software such as the "T5-1.1-Large model" and "Adam as the optimizer" but does not specify version numbers for general software dependencies or libraries such as Python, PyTorch, etc.
Experiment Setup | Yes | We train KnowDA for 100k steps with a maximum sequence length of 512 and batch size 2048 in a Linux environment with 16 A100 GPUs (40G). Fine-tuning KnowDA is carried out using only a single A100 GPU (40G). We use Adam as the optimizer to train KnowDA. ... we simply fine-tune KnowDA (i.e., updating all parameters) with batch size 12 and a learning rate of 5e-6 for 500 steps. See the fine-tuning sketch after the table.
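
As a hedged illustration of the benchmarks named in the Open Datasets row, the sketch below loads the public datasets through the Huggingface Datasets library and draws low-resource subsets of the sizes mentioned in the Dataset Splits row. The dataset identifiers ("super_glue" standing in for FewGLUE's task data, "conll2003", "wikiann"/"en") and the shuffling seed are assumptions for illustration; they are not taken from the authors' released code.

```python
# Minimal sketch (assumptions, not the authors' code): loading the public
# benchmarks quoted in the Open Datasets row via Huggingface Datasets.
from datasets import load_dataset

rte = load_dataset("super_glue", "rte", split="train")      # FewGLUE tasks reuse SuperGLUE data
conll03 = load_dataset("conll2003", split="train")          # CoNLL'03 NER
wikiann_en = load_dataset("wikiann", "en", split="train")   # WikiAnn English NER

# Low-resource subsets of the sizes mentioned in the Dataset Splits row
# (32 / 40 / 30 training examples); the seed here is an arbitrary choice.
few_rte = rte.shuffle(seed=42).select(range(32))
few_conll = conll03.shuffle(seed=42).select(range(40))
few_wikiann = wikiann_en.shuffle(seed=42).select(range(30))

print(len(few_rte), len(few_conll), len(few_wikiann))  # 32 40 30
```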
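The Experiment Setup row quotes concrete fine-tuning hyperparameters (full-parameter updates, Adam, batch size 12, learning rate 5e-6, 500 steps, maximum sequence length 512). The sketch below shows what such a loop could look like with PyTorch and Transformers; treating a T5-1.1-Large checkpoint as the KnowDA backbone and the placeholder `get_batch` data pipeline are assumptions for illustration, not the released implementation.

```python
# Minimal sketch (assumptions, not the released KnowDA code): full-parameter
# fine-tuning of a T5-1.1-Large backbone with the hyperparameters quoted in
# the Experiment Setup row (Adam, batch size 12, lr 5e-6, 500 steps).
import torch
from transformers import T5ForConditionalGeneration, T5Tokenizer

model_name = "google/t5-v1_1-large"   # assumed backbone; KnowDA adds KoMT pre-training on top
device = "cuda" if torch.cuda.is_available() else "cpu"

tokenizer = T5Tokenizer.from_pretrained(model_name)
model = T5ForConditionalGeneration.from_pretrained(model_name).to(device)
optimizer = torch.optim.Adam(model.parameters(), lr=5e-6)

def get_batch(step, batch_size=12):
    # Placeholder data pipeline (hypothetical): in practice this would yield
    # task-specific (input, target) text pairs for data augmentation.
    inputs = ["augment: a tiny placeholder sentence."] * batch_size
    targets = ["a tiny placeholder sentence."] * batch_size
    return inputs, targets

model.train()
for step in range(500):                              # 500 fine-tuning steps, as quoted
    inputs, targets = get_batch(step)
    enc = tokenizer(inputs, padding=True, truncation=True,
                    max_length=512, return_tensors="pt").to(device)
    labels = tokenizer(targets, padding=True, truncation=True,
                       max_length=512, return_tensors="pt").input_ids.to(device)
    labels[labels == tokenizer.pad_token_id] = -100  # ignore padding tokens in the loss

    loss = model(**enc, labels=labels).loss
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
```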