C2C-GenDA: Cluster-to-Cluster Generation for Data Augmentation of Slot Filling

Authors: Yutai Hou, Sanyuan Chen, Wanxiang Che, Cheng Chen, Ting Liu
Pages: 13027-13035

AAAI 2021 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experiments on ATIS and Snips datasets show that instances augmented by C2C-GenDA improve slot filling by 7.99 (11.9%) and 5.76 (13.6%) F-scores respectively, when there are only hundreds of training utterances.
Researcher Affiliation | Academia | Yutai Hou*, Sanyuan Chen*, Wanxiang Che, Cheng Chen, Ting Liu. Research Center for Social Computing and Information Retrieval, Harbin Institute of Technology, China. {ythou, sychen, car, tliu}@ir.hit.edu.cn, 170400202@stu.hit.edu.cn
Pseudocode | Yes | Algorithm 1: Dispersed Cluster Pairing
Open Source Code | Yes | Code: https://github.com/Sanyuan-Chen/C2C-DA.
Open Datasets | Yes | We conduct experiments on ATIS and Snips datasets. ATIS (Hemphill, Godfrey, and Doddington 1990) is extensively used for slot filling and provides a well-founded comparison for data augmentation methods. [...] Snips (Coucke et al. 2018) dataset is collected from the Snips personal voice assistant.
Dataset Splits | Yes | We use a development set of 500 instances. [...] We use another 700 utterances as the development set.
Hardware Specification | No | The paper mentions using a transformer model and GPT-2, but does not specify any hardware details such as the GPU/CPU models used for training or inference.
Software Dependencies | No | The paper mentions software components such as the 'transformer implemented by Wolf et al. (2019)', 'GPT-2', the 'AdamW (Loshchilov and Hutter 2019) optimizer', 'Bi-LSTM', 'GloVe (Pennington, Socher, and Manning 2014)', and 'Adam (Kingma and Ba 2015)', but it does not provide specific version numbers for any of these software components or libraries.
Experiment Setup | Yes | We used the AdamW (Loshchilov and Hutter 2019) optimizer with initial learning rate 6.25e-5 or 5e-5 for training. We varied λ in {0.1, 0.02, 0.01, 0.002, 0.001} and set γ as 1.0. [...] The dimension of word embeddings and hidden states was set to 300 and 128, respectively. We used GloVe (Pennington, Socher, and Manning 2014) to initialize word embeddings. We varied the training batch size in {16, 128}, set the dropout rate to 0.5, and trained the model with Adam as suggested by Kingma and Ba (2015).
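
As a quick consistency check on the Research Type row (a back-of-the-envelope calculation, not a figure reported in the paper), the absolute and relative improvements can be combined to recover the implied unaugmented baselines, assuming the quoted percentages are relative gains:

    # Back-of-the-envelope check (assumption: the quoted percentages are relative
    # gains over the unaugmented baseline; the baselines themselves are not stated
    # in this row).
    for name, abs_gain, rel_gain in [("ATIS", 7.99, 0.119), ("Snips", 5.76, 0.136)]:
        baseline = abs_gain / rel_gain
        print(f"{name}: implied baseline F1 ~ {baseline:.1f}, augmented F1 ~ {baseline + abs_gain:.1f}")

This yields implied baselines of roughly 67 F1 on ATIS and 42 F1 on Snips, which is plausible for the "only hundreds of training utterances" setting.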
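
To make the Software Dependencies and Experiment Setup rows more concrete, the following is a minimal, hypothetical PyTorch sketch of how the quoted components and hyperparameters could be assembled. It is not the authors' released implementation (see the GitHub link above); the class name, vocabulary size, and label count are illustrative placeholders.

    import torch.nn as nn
    from torch.optim import Adam, AdamW        # AdamW per Loshchilov and Hutter (2019)
    from transformers import GPT2LMHeadModel   # transformer library of Wolf et al. (2019)

    # Generation side: fine-tune GPT-2 with AdamW at the reported initial learning rate.
    gen_model = GPT2LMHeadModel.from_pretrained("gpt2")
    gen_optimizer = AdamW(gen_model.parameters(), lr=6.25e-5)  # the paper also reports 5e-5

    # Slot-filling side: a Bi-LSTM tagger with 300-d (GloVe-initialized) word embeddings,
    # 128-d hidden states, and dropout 0.5, trained with Adam.
    class BiLSTMTagger(nn.Module):
        def __init__(self, vocab_size, num_labels, emb_dim=300, hidden_dim=128, dropout=0.5):
            super().__init__()
            self.embedding = nn.Embedding(vocab_size, emb_dim)  # initialize with GloVe vectors
            self.lstm = nn.LSTM(emb_dim, hidden_dim, batch_first=True, bidirectional=True)
            self.dropout = nn.Dropout(dropout)
            self.classifier = nn.Linear(2 * hidden_dim, num_labels)

        def forward(self, token_ids):
            hidden, _ = self.lstm(self.embedding(token_ids))
            return self.classifier(self.dropout(hidden))

    tagger = BiLSTMTagger(vocab_size=10000, num_labels=120)  # placeholder sizes
    tagger_optimizer = Adam(tagger.parameters())             # batch size varied in {16, 128}

The λ and γ values quoted above belong to the paper's own training objective and are not modeled in this sketch.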