MEDA: Meta-Learning with Data Augmentation for Few-Shot Text Classification

Authors: Pengfei Sun, Yawen Ouyang, Wenming Zhang, Xin-yu Dai

IJCAI 2021 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experiment results show that on both datasets, MEDA outperforms existing state-of-the-art methods and significantly improves the performance of meta-learning on few-shot text classification.
Researcher Affiliation | Academia | National Key Laboratory for Novel Software Technology, Nanjing University. {spf, ouyangyw, zhangwm}@smail.nju.edu.cn, daixinyu@nju.edu.cn
Pseudocode | Yes | Algorithm 1: Training Strategy
Open Source Code | No | The paper does not provide an explicit statement or link indicating that the source code for the proposed MEDA method is openly available. It only mentions building on third-party resources such as pre-trained BERT and the Sentence-Transformers library.
Open Datasets | Yes | To prove the effectiveness of our proposed model, we evaluate MEDA on two publicly available datasets for the few-shot scenario: SNIPS [Coucke et al., 2018] (https://github.com/snipsco/nlu-benchmark/) and ARSC [Yu et al., 2018] (https://github.com/Gorov/DiverseFewShot_Amazon).
Dataset Splits | Yes | SNIPS. As SNIPS is not a benchmark for few-shot learning, we first construct few-shot splits to simulate the few-shot scenario. We divide the original 7 intents into 5 intents as Ctrain and 2 intents (Add To Playlist, Rate Book) as Ctest. Ctrain and Ctest are used as the training set and test set, respectively. Thus, we evaluate performance on Ctest under 2-way-K-shot settings, where K = 3, 5, 10. ARSC. For the ARSC dataset, we partition the data following [Yu et al., 2018]... we also select 12 (4 × 3) tasks from four domains (Books, DVD, Electronics, Kitchen) as the test set... All hyper-parameters of MEDA are cross-validated on the validation set using a coarse grid search. (See the episode-sampling sketch below the table.)
Hardware Specification | No | No specific hardware details (e.g., GPU/CPU models, memory, or cloud instances) used for running the experiments are mentioned in the paper.
Software Dependencies | No | Our implementation is based on PyTorch. We experiment with pre-trained BERT [Devlin et al., 2019] using the Sentence-Transformers codebase [Reimers and Gurevych, 2019]. Adam [Kingma and Ba, 2015] is used to train MEDA in an end-to-end manner. (No version numbers are provided for PyTorch or Sentence-Transformers; see the encoding sketch below the table.)
Experiment Setup | Yes | The initial learning rate is 1e-3. In the loss function, we set λ = 1 and r = 1. To avoid overfitting, we use dropout with a 0.2 dropout rate. We generate 10 samples using different augmentation methods for the given class. (See the configuration sketch below the table.)
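
The SNIPS split described in the Dataset Splits row is straightforward to reproduce. Below is a minimal Python sketch of 2-way-K-shot episode sampling, assuming the utterances are already grouped into a dict mapping each intent to its list of sentences; the function name, the query-set size, and the toy data are illustrative assumptions, not details from the paper.

    import random

    # The paper's split: 5 SNIPS intents form Ctrain, 2 intents form Ctest.
    TEST_INTENTS = {"AddToPlaylist", "RateBook"}

    def sample_episode(data_by_intent, n_way=2, k_shot=5, n_query=10):
        """Sample one n-way-K-shot episode: (support, query) lists of
        (utterance, intent) pairs drawn without overlap from each class."""
        intents = random.sample(sorted(data_by_intent), n_way)
        support, query = [], []
        for intent in intents:
            examples = random.sample(data_by_intent[intent], k_shot + n_query)
            support += [(text, intent) for text in examples[:k_shot]]
            query += [(text, intent) for text in examples[k_shot:]]
        return support, query

    # Toy usage for the 2-way-3-shot setting (K = 3).
    toy_data = {
        "AddToPlaylist": [f"add song {i} to my playlist" for i in range(20)],
        "RateBook": [f"rate book {i} five stars" for i in range(20)],
    }
    support, query = sample_episode(toy_data, n_way=2, k_shot=3)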
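
Because the Software Dependencies row pins no versions, a reproducer has to guess compatible releases. The sketch below shows the BERT sentence-encoding step through the Sentence-Transformers API; the checkpoint name is an assumption, since the paper does not say which pre-trained BERT weights were used.

    from sentence_transformers import SentenceTransformer

    # Checkpoint name is a guess; any BERT-based Sentence-Transformers
    # model exposes the same encode() interface.
    encoder = SentenceTransformer("bert-base-nli-mean-tokens")

    sentences = ["add this song to my playlist", "rate this book five stars"]
    embeddings = encoder.encode(sentences)  # numpy array, one row per sentence
    print(embeddings.shape)  # (2, 768) for a BERT-base encoder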
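
The Experiment Setup row lists enough hyper-parameters to sketch the optimizer and regularization configuration. The model below is a placeholder classifier head standing in for MEDA (the real architecture is defined in the paper), and the composite loss with λ and r is only noted in a comment, since its exact form is not reproduced in this row.

    import torch
    import torch.nn as nn

    # Hyper-parameters reported in the paper.
    LEARNING_RATE = 1e-3
    LAMBDA = 1.0        # lambda in the loss function
    R = 1.0             # r in the loss function
    DROPOUT_RATE = 0.2
    N_AUGMENTED = 10    # augmented samples generated per class

    # Placeholder head: 768 matches the BERT sentence-embedding size,
    # 2 matches the n-way of a SNIPS test episode. Not the MEDA model.
    model = nn.Sequential(nn.Dropout(DROPOUT_RATE), nn.Linear(768, 2))

    # Adam, trained end-to-end, as stated in the Software Dependencies row.
    optimizer = torch.optim.Adam(model.parameters(), lr=LEARNING_RATE)

    # Total loss = classification term + LAMBDA-weighted auxiliary term
    # involving R; see the paper for the exact definition.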