Nonlinear Mixup: Out-Of-Manifold Data Augmentation for Text Classification

Authors: Hongyu Guo

AAAI 2020, pp. 4044-4051

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experiments on benchmark sentence classification datasets indicate that our approach significantly improves upon Mixup. Our empirical studies also show that the out-of-manifold samples generated by our strategy encourage training samples in each class to form a tight representation cluster that is far from others.
Researcher Affiliation | Academia | Hongyu Guo, National Research Council Canada, 1200 Montreal Road, Ottawa, ON, K1A 0R6, hongyu.guo@nrc-cnrc.gc.ca
Pseudocode | No | The paper describes the methods using equations and prose but does not include structured pseudocode or algorithm blocks.
Open Source Code | No | The paper does not provide concrete access to source code for the described methodology.
Open Datasets | Yes | TREC is a question dataset with the aim of categorizing a question into six question types (Li and Roth 2002). MR is a movie review dataset aiming to detect positive/negative reviews (Pang and Lee 2005). SST-1 is the Stanford Sentiment Treebank with five categories: very positive, positive, neutral, negative, and very negative (Socher et al. 2013). SST-2 is the same as SST-1 but with neutral reviews removed and binary labels. Subj is a dataset with the aim of classifying a sentence as being subjective or objective (Pang and Lee 2004).
Dataset Splits | Yes | For datasets without a standard development set we randomly select 10% of training data as development set. (A minimal split sketch follows the table.)
Hardware Specification | No | The paper does not provide specific hardware details (such as GPU or CPU models, or memory) used for running its experiments.
Software Dependencies | No | The paper mentions "Adam" as the optimizer and "GloVe" for word embeddings, but does not provide specific version numbers for these or any other software dependencies.
Experiment Setup | Yes | Specifically, we use filter sizes of 3, 4, and 5, each with 100 feature maps; a dropout rate of 0.5 and L2 regularization of 0.2 for the baseline CNN. For datasets without a standard development set we randomly select 10% of training data as development set. Training is done through Adam (Kingma and Ba 2014) over mini-batches of size 50. The pre-trained word embeddings are 300-dimensional GloVe (Pennington, Socher, and Manning 2014). For the nonlinear Mixup, the mixing policy α is set to the default value of one. The dimension of the label embedding in the nonlinear Mixup is 100. For each dataset, we train each model 10 times, each with 80k steps, and compute their mean test errors and standard deviations. (A hedged configuration sketch also follows the table.)
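
Below is a minimal sketch of the 10% development split mentioned in the Dataset Splits row, assuming a scikit-learn style workflow; the helper name make_dev_split, the fixed seed, and the stratification choice are illustrative assumptions, not details from the paper.

```python
# Hypothetical helper illustrating "10% of training data as development set";
# scikit-learn and stratified sampling are assumptions for this sketch.
from sklearn.model_selection import train_test_split

def make_dev_split(train_texts, train_labels, seed=0):
    """Hold out 10% of the training examples as a development set."""
    tr_x, dev_x, tr_y, dev_y = train_test_split(
        train_texts, train_labels,
        test_size=0.10,          # 10% held out, as described in the paper
        random_state=seed,       # fixed seed for repeatability (assumption)
        stratify=train_labels,   # keep class proportions (assumption)
    )
    return tr_x, tr_y, dev_x, dev_y
```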
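
The following is a hedged configuration sketch of the reported baseline setup (a Kim-style text CNN with filter sizes 3, 4, and 5, 100 feature maps each, dropout 0.5, Adam, mini-batches of 50, 300-dimensional GloVe embeddings, mixing policy α = 1, and a 100-dimensional label embedding). PyTorch is an assumption, as are the class name TextCNN, the illustrative vocabulary size, and the mapping of "L2 regularization of 0.2" onto Adam's weight_decay; the scalar Beta(α, α) draw shows the standard Mixup coefficient that the paper's nonlinear mixing policy generalizes.

```python
import numpy as np
import torch
import torch.nn as nn

class TextCNN(nn.Module):
    """Kim-style text CNN matching the reported hyperparameters (sketch)."""
    def __init__(self, vocab_size, num_classes, embed_dim=300,
                 filter_sizes=(3, 4, 5), feature_maps=100, dropout=0.5):
        super().__init__()
        # In the paper, embeddings are initialized from 300-d GloVe vectors.
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.convs = nn.ModuleList(
            [nn.Conv1d(embed_dim, feature_maps, k) for k in filter_sizes])
        self.dropout = nn.Dropout(dropout)
        self.fc = nn.Linear(feature_maps * len(filter_sizes), num_classes)

    def forward(self, token_ids):
        x = self.embed(token_ids).transpose(1, 2)             # (batch, embed, seq)
        pooled = [torch.relu(c(x)).max(dim=2).values for c in self.convs]
        return self.fc(self.dropout(torch.cat(pooled, dim=1)))

model = TextCNN(vocab_size=20_000, num_classes=6)  # vocab size illustrative; TREC has 6 classes
# Mapping "L2 regularization of 0.2" to Adam's weight_decay is an assumption.
optimizer = torch.optim.Adam(model.parameters(), weight_decay=0.2)

batch_size = 50          # mini-batch size reported in the paper
label_embed_dim = 100    # label embedding dimension for nonlinear Mixup
alpha = 1.0              # mixing policy alpha, default value of one
# Standard Mixup draws a scalar coefficient; the paper's nonlinear policy
# replaces this with sample-specific, nonlinear mixing.
lam = np.random.beta(alpha, alpha)
```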