reproducibilityindex.ai

Learning Data Manipulation for Augmentation and Weighting

Authors: Zhiting Hu, Bowen Tan, Russ R. Salakhutdinov, Tom M. Mitchell, Eric P. Xing

NeurIPS 2019

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Experiments show the resulting algorithms significantly improve the image and text classification performance in low data regime and class-imbalance problems.
Researcher Affiliation	Collaboration	Zhiting Hu (1,2), Bowen Tan (1), Ruslan Salakhutdinov (1), Tom Mitchell (1), Eric P. Xing (1,2). (1) Carnegie Mellon University, (2) Petuum Inc. {zhitingh,btan2,rsalakhu,tom.mitchell}@cs.cmu.edu, eric.xing@petuum.com
Pseudocode	Yes	Algorithm 1 Joint Learning of Model and Data Manipulation
Open Source Code	Yes	Code available at https://github.com/tanyuqian/learning-data-manipulation
Open Datasets	Yes	For text classification, we use the popular benchmark datasets, including SST-5 for 5-class sentence sentiment [45], IMDB for binary movie review sentiment [31], and TREC for 6-class question types [30]. For image classification, we similarly create a small subset of the CIFAR10 data...
Dataset Splits	Yes	We subsample a small training set on each task by randomly picking 40 instances for each class. We further create small validation sets, i.e., 2 instances per class for SST-5, and 5 instances per class for IMDB and TREC, respectively. For image classification, we similarly create a small subset of the CIFAR10 data, which includes 40 instances per class for training, and 2 instances per class for validation.
Hardware Specification	Yes	All experiments were implemented with PyTorch (pytorch.org) and were performed on a Linux machine with 4 GTX 1080Ti GPUs and 64GB RAM.
Software Dependencies	No	The paper mentions 'PyTorch (pytorch.org)' but does not provide specific version numbers for software dependencies.
Experiment Setup	Yes	For both the BERT classifier and the augmentation model (which is also based on BERT), we use Adam optimization with an initial learning rate of 4e-5. For ResNets, we use SGD optimization with a learning rate of 1e-3. For text data augmentation, we augment each minibatch by generating two or three samples for each data points (each with 1, 2 or 3 substitutions), and use both the samples and the original data to train the model. ... we restrict the training to small number (e.g., 5 or 10) of epochs.
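
The per-class subsampling described under Dataset Splits can be illustrated with a minimal sketch. This is not the authors' code: the function name `subsample_per_class`, the fixed seed, the toy label list, and the choice to keep the training and validation subsets disjoint are all assumptions made for illustration. The counts (40 training and 2 validation instances per class, as quoted for SST-5 and CIFAR10) come from the table above.

```python
import random
from collections import defaultdict


def subsample_per_class(labels, n_train, n_val, seed=0):
    """Return (train_idx, val_idx): disjoint index lists with a fixed count per class.

    Hypothetical helper for illustration; the paper does not specify
    how its subsets were drawn beyond "randomly picking" instances.
    """
    rng = random.Random(seed)  # seeded so the split is repeatable (an assumption)

    # Group example indices by their class label.
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    train_idx, val_idx = [], []
    for label in sorted(by_class):
        indices = by_class[label]
        needed = n_train + n_val
        if len(indices) < needed:
            raise ValueError(
                f"class {label!r} has {len(indices)} examples, need {needed}"
            )
        # Draw without replacement, then split the draw into train and validation.
        picked = rng.sample(indices, needed)
        train_idx.extend(picked[:n_train])
        val_idx.extend(picked[n_train:])
    return train_idx, val_idx


# Toy stand-in for a 5-class dataset such as SST-5 (labels only, no real data).
labels = [i % 5 for i in range(1000)]
train_idx, val_idx = subsample_per_class(labels, n_train=40, n_val=2)
```

For IMDB and TREC, the quoted setup would correspond to `n_val=5` instead of `n_val=2`. Because the source reports no random seed, a split produced this way will generally not match the subsets used in the paper.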