Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
Learning Data Manipulation for Augmentation and Weighting
Authors: Zhiting Hu, Bowen Tan, Russ R. Salakhutdinov, Tom M. Mitchell, Eric P. Xing
NeurIPS 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments show the resulting algorithms significantly improve the image and text classification performance in low data regime and class-imbalance problems. |
| Researcher Affiliation | Collaboration | Zhiting Hu1,2, Bowen Tan1, Ruslan Salakhutdinov1, Tom Mitchell1, Eric P. Xing1,2 1Carnegie Mellon University, 2Petuum Inc. EMAIL, EMAIL |
| Pseudocode | Yes | Algorithm 1 Joint Learning of Model and Data Manipulation |
| Open Source Code | Yes | Code available at https://github.com/tanyuqian/learning-data-manipulation |
| Open Datasets | Yes | For text classification, we use the popular benchmark datasets, including SST-5 for 5-class sentence sentiment [45], IMDB for binary movie review sentiment [31], and TREC for 6-class question types [30]. For image classification, we similarly create a small subset of the CIFAR10 data... |
| Dataset Splits | Yes | We subsample a small training set on each task by randomly picking 40 instances for each class. We further create small validation sets, i.e., 2 instances per class for SST-5, and 5 instances per class for IMDB and TREC, respectively. For image classification, we similarly create a small subset of the CIFAR10 data, which includes 40 instances per class for training, and 2 instances per class for validation. |
| Hardware Specification | Yes | All experiments were implemented with PyTorch (pytorch.org) and were performed on a Linux machine with 4 GTX 1080Ti GPUs and 64GB RAM. |
| Software Dependencies | No | The paper mentions 'PyTorch (pytorch.org)' but does not provide specific version numbers for software dependencies. |
| Experiment Setup | Yes | For both the BERT classifier and the augmentation model (which is also based on BERT), we use Adam optimization with an initial learning rate of 4e-5. For ResNets, we use SGD optimization with a learning rate of 1e-3. For text data augmentation, we augment each minibatch by generating two or three samples for each data points (each with 1, 2 or 3 substitutions), and use both the samples and the original data to train the model. ... we restrict the training to small number (e.g., 5 or 10) of epochs. |
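
The Dataset Splits row describes per-class subsampling (40 training instances per class, plus 2 or 5 validation instances per class). The following is a minimal sketch of that procedure, not the authors' code: the function name `subsample_per_class`, the fixed seed, the toy data, and the choice to keep the training and validation subsets disjoint are all assumptions made for illustration.

```python
import random
from collections import defaultdict


def subsample_per_class(examples, n_train, n_val, seed=0):
    """Randomly pick n_train training and n_val validation instances per class.

    `examples` is a list of (item, label) pairs. Keeping the two subsets
    disjoint is an assumption; the quoted text does not state it.
    """
    rng = random.Random(seed)  # fixed seed (assumed) so the split is repeatable

    # Group the examples by their class label.
    by_label = defaultdict(list)
    for item, label in examples:
        by_label[label].append((item, label))

    train, val = [], []
    for label in sorted(by_label):
        pool = by_label[label]
        if len(pool) < n_train + n_val:
            raise ValueError(f"class {label!r} has only {len(pool)} examples")
        # Draw without replacement, then cut the draw into train and validation.
        picked = rng.sample(pool, n_train + n_val)
        train.extend(picked[:n_train])
        val.extend(picked[n_train:])
    return train, val


# Toy stand-in for a 5-class dataset such as SST-5 (hypothetical data).
toy = [(f"sentence {i}", i % 5) for i in range(500)]
train_set, val_set = subsample_per_class(toy, n_train=40, n_val=2)
print(len(train_set), len(val_set))  # 200 10
```

With `n_val=5` the same call would match the validation size quoted for IMDB and TREC; the number of classes comes from whatever labels appear in the input.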