Negative Data Augmentation

Authors: Abhishek Sinha, Kumar Ayush, Jiaming Song, Burak Uzkent, Hongxia Jin, Stefano Ermon

ICLR 2021

Reproducibility assessment (variable, result, and LLM response):
Research Type: Experimental. LLM Response: "Empirically, models trained with our method achieve improved conditional/unconditional image generation along with improved anomaly detection capabilities. Further, we incorporate the same negative data augmentation strategy in a contrastive learning framework for self-supervised representation learning on images and videos, achieving improved performance on downstream image classification, object detection, and action recognition tasks. These results suggest that prior knowledge on what does not constitute valid data is an effective form of weak supervision across a range of unsupervised learning tasks."
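The quoted abstract describes the idea only at a high level. As a concrete illustration, here is a minimal, unofficial sketch of how a negative augmentation (a jigsaw patch shuffle is one plausible choice) could be appended as an extra negative in a MoCo-style InfoNCE loss; the function names, the 2x2 grid, and the queue layout are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch (not the authors' code): appending negative-augmented views as
# extra negatives in a MoCo-style InfoNCE loss. The jigsaw shuffle is one
# plausible NDA transform; grid size, names, and queue layout are illustrative.
import torch
import torch.nn.functional as F

def jigsaw(x, grid=2):
    """Shuffle image patches on a grid x grid layout (a simple NDA transform)."""
    b, c, h, w = x.shape
    ph, pw = h // grid, w // grid
    patches = x.unfold(2, ph, ph).unfold(3, pw, pw)                 # (B, C, g, g, ph, pw)
    patches = patches.contiguous().view(b, c, grid * grid, ph, pw)
    perm = torch.randperm(grid * grid, device=x.device)
    patches = patches[:, :, perm].reshape(b, c, grid, grid, ph, pw)
    return patches.permute(0, 1, 2, 4, 3, 5).contiguous().view(b, c, h, w)

def info_nce_with_nda(q, k_pos, queue, k_nda, temperature=0.07):
    """q, k_pos, k_nda: (B, D) L2-normalized embeddings (k_nda comes from an
    NDA-transformed view); queue: (K, D) memory-bank negatives."""
    l_pos = (q * k_pos).sum(dim=1, keepdim=True)                    # (B, 1) positive pair
    l_neg = q @ queue.t()                                           # (B, K) queue negatives
    l_nda = (q * k_nda).sum(dim=1, keepdim=True)                    # (B, 1) NDA negative
    logits = torch.cat([l_pos, l_neg, l_nda], dim=1) / temperature
    labels = torch.zeros(q.size(0), dtype=torch.long, device=q.device)  # positive at index 0
    return F.cross_entropy(logits, labels)
```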
Researcher Affiliation: Collaboration. LLM Response: Abhishek Sinha (1), Kumar Ayush (1), Jiaming Song (1), Burak Uzkent (1), Hongxia Jin (2), Stefano Ermon (1); (1) Department of Computer Science, Stanford University, {a7b23, kayush, tsong, buzkent, ermon}@stanford.edu; (2) Samsung Research America.
Pseudocode: No. LLM Response: The paper does not include any clearly labeled pseudocode or algorithm blocks.
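Since the paper provides no algorithm block, the following is purely an unofficial sketch of the general pattern of NDA for GAN training, in which negative augmentations of real images are scored as fake alongside generated samples. The non-saturating softplus loss, the `lam` weight, and all names are placeholder assumptions rather than the paper's exact objective (the paper reports using the BigGAN architecture).

```python
# Unofficial sketch of an NDA-style discriminator update: negative augmentations
# of real images are treated as additional fake samples. The softplus
# (non-saturating) loss, `lam` weight, and all names are placeholder assumptions.
import torch
import torch.nn.functional as F

def discriminator_step(discriminator, generator, d_opt, real, nda_transform,
                       lam=0.25, z_dim=128):
    z = torch.randn(real.size(0), z_dim, device=real.device)
    fake = generator(z).detach()                              # stop gradients into G
    neg = nda_transform(real)                                 # e.g., a jigsaw-style NDA view
    loss = (F.softplus(-discriminator(real)).mean()           # real images   -> "real"
            + F.softplus(discriminator(fake)).mean()          # generated     -> "fake"
            + lam * F.softplus(discriminator(neg)).mean())    # NDA samples   -> "fake"
    d_opt.zero_grad()
    loss.backward()
    d_opt.step()
    return loss.item()
```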
Open Source Code: Yes. LLM Response: "The code to reproduce our experiments is given here."
Open Datasets: Yes. LLM Response: "We conduct experiments on various datasets using the BigGAN architecture (Brock et al., 2018) for unconditional image generation. [...] CIFAR-10 (C10), (b) CIFAR-100 (C100), and (c) ImageNet-100 (Deng et al., 2009) to show the benefits of NDA on representation learning with the contrastive loss function."
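As a small reproducibility aid (not taken from the paper): the CIFAR datasets named above are available directly through torchvision, while ImageNet-100 is a 100-class ImageNet subset that must be assembled separately, so the path below is a placeholder.

```python
# Sketch: loading the CIFAR datasets via torchvision; ImageNet-100 would need a
# separately prepared 100-class subset (the "./imagenet100" path is a placeholder).
from torchvision import datasets, transforms

to_tensor = transforms.ToTensor()
cifar10_train = datasets.CIFAR10(root="./data", train=True, download=True, transform=to_tensor)
cifar10_test = datasets.CIFAR10(root="./data", train=False, download=True, transform=to_tensor)
cifar100_train = datasets.CIFAR100(root="./data", train=True, download=True, transform=to_tensor)
imagenet100_train = datasets.ImageFolder(root="./imagenet100/train", transform=to_tensor)
```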
Dataset Splits: No. LLM Response: The paper states train and test splits for some datasets (e.g., "CIFAR-10 contains 60K 32x32 images with 10 labels, out of which 50K are used for training and 10K are used for testing"), but it does not describe a separate validation split or explain how validation data was used (e.g., for hyperparameter tuning or early stopping), so the splits are not fully reproducible.
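Because only a train/test split is quoted, a reproduction would have to carve its own validation set; the 45K/5K split below is an arbitrary illustration, not a choice documented in the paper.

```python
# Sketch: holding out a validation set from CIFAR-10's 50K training images.
# The 45K/5K sizes and the seed are arbitrary, not taken from the paper.
import torch
from torch.utils.data import random_split
from torchvision import datasets, transforms

full_train = datasets.CIFAR10(root="./data", train=True, download=True,
                              transform=transforms.ToTensor())
gen = torch.Generator().manual_seed(0)               # fixed seed -> reproducible split
train_subset, val_subset = random_split(full_train, [45_000, 5_000], generator=gen)
```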
Hardware Specification: No. LLM Response: The paper does not specify any particular hardware components such as GPU models, CPU types, or cloud computing instance details used for running the experiments.
Software Dependencies: No. LLM Response: The paper mentions using a "PyTorch implementation" for experiments but does not provide specific version numbers for PyTorch or any other software dependencies.
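Since neither hardware nor software versions are reported, a reproduction should at least log its own environment; the snippet below uses only standard Python/PyTorch introspection calls and contains nothing from the paper itself.

```python
# Sketch: recording the software and hardware environment of a reproduction run.
import platform
import torch

print("python:", platform.python_version())
print("torch :", torch.__version__)
print("cuda  :", torch.version.cuda)
if torch.cuda.is_available():
    print("gpu   :", torch.cuda.get_device_name(0))
```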
Experiment Setup: Yes. LLM Response: "Additional details about the network architectures and hyperparameters can be found in Appendix K. For CIFAR-10, CIFAR-100, and CelebA we train for 500 epochs, whereas for STL-10 we train for 300 epochs. For all the datasets we use the following hyperparameters: batch size = 64, generator learning rate = 2e-4, discriminator learning rate = 2e-4, discriminator update steps per generator update step = 4. For CIFAR-10 and CIFAR-100, we use the following hyperparameters during pre-training: batch size = 256, learning rate = 0.3, temperature = 0.07, feature dimensionality = 2048. For ImageNet-100 pre-training we have the following: batch size = 128, learning rate = 0.015, temperature = 0.2, feature dimensionality = 128. During linear classification we use a batch size of 256 for all the datasets and a learning rate of 10 for CIFAR-10 and CIFAR-100, whereas for ImageNet-100 we use a learning rate of 30."
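For convenience, the hyperparameters quoted above can be collected into a configuration block for a reproduction script. Only values stated in the quote are filled in; the key names are illustrative, and anything further would have to come from Appendix K or the released code.

```python
# Sketch: the stated hyperparameters gathered into plain dicts (key names are illustrative).
GAN_HPARAMS = {
    "batch_size": 64,
    "g_lr": 2e-4,
    "d_lr": 2e-4,
    "d_steps_per_g_step": 4,
    "epochs": {"cifar10": 500, "cifar100": 500, "celeba": 500, "stl10": 300},
}

CONTRASTIVE_PRETRAIN = {
    "cifar": {"batch_size": 256, "lr": 0.3, "temperature": 0.07, "feature_dim": 2048},
    "imagenet100": {"batch_size": 128, "lr": 0.015, "temperature": 0.2, "feature_dim": 128},
}

LINEAR_EVAL = {
    "batch_size": 256,
    "lr": {"cifar10": 10, "cifar100": 10, "imagenet100": 30},
}
```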