Feature Dropout: Revisiting the Role of Augmentations in Contrastive Learning
Authors: Alex Tamkin, Margalit Glasgow, Xiluo He, Noah Goodman
NeurIPS 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We perform contrastive learning experiments on a range of image and audio datasets with multiple downstream tasks (e.g. synthetic datasets combining two classes, such as images and digits, and naturalistic datasets labeled with dozens of attributes). ... Finally, we formalize the intuition that feature dropout can aid learning with a theoretical analysis of a simple linear contrastive setting. |
| Researcher Affiliation | Academia | Alex Tamkin, Margalit Glasgow, Xiluo He, Noah Goodman Stanford University |
| Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | Our code is open-sourced at https://github.com/xiluohe/feature-dropout. |
| Open Datasets | Yes | The three image datasets are based on the canonical CIFAR-10 image-recognition dataset [28] (MIT License)... The CIFAR-10 images are overlaid with four copies of a randomly-sampled digit from the MNIST dataset... The audio dataset is created by overlaying the audio of a spoken digit (from the AudioMNIST dataset [3], MIT License)... To further validate the behavior of viewmaker on realistic multi-feature datasets, we consider the CelebA [32] dataset... (a construction sketch for the overlay datasets follows this table) |
| Dataset Splits | No | The paper describes training epochs, batch sizes, and optimizers for pretraining and linear evaluation, but it does not explicitly specify the training, validation, and test dataset splits with percentages or counts for reproducing the data partitioning. |
| Hardware Specification | No | The paper does not provide specific details regarding the hardware (e.g., GPU/CPU models, memory) used to run the experiments. |
| Software Dependencies | No | The paper mentions software components like 'SimCLR algorithm', 'ResNet-18 model', 'SpecAug', 'WaveAug', and 'SGD' but does not provide specific version numbers for any of these or other software dependencies. |
| Experiment Setup | Yes | We pretrain with the SimCLR algorithm for 200 epochs with a batch size of 256 and a temperature of 0.1. ... We train for 100 epochs, using the same parameters as Tamkin et al. [49], using SGD with learning rate 0.01, momentum 0.9, weight decay 0, and batch size 128. ... we use a budget of ϵ = 0.05P for the image datasets, and ϵ = 0.125P for the audio datasets, where P is the number of pixels in the input. (a hyperparameter sketch follows this table) |
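
The Open Datasets row describes a synthetic construction in which each CIFAR-10 image is overlaid with four copies of a randomly sampled MNIST digit, so that the same input carries both an object label and a digit label. The released code at https://github.com/xiluohe/feature-dropout contains the authors' actual implementation; the sketch below is only a minimal, hypothetical approximation of that construction in PyTorch/torchvision. The digit size, corner placement, and additive blending rule are all assumptions not stated in the quoted text.

```python
# Hypothetical sketch of the CIFAR-10 + MNIST overlay dataset described above.
# Digit scaling, placement, and blending are assumptions, not the authors' code.
import torch
import torch.nn.functional as F
from torch.utils.data import Dataset
from torchvision import datasets, transforms


class CifarMnistOverlay(Dataset):
    """Overlays four copies of a randomly sampled MNIST digit on each CIFAR-10 image."""

    def __init__(self, root="./data", train=True):
        to_tensor = transforms.ToTensor()
        self.cifar = datasets.CIFAR10(root, train=train, download=True, transform=to_tensor)
        self.mnist = datasets.MNIST(root, train=train, download=True, transform=to_tensor)

    def __len__(self):
        return len(self.cifar)

    def __getitem__(self, idx):
        image, object_label = self.cifar[idx]            # 3 x 32 x 32 tensor in [0, 1]
        digit_idx = torch.randint(len(self.mnist), (1,)).item()
        digit, digit_label = self.mnist[digit_idx]       # 1 x 28 x 28 tensor in [0, 1]
        # Downscale the digit so four copies fit in the image corners (assumed 14x14).
        digit = F.interpolate(
            digit.unsqueeze(0), size=(14, 14), mode="bilinear", align_corners=False
        ).squeeze(0)
        for top in (0, 18):
            for left in (0, 18):
                patch = image[:, top:top + 14, left:left + 14]
                # Additive blend clamped to [0, 1]; the quoted text does not specify the blend rule.
                image[:, top:top + 14, left:left + 14] = (patch + digit).clamp(0, 1)
        # Each example exposes two downstream labels: the CIFAR object class and the digit class.
        return image, (object_label, digit_label)
```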
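
The Experiment Setup row lists the pretraining and linear-evaluation hyperparameters. The sketch below shows how those quoted values might be wired together in PyTorch; only the numbers themselves (200 pretraining epochs, batch size 256, temperature 0.1, 100 linear-evaluation epochs with SGD at learning rate 0.01, momentum 0.9, weight decay 0, batch size 128, and a viewmaker budget of ϵ = 0.05P for images or 0.125P for audio) come from the paper. The linear head, its feature dimension, and the exact definition of P are assumptions.

```python
# Hyperparameters quoted in the Experiment Setup row; everything else is an assumption.
import torch

PRETRAIN_EPOCHS = 200        # SimCLR pretraining
PRETRAIN_BATCH_SIZE = 256
TEMPERATURE = 0.1            # SimCLR (NT-Xent) temperature

LINEAR_EVAL_EPOCHS = 100     # linear evaluation on frozen features
LINEAR_EVAL_BATCH_SIZE = 128


def viewmaker_budget(input_shape, is_audio=False):
    """Budget eps = 0.05 * P (images) or 0.125 * P (audio), with P the number of
    pixels in the input. Whether P counts channels is not restated in the quote."""
    num_pixels = 1
    for dim in input_shape:
        num_pixels *= dim
    return (0.125 if is_audio else 0.05) * num_pixels


# Linear-evaluation optimizer with the quoted SGD settings.
# `linear_head` is a placeholder; 512 assumes a ResNet-18 feature dimension.
linear_head = torch.nn.Linear(512, 10)
optimizer = torch.optim.SGD(
    linear_head.parameters(), lr=0.01, momentum=0.9, weight_decay=0.0
)
```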