Do Generated Data Always Help Contrastive Learning?
Authors: Yifei Wang, Jizhe Zhang, Yisen Wang
ICLR 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments show that the proposed approach improves downstream accuracy significantly at no extra cost, and it is particularly beneficial for data-scarce scenarios. |
| Researcher Affiliation | Academia | 1 School of Mathematical Sciences, Peking University; 2 Institute of Artificial Intelligence and Robotics, Xi'an Jiaotong University; 3 National Key Lab of General Artificial Intelligence, School of Intelligence Science and Technology, Peking University; 4 Institute for Artificial Intelligence, Peking University |
| Pseudocode | No | No pseudocode or algorithm blocks are present in the paper. |
| Open Source Code | Yes | Code is available at https://github.com/PKU-ML/adainf. |
| Open Datasets | Yes | We conduct experiments on three benchmark datasets: CIFAR-10, CIFAR-100, and Tiny ImageNet. |
| Dataset Splits | No | The paper does not explicitly describe training/validation/test splits or mention using a validation set for hyperparameter tuning; it refers only to training and test data. |
| Hardware Specification | Yes | All of the models are pretrained with 4 NVIDIA RTX 3090 GPUs. |
| Software Dependencies | No | The paper mentions using the "solo-learn library (da Costa et al., 2022)" but does not specify a version number for this or any other software dependency. |
| Experiment Setup | Yes | For a fair comparison of inflated and non-inflated training, we train the model for 100k steps in all cases... Specifically, we weaken the two most important augmentations: the minimum scale of random resized cropping increases from 0.08 to 0.2; the color jitter strength decreases from 1 to 0.5; and the probability of applying color jitter decreases from 0.8 to 0.4 (see the augmentation sketch below the table). |
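
The weakened augmentations quoted above translate directly into a contrastive-learning transform pipeline. The sketch below is a minimal illustration using torchvision in a SimCLR-style setup; only the numeric values (crop minimum scale 0.2, jitter strength 0.5, jitter probability 0.4) come from the paper, while the overall transform order, the 0.8/0.8/0.8/0.2 jitter ratios, and the grayscale step are conventional assumptions, not the authors' released configuration.

```python
# Minimal sketch of the weakened augmentation pipeline described in the
# Experiment Setup row. Assumes a torchvision/SimCLR-style recipe; only the
# numeric values for crop scale, jitter strength, and jitter probability are
# taken from the paper.
import torchvision.transforms as T

def weak_cl_augmentation(img_size: int = 32) -> T.Compose:
    # Color jitter strength reduced from 1.0 to 0.5; the per-channel ratios
    # (0.8, 0.8, 0.8, 0.2) follow the common SimCLR convention (assumption).
    s = 0.5
    color_jitter = T.ColorJitter(0.8 * s, 0.8 * s, 0.8 * s, 0.2 * s)
    return T.Compose([
        # Minimum crop scale raised from the usual 0.08 to 0.2.
        T.RandomResizedCrop(img_size, scale=(0.2, 1.0)),
        T.RandomHorizontalFlip(),
        # Probability of applying color jitter lowered from 0.8 to 0.4.
        T.RandomApply([color_jitter], p=0.4),
        T.RandomGrayscale(p=0.2),
        T.ToTensor(),
    ])
```

In a typical contrastive setup, this transform would be applied twice to each image to produce the two positive views; any further details (normalization statistics, blur, dataset-specific image sizes) would need to be checked against the released code at https://github.com/PKU-ML/adainf.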