Understanding Augmentation-based Self-Supervised Representation Learning via RKHS Approximation and Regression
Authors: Runtian Zhai, Bingbin Liu, Andrej Risteski, J Zico Kolter, Pradeep Kumar Ravikumar
ICLR 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In Section 5, we demonstrate this point with mask-type augmentations on synthetic and real datasets, and show that (i) κ depends on both the augmentation strength and the augmentation strategy; (ii) a smaller κ (e.g. stronger augmentation) leads to a smaller generalization gap, but an overly strong augmentation causes poor training performance. Thus, there is a sweet spot in the middle with the best test performance. |
| Researcher Affiliation | Academia | Runtian Zhai, Bingbin Liu, Andrej Risteski, Zico Kolter, Pradeep Ravikumar; Carnegie Mellon University; {rzhai,bingbinl,aristesk,zkolter,pradeepr}@cs.cmu.edu |
| Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | The code of Section 5 can be found at https://colab.research.google.com/drive/1loSZLLI-qfoKE7BCIi1SWJKgruU6i4ku?usp=sharing. |
| Open Datasets | Yes | We demonstrate on the NLP dataset wikipedia-simple. We study masked language modeling, where x is a full sentence and a is a masked sentence (a minimal masking sketch follows the table). Downstream evaluation uses QNLI (Wang et al., 2018) and SST-2 (Socher et al., 2013). |
| Dataset Splits | No | The paper uses QNLI and SST-2 datasets but does not explicitly state the training, validation, and test splits used for these datasets. |
| Hardware Specification | Yes | We use 8 NVIDIA A6000 GPUs for pretraining. We use 4 NVIDIA A6000 GPUs for downstream training and evaluation. |
| Software Dependencies | No | The paper mentions using "roberta-large models" and the "Huggingface official repository" but does not provide specific version numbers for software dependencies such as libraries, frameworks, or programming languages. |
| Experiment Setup | Yes | The classifiers are trained for 3 epochs on QNLI and 6 epochs on SST-2 (a minimal fine-tuning sketch follows the table). |
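
The Research Type and Open Datasets rows describe mask-type augmentations for masked language modeling, where a full sentence x is turned into a masked view a and the masking rate controls the augmentation strength. The following is a minimal sketch of that idea, assuming a simple independent-per-token masking scheme; the mask token string, the default masking probability, and the function name are illustrative assumptions, not the paper's exact augmentation strategy.

```python
import random

MASK_TOKEN = "<mask>"  # RoBERTa-style mask string (assumption)

def mask_augment(tokens, p=0.15, rng=None):
    """Return a masked view of `tokens`.

    Each token is independently replaced by MASK_TOKEN with probability p.
    A larger p corresponds to a stronger augmentation (more of x is hidden),
    which is the knob the paper relates to the quantity kappa.
    """
    rng = rng or random.Random()
    return [MASK_TOKEN if rng.random() < p else t for t in tokens]

# x is the full sentence, a is one masked view of it.
x = "the quick brown fox jumps over the lazy dog".split()
a = mask_augment(x, p=0.3, rng=random.Random(0))
print(" ".join(a))
```
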
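The Experiment Setup, Software Dependencies, and Hardware Specification rows name roberta-large, the Huggingface repository, and the epoch counts (3 on QNLI, 6 on SST-2) but not a full recipe. Below is a hypothetical sketch of the downstream classification step using the Hugging Face Trainer API; the batch size, learning rate, and maximum sequence length are assumptions not stated in the excerpts, and the paper's actual classifier training (e.g., whether the pretrained encoder is frozen) may differ.

```python
from datasets import load_dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

task = "qnli"                              # or "sst2"
epochs = 3 if task == "qnli" else 6        # epoch counts taken from the paper
tokenizer = AutoTokenizer.from_pretrained("roberta-large")
model = AutoModelForSequenceClassification.from_pretrained("roberta-large", num_labels=2)

raw = load_dataset("glue", task)
cols = ("question", "sentence") if task == "qnli" else ("sentence",)

def tokenize(batch):
    # Tokenize the sentence pair (QNLI) or single sentence (SST-2).
    return tokenizer(*[batch[c] for c in cols], truncation=True, max_length=128)

encoded = raw.map(tokenize, batched=True)

args = TrainingArguments(
    output_dir="out",
    num_train_epochs=epochs,
    per_device_train_batch_size=16,   # assumption: batch size not given in the excerpts
    learning_rate=2e-5,               # assumption: learning rate not given in the excerpts
)

trainer = Trainer(
    model=model,
    args=args,
    tokenizer=tokenizer,              # enables dynamic padding via the default collator
    train_dataset=encoded["train"],
    eval_dataset=encoded["validation"],
)
trainer.train()
trainer.evaluate()
```

Switching `task` to "sst2" reuses the same sketch with the 6-epoch schedule and single-sentence inputs; the GLUE validation split stands in for the evaluation set, since the table notes that explicit splits are not reported.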