Understanding Contrastive Learning Requires Incorporating Inductive Biases
Authors: Nikunj Saunshi, Jordan Ash, Surbhi Goel, Dipendra Misra, Cyril Zhang, Sanjeev Arora, Sham Kakade, Akshay Krishnamurthy
ICML 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments on image and text domains highlight the ubiquity of this problem: different function classes and algorithms behave very differently on downstream tasks, despite having the same augmentations and contrastive losses. |
| Researcher Affiliation | Collaboration | Department of Computer Science, Princeton University; Microsoft Research, New York City; Departments of Computer Science & Statistics, Harvard University. |
| Pseudocode | No | The paper does not contain any clearly labeled 'Pseudocode' or 'Algorithm' blocks, nor does it present any structured, code-like procedures within its text or appendices. |
| Open Source Code | No | The paper refers to third-party implementations used (e.g., 'We use the ViT implementation from https://github.com/lucidrains/vit-pytorch'), but it does not include an explicit statement about releasing its own source code for the methodology described. |
| Open Datasets | Yes | We use the AG News classification dataset (Zhang et al., 2015). We consider the setting of CIFAR-10 image classification, where the augmentation distribution for contrastive learning is derived from the popular SimCLR protocol (Chen et al., 2020). |
| Dataset Splits | Yes | The training set (under which Lcont is minimized) is of size 50,000; the downstream accuracies under a linear classifier are evaluated on a holdout validation set of size 12,500. At the start of contrastive learning, we create a held-out validation set of pairs of augmentations sampled for 10,000 randomly chosen examples from the original validation set. |
| Hardware Specification | No | The paper does not provide specific hardware details such as GPU or CPU models, or cloud computing instance types, used for running the experiments. |
| Software Dependencies | No | The paper mentions using libraries like 'torchtext' and implementations from GitHub (e.g., 'https://github.com/lucidrains/vit-pytorch'), but it does not provide specific version numbers for any software dependencies or frameworks. |
| Experiment Setup | Yes | Table 3. Hyperparameter values for experiments on CIFAR-10 trained using ResNet-18. Table 4. Hyperparameter values for experiments on AG News. |