Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
The Power of Contrast for Feature Learning: A Theoretical Analysis
Authors: Wenlong Ji, Zhun Deng, Ryumei Nakada, James Zou, Linjun Zhang
JMLR 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Despite its empirical success, theoretical understanding of the superiority of contrastive learning is still limited. In this paper, under linear representation settings, (i) we provably show that contrastive learning outperforms the standard autoencoders and generative adversarial networks... We verify our theory with numerical experiments. |
| Researcher Affiliation | Academia | Wenlong Ji (Department of Statistics, Stanford University, Stanford, CA 94305, USA); Zhun Deng (Department of Computer Science, Columbia University, New York, NY 10027, USA); Ryumei Nakada (Department of Statistics, Rutgers University, Piscataway, NJ 08854, USA); James Zou (Department of Biomedical Data Science, Stanford University, Stanford, CA 94305, USA); Linjun Zhang (Department of Statistics, Rutgers University, Piscataway, NJ 08854, USA) |
| Pseudocode | No | The paper describes methodologies through mathematical formulations and prose, such as the formulation of contrastive loss functions in Equations (1), (2), (3), (4) and the data generating process in Equation (5), but does not include any explicitly labeled pseudocode or algorithm blocks. |
| Open Source Code | No | Our codes are implemented in Pytorch and run on an NVIDIA V100 GPU. No explicit statement of code release or repository link is provided for the methodology described in this paper. |
| Open Datasets | Yes | We conduct the experiments using the datasets STL-10 (Coates et al., 2011) and CIFAR10 (Krizhevsky, 2009) with the neural nets architecture ResNet-18 (He et al., 2016). |
| Dataset Splits | Yes | For both STL-10 and CIFAR-10 datasets, we divide the test data into two sets, one consists of the first five classes and the other one consists of the remaining five classes. During training, we use the training data as unlabeled data and the first set of test data as the labeled data to train the model jointly, and then train a linear classifier with the second set of test data on features learned by the encoder. |
| Hardware Specification | Yes | Our codes are implemented in Pytorch and run on an NVIDIA V100 GPU. |
| Software Dependencies | No | The paper mentions 'Pytorch' as the implementation framework and 'Adam optimizer (Kingma and Ba, 2015)', but specific version numbers for these software components are not provided. |
| Experiment Setup | Yes | All training is carried out with the Adam optimizer (Kingma and Ba, 2015), batch size 256, learning rate 3×10⁻⁴, weight decay 10⁻⁴, and a cosine annealing learning rate scheduler for 100 epochs. |
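The dataset-split protocol quoted above (first five classes of the test set used as labeled data during joint training, remaining five reserved for the linear probe) can be sketched as a simple partition by class index. This is a minimal illustration, not the authors' code; the function name `split_by_class` and the class boundary of 5 (matching the ten-class STL-10/CIFAR-10 setting) are assumptions drawn from the quoted description.

```python
def split_by_class(labels, boundary=5):
    """Partition example indices into two disjoint sets, following the
    protocol quoted in the report (hypothetical helper, not from the paper):
      - first:  classes < boundary, used as labeled data during joint training
      - second: classes >= boundary, used to train the linear classifier
        on the features learned by the encoder
    """
    first = [i for i, y in enumerate(labels) if y < boundary]
    second = [i for i, y in enumerate(labels) if y >= boundary]
    return first, second


# Toy example: six test examples with class labels 0..9
labels = [0, 5, 2, 9, 4, 7]
probe_train, probe_eval = split_by_class(labels)
```

The two index sets are disjoint by construction, so no class used for joint training leaks into the linear-probe evaluation.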
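The cosine annealing scheduler named in the experiment setup follows a standard closed form (the one implemented by PyTorch's `CosineAnnealingLR`): the learning rate decays from its initial value to a floor along a half cosine wave over `T_max` epochs. A minimal sketch with the quoted hyperparameters, assuming a floor of `eta_min = 0` (the common default; the paper does not state it):

```python
import math

def cosine_annealing_lr(epoch, eta_max=3e-4, eta_min=0.0, t_max=100):
    """Learning rate at a given epoch under cosine annealing:
        eta_t = eta_min + 0.5 * (eta_max - eta_min) * (1 + cos(pi * t / T_max))
    eta_max = 3e-4 and t_max = 100 match the quoted setup; eta_min = 0
    is an assumed default, not stated in the paper.
    """
    return eta_min + 0.5 * (eta_max - eta_min) * (
        1 + math.cos(math.pi * epoch / t_max)
    )

# The schedule starts at the full learning rate, reaches half of it at
# the midpoint, and decays to eta_min at the final epoch.
schedule = [cosine_annealing_lr(t) for t in range(101)]
```

Because the curve is flat near both endpoints, most of the decay happens in the middle of training, which is the usual motivation for this scheduler over a linear ramp.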