Toward Understanding the Feature Learning Process of Self-supervised Contrastive Learning
Authors: Zixin Wen, Yuanzhi Li
ICML 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Empirically, we verified that the feature decoupling principle matches the underlying mechanism of contrastive learning in practice. Empirical evidence of our theory: we conduct multiple experiments to justify our theoretical statements, and the results indeed match our theory. We show that when no proper augmentation is applied to the data, the neural network will learn features with dense patterns, as shown in Figure 2, Figure 3 and Figure 4. |
| Researcher Affiliation | Academia | 1University of International Business and Economics, Beijing 2Carnegie Mellon University. Correspondence to: Zixin Wen <zixinw@andrew.cmu.edu>, Yuanzhi Li <yuanzhil@andrew.cmu.edu>. |
| Pseudocode | No | The paper describes the training algorithm in narrative text within Section 2.2 'Training algorithm using SGD' but does not provide a formal pseudocode or algorithm block. |
| Open Source Code | No | The paper does not provide any concrete access information (e.g., specific repository link, explicit code release statement, or mention of code in supplementary materials) for the described methodology. |
| Open Datasets | Yes | Figure 1. The difference between supervised features and contrastive features (in the higher layers of WideResNet 34x5 over CIFAR10). Figure 3. Evidence supporting our theoretical framework: the effects of augmentations on the learned representations of WideResNet 34x5 over CIFAR10, visualized via t-SNE. The differences between features learned under different augmentations show that the neural networks will indeed learn dense representations if the augmentation is not powerful enough. Figure 4. Further evidence supporting our theoretical framework: after adding color distortion to the augmentation, the neurons of AlexNet (2nd to 5th layer) exhibit sparser firing patterns over input images of CIFAR10. |
| Dataset Splits | No | The paper mentions using CIFAR-10 for empirical verification but does not explicitly state the dataset split information (e.g., specific percentages, sample counts, or a citation to a predefined split) used for training, validation, or testing. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., exact GPU/CPU models, processor types, or memory amounts) used for running its experiments. |
| Software Dependencies | No | The paper does not provide specific ancillary software details, such as library names with version numbers, needed to replicate the experiment. |
| Experiment Setup | Yes | We use a single-layer neural net $f : \mathbb{R}^{d_1} \to \mathbb{R}^m$ with ReLU activation as our contrastive learner, where $m$ is the number of neurons. More precisely, it is defined as $f(x) = (h_1(x), \ldots, h_m(x)) \in \mathbb{R}^m$, with $h_i(x) = \mathrm{ReLU}(\langle w_i, x\rangle - b_i) - \mathrm{ReLU}(-\langle w_i, x\rangle - b_i)$. We initialize the parameters by $w_i^{(0)} \sim \mathcal{N}(0, \sigma_0^2 I_{d_1})$ and $b_i^{(0)} = 0$, where $\sigma_0^2 = \Theta\!\big(\tfrac{1}{d_1\,\mathrm{poly}(d)}\big)$ is small (and also theoretically friendly). For each iteration $t$, let $\eta = \tfrac{1}{\mathrm{poly}(d)}$ be the learning rate; we update as $w_i^{(t+1)} \leftarrow w_i^{(t)} - \eta\, \nabla_{w_i} \mathrm{Obj}(f_t)$. Let $m = d^{1.01}$ be the number of neurons, $\tau = \mathrm{polylog}(d)$, and $\lvert\mathcal{N}\rvert = \mathrm{poly}(d)$ be the number of negative samples. (A minimal code sketch of this setup follows the table.) |
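
The sketch below illustrates the quoted experiment setup: the single-layer symmetric-ReLU learner $f$, Gaussian initialization of the weights, and one plain SGD step on a contrastive objective. It is not the authors' code; the concrete sizes (`d1`, `m`, `num_neg`), the values of `eta`, `tau`, and `sigma0`, the Gaussian placeholder inputs (standing in for the paper's augmented data), and the InfoNCE-style surrogate used for `Obj(f_t)` are all illustrative assumptions.

```python
import torch
import torch.nn.functional as F

# Problem sizes -- illustrative stand-ins for d_1, m = d^{1.01}, and |N| = poly(d)
d1 = 256          # input dimension d_1
m = 512           # number of neurons m
num_neg = 128     # number of negative samples |N|
tau = 0.5         # temperature, stand-in for tau = polylog(d)
eta = 1e-3        # learning rate, stand-in for eta = 1/poly(d)
sigma0 = 1e-3 / d1 ** 0.5   # small init scale, stand-in for sigma_0^2 = Theta(1/(d_1 poly(d)))

# Parameters: w_i^{(0)} ~ N(0, sigma_0^2 I_{d_1}), b_i^{(0)} = 0 (the bias is kept fixed
# here, since the quoted setup only states the SGD update for the weights w_i).
W = (torch.randn(m, d1) * sigma0).requires_grad_(True)
b = torch.zeros(m)

def f(x):
    # f(x) = (h_1(x), ..., h_m(x)),  h_i(x) = ReLU(<w_i,x> - b_i) - ReLU(-<w_i,x> - b_i)
    z = x @ W.t()
    return F.relu(z - b) - F.relu(-z - b)

def contrastive_obj(x1, x2, x_neg):
    # Assumed InfoNCE-style surrogate for Obj(f_t): one positive pair (two augmented
    # views of the same input) scored against num_neg negatives, with temperature tau.
    f1, f2, fn = f(x1), f(x2), f(x_neg)
    pos = (f1 * f2).sum(dim=1, keepdim=True) / tau        # (B, 1)
    neg = f1 @ fn.t() / tau                               # (B, num_neg)
    logits = torch.cat([pos, neg], dim=1)
    return F.cross_entropy(logits, torch.zeros(len(x1), dtype=torch.long))

# One SGD step: w_i^{(t+1)} <- w_i^{(t)} - eta * grad_{w_i} Obj(f_t)
x1, x2 = torch.randn(32, d1), torch.randn(32, d1)   # placeholder augmented views
x_neg = torch.randn(num_neg, d1)                    # placeholder negative samples
loss = contrastive_obj(x1, x2, x_neg)
loss.backward()
with torch.no_grad():
    W -= eta * W.grad
    W.grad.zero_()
```

The exact contrastive objective, the sparse-coding data model, and the augmentation scheme analyzed in the paper are not reproduced here and should be taken from the paper itself; the sketch only mirrors the architecture, initialization, and weight update quoted in the Experiment Setup cell.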