Optimal Sample Complexity of Contrastive Learning

Authors: Noga Alon, Dmitrii Avdiukhin, Dor Elboim, Orr Fischer, Grigory Yaroslavtsev

ICLR 2024

Reproducibility Variable | Result | LLM Response

Research Type | Experimental | "We further show that the theoretical bounds on sample complexity obtained via VC/Natarajan dimension can have strong predictive power for experimental results, in contrast with the folklore belief about a substantial gap between the statistical learning theory and the practice of deep learning." (Abstract) and "To verify that our results indeed correctly predict the sample complexity, we perform experiments on several popular image datasets: CIFAR-10/100 and MNIST/Fashion MNIST. We find the representations for these images using ResNet18 trained from scratch using various contrastive losses. Our experiments show that for a fixed number of samples, the error rate is well approximated by the value predicted by our theory. We present our findings in Appendix F." (Section 1.1)

Researcher Affiliation | Academia | Princeton University, Northwestern University, Institute for Advanced Study, Weizmann Institute of Science, George Mason University

Pseudocode | No | The paper describes mathematical proofs and algorithms in prose but does not include any clearly labeled pseudocode or algorithm blocks.

Open Source Code | No | The paper neither states unambiguously that the authors are releasing code for the described methodology nor provides a direct link to such a repository.

Open Datasets | Yes | "we perform experiments on several popular image datasets: CIFAR-10/100 and MNIST/Fashion MNIST" (Section 1.1), "We train the model from scratch on CIFAR-10 (Krizhevsky, 2009) and Fashion-MNIST (Xiao et al., 2017) datasets", and "We train the model from scratch on the MNIST (LeCun, 1998) and CIFAR-100 (Krizhevsky, 2009) datasets" (Appendix F)

Dataset Splits | Yes | "The neural network is trained from scratch for 100 epochs using a set of m ∈ {10^2, 10^3, 10^4} training samples, and is evaluated on a different test set of 10^4 triplets from the same distribution" and "We perform experiments on the training set of CIFAR-10 and the validation set of ImageNet by training ResNet-18 from scratch on m ∈ {2, 10, 10^2, 10^3, 10^4, 10^5} randomly sampled triplets, and evaluating the model on the 10^4 triplets sampled from the same distribution" (Appendix F)

Hardware Specification | No | No specific hardware details (such as GPU/CPU models, memory, or cloud instance types) used for running the experiments are mentioned in the paper.

Software Dependencies | Yes | "We express our thanks to the FFCV library (Leclerc et al., 2022), which allowed us to significantly speed up the execution" (Appendix F)

Experiment Setup | Yes | "The neural network is trained from scratch for 100 epochs using a set of m ∈ {10^2, 10^3, 10^4} training samples"; the model is trained "using the marginal triplet loss (Schroff et al., 2015b)" L_MT(x, y^+, z^-) = max(0, ||x - y^+||^2 - ||x - z^-||^2 + 1) and the contrastive loss L_C(x, y^+, z^-_1, ..., z^-_k) = -log( exp(x^T y^+) / (exp(x^T y^+) + Σ_{i=1}^k exp(x^T z^-_i)) ) (Appendix F)
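For concreteness, the two contrastive losses quoted from Appendix F can be sketched in NumPy. This is an illustrative reimplementation from the quoted formulas only, not the authors' code; the function names, the `margin` parameter default, and the array layout for negatives are assumptions.

```python
import numpy as np

def marginal_triplet_loss(x, y_pos, z_neg, margin=1.0):
    """Marginal triplet loss (Schroff et al., 2015b) as quoted:
    max(0, ||x - y+||^2 - ||x - z-||^2 + margin), with margin = 1 in the paper.
    x, y_pos, z_neg are embedding vectors of the anchor, positive, and negative."""
    gap = np.sum((x - y_pos) ** 2) - np.sum((x - z_neg) ** 2) + margin
    return max(0.0, float(gap))

def contrastive_loss(x, y_pos, z_negs):
    """(k+1)-tuple contrastive loss as quoted:
    -log( exp(x^T y+) / (exp(x^T y+) + sum_{i=1}^k exp(x^T z_i^-)) ).
    z_negs is a (k, d) array holding the k negative embeddings."""
    pos = np.exp(x @ y_pos)      # similarity score of the positive pair
    negs = np.exp(z_negs @ x)    # one score per negative example
    return float(-np.log(pos / (pos + negs.sum())))
```

When the anchor is closer to the positive than to the negative by at least the margin, the triplet loss is zero; the contrastive loss is always positive and decreases as the positive score dominates the negatives.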