Predicting What You Already Know Helps: Provable Self-Supervised Learning

Authors: Jason D. Lee, Qi Lei, Nikunj Saunshi, Jiacheng Zhuo

NeurIPS 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Experiments verifying our theoretical findings are in Section 6. Simulations. With synthetic data, we verify how excess risk (ER) scales... Computer Vision Task. We verify if learning from ψ is more effective than learning directly from X1, in a realistic setting"
Researcher Affiliation | Academia | "Jason D. Lee¹, Qi Lei¹, Nikunj Saunshi¹, Jiacheng Zhuo² — ¹Princeton University, ²University of Texas at Austin; {jasonlee@,qilei@,nsaunshi@cs}.princeton.edu, jzhuo@utexas.edu"
Pseudocode | No | The paper describes the SSL process in two steps using mathematical formulations (Section 4, Equation 2) but does not present a formally labeled "Pseudocode" or "Algorithm" block.
Open Source Code | No | "The codes will be made public after this work is accepted for publish."
Open Datasets | Yes | "We test on the Yearbook dataset [23]"
Dataset Splits | No | The paper reports sample counts for the simulations (e.g., "n1 = 4000, n2 = 1000" for the pretext and downstream tasks) and, for the computer vision task, refers to a "full set of training data (without labels)" and a "smaller set of data (with labels)", but it does not specify explicit train/validation/test splits (as percentages or counts) that would allow reproducible data partitioning.
Hardware Specification | Yes | "Our experiments were run on a server with an NVIDIA RTX 2080 Ti GPU"
Software Dependencies | No | "The image pre-processing is done by torchvision, and the model is built with pytorch framework." (libraries are named, but no versions are given)
Experiment Setup | Yes | "We set d1 = 50, d2 = 40, n1 = 4000, n2 = 1000 and ER is measured with Mean Squared Error (MSE). We resize all the portraits to be 128 by 128. We crop out the center 64 by 64 pixels (the face), and treat it as X2, and treat the outer rim as X1"
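
The simulation parameters quoted in the Experiment Setup row (d1 = 50, d2 = 40, n1 = 4000, n2 = 1000, excess risk measured in MSE) are enough to sketch what such a run could look like. The sketch below is an assumption-laden reconstruction, not the authors' code: it reads the two-step SSL procedure referenced in the Pseudocode row (Section 4, Equation 2 of the paper) as (1) regress the pretext target X2 on the input X1 using the n1 unlabeled samples, and (2) fit a linear downstream predictor on the learned representation ψ(X1) using the n2 labeled samples. The data-generating process and the direct-regression baseline are invented for illustration.

```python
import numpy as np

# Dimensions and sample sizes quoted in the Experiment Setup row.
d1, d2 = 50, 40          # dim of input X1 and pretext target X2
n1, n2 = 4000, 1000      # pretext (unlabeled) and downstream (labeled) sample sizes
rng = np.random.default_rng(0)

# Assumed linear data-generating process (illustrative only): X2 depends on X1,
# and the label Y depends on X2, so predicting X2 is informative about Y.
A = rng.normal(size=(d1, d2)) / np.sqrt(d1)
w = rng.normal(size=(d2, 1))

def sample(n):
    X1 = rng.normal(size=(n, d1))
    X2 = X1 @ A + 0.1 * rng.normal(size=(n, d2))
    Y = X2 @ w + 0.1 * rng.normal(size=(n, 1))
    return X1, X2, Y

# Step 1 (pretext): learn psi by least-squares regression of X2 on X1.
X1_p, X2_p, _ = sample(n1)
B_hat, *_ = np.linalg.lstsq(X1_p, X2_p, rcond=None)
psi = lambda X: X @ B_hat

# Step 2 (downstream): linear regression of Y on psi(X1) with the labeled set.
X1_d, _, Y_d = sample(n2)
theta, *_ = np.linalg.lstsq(psi(X1_d), Y_d, rcond=None)

# Baseline: regress Y directly on X1 with the same labeled set.
beta, *_ = np.linalg.lstsq(X1_d, Y_d, rcond=None)

# Compare both predictors in MSE on fresh samples (a proxy for excess risk).
X1_t, _, Y_t = sample(20000)
mse_ssl = np.mean((psi(X1_t) @ theta - Y_t) ** 2)
mse_raw = np.mean((X1_t @ beta - Y_t) ** 2)
print(f"MSE via psi(X1): {mse_ssl:.4f} | MSE via raw X1: {mse_raw:.4f}")
```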
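
For the computer vision task, the same row pins down the input construction: portraits are resized to 128 by 128, the central 64 by 64 face is treated as X2, and the outer rim as X1. A minimal preprocessing sketch follows, assuming PIL images from the Yearbook dataset and using torchvision only for the resize; the function name and the choice to represent the rim as the full image with the face zeroed out are illustrative assumptions, since the evidence quoted above does not say how the rim is encoded.

```python
from torchvision import transforms

# Resize every portrait to 128 x 128, as stated in the Experiment Setup row.
resize = transforms.Compose([
    transforms.Resize((128, 128)),
    transforms.ToTensor(),           # C x 128 x 128 tensor in [0, 1]
])

def split_views(pil_image):
    """Hypothetical helper: return (X1 = outer rim, X2 = central 64x64 face)."""
    img = resize(pil_image)                      # C x 128 x 128
    x2 = img[:, 32:96, 32:96].clone()            # central 64 x 64 crop (the face)
    x1 = img.clone()
    x1[:, 32:96, 32:96] = 0.0                    # mask out the face, keep the rim
    return x1, x2
```

In this reading, X1 keeps the original 128 x 128 spatial layout with the face region masked; whether the authors mask, crop, or flatten the rim is not specified in the quoted evidence.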