Evaluating Self-Supervised Learning via Risk Decomposition
Authors: Yann Dubois, Tatsunori Hashimoto, Percy Liang
ICML 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We provide efficient estimators for each component and use them to analyze the effect of 30 design choices on 169 SSL vision models evaluated on ImageNet. |
| Researcher Affiliation | Academia | 1Department of Computer Science, Stanford University. Correspondence to: Yann Dubois <yanndubs@cs.stanford.edu>. |
| Pseudocode | Yes | For formal derivations, properties, and pseudocode see Appx. B. As a reminder, the encoder is always pretrained on Str. |
| Open Source Code | Yes | All results and pretrained models are at github.com/YannDubs/SSL-Risk-Decomposition |
| Open Datasets | Yes | Evaluating Self-Supervised Learning via Risk Decomposition... evaluated on ImageNet. |
| Dataset Splits | Yes | Compared to supervised learning, the main new challenge for estimating our risk components is that pretraining additional SSL encoders is computationally prohibitive, so we want each of our estimators to use the same SSL encoder... In particular, for ImageNet we have \|Ssub\| = 5e4 and \|Str\| > 1e6. |
| Hardware Specification | No | The paper does not explicitly describe the hardware used for its experiments, such as specific GPU models, CPU types, or cloud computing instance specifications. |
| Software Dependencies | Yes | To evaluate full-shot linear probing we use PyTorch (Paszke et al., 2019) and tune the following hyperparameters... We train one XGBoost model (Chen & Guestrin, 2016)... we now use sklearn's (Pedregosa et al., 2011) logistic regression with the lbfgs solver... |
| Experiment Setup | Yes | For each model, we collected 30 design choices or hyperparameters, estimated our error components, and evaluated the ImageNet test performance of well-tuned linear probes trained in different subsets of ImageNet (100%, 30-shot, 1%, 5-shot, 3-shot). |
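
The probe training subsets named in the Experiment Setup row (100%, 30-shot, 1%, 5-shot, 3-shot) can be illustrated with a minimal sketch. It is not the authors' code. It assumes that "k-shot" means k labeled examples per class and that the percentage subsets are drawn uniformly at random, neither of which the excerpt above confirms. The helper names `k_shot_indices` and `fraction_indices` are hypothetical, and the labels are a toy stand-in for ImageNet.

```python
import random
from collections import defaultdict


def k_shot_indices(labels, k, seed=0):
    """Return sorted indices holding at most k examples per class.

    Assumption: "k-shot" means k labeled examples per class.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    chosen = []
    for label in sorted(by_class):
        pool = by_class[label]
        chosen.extend(rng.sample(pool, min(k, len(pool))))
    return sorted(chosen)


def fraction_indices(labels, fraction, seed=0):
    """Return sorted indices of a uniformly random fraction of the data.

    Assumption: percentage subsets are not class-balanced.
    """
    rng = random.Random(seed)
    n = max(1, round(fraction * len(labels)))
    return sorted(rng.sample(range(len(labels)), n))


# Toy labels: 10 classes with 100 examples each (a stand-in, not ImageNet).
labels = [c for c in range(10) for _ in range(100)]

subsets = {
    "100%": list(range(len(labels))),
    "30-shot": k_shot_indices(labels, 30),
    "1%": fraction_indices(labels, 0.01),
    "5-shot": k_shot_indices(labels, 5),
    "3-shot": k_shot_indices(labels, 3),
}

for name, idx in subsets.items():
    print(name, len(idx))
```

A linear probe would then be fit on the encoder features at each index set. Fixing the seed keeps the subsets identical across the models being compared.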