Exploring the Gap between Collapsed & Whitened Features in Self-Supervised Learning

Authors: Bobby He, Mete Ozay

ICML 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We provide theoretical & empirical evidence highlighting the factors in SSL, like projection layers & regularisation strength, that influence eigenvalue decay rate, & demonstrate that the degree of feature whitening affects generalisation, particularly in label-scarce regimes. We use our insights to motivate a novel method, Post-hoc Manipulation of the Principal Axes & Trace (PostMan-Pat), which efficiently post-processes a pretrained encoder to enforce an eigenvalue decay rate with power law exponent β, & find that PostMan-Pat delivers improved label efficiency and transferability across a range of SSL methods and encoder architectures. (A hedged sketch of such a spectrum-reshaping step is given after this table.)
Researcher Affiliation | Collaboration | 1University of Oxford, 2Samsung Research UK.
Pseudocode | Yes | Algorithm 1: PyTorch pseudocode for PostMan-Pat (PMP).
Open Source Code | No | The paper mentions using existing open-source codebases for baselines and pretrained models (e.g., SimCLR, Barlow Twins, and SwAV implementations and checkpoints from the VISSL library's model zoo), but it does not state that the code for its own proposed method, PostMan-Pat (PMP), is publicly available, nor does it provide a link to its implementation.
Open Datasets | Yes | In Figure 4, we compare various ResNet-18 trained with Barlow Twins on CIFAR-10...Our ImageNet-1K implementation was based off the official Barlow Twins (Zbontar et al., 2021) implementation...STL-10 analysis Figure 7 is akin to Figure 4, but trained with Barlow Twins on STL-10 dataset...Table 2. Transfer Learning: Comparison of top-1 test accuracies (%) for PMP and LP across SSL methods and transfer datasets...CIFAR100 (Krizhevsky, 2009), Stanford Cars (Krause et al., 2013) and Oxford 102 Flowers (Nilsback & Zisserman, 2008).
Dataset Splits | Yes | We use a validation split from the accessible training labels to tune hyperparameters for all evaluation schemes, c.f. Appendix C. ...For any given set of labelled data, we split the data into 4:1 splits for the 1% or 10% labelled-data setting, or 2:1 splits for the 0.3% labelled-data setting (as in the 0.3% setting we have only 3 labels per class). Splits were chosen uniformly at random so that each class had an equal number of examples in the larger split, which was then used for training. Top-1 accuracy on the smaller split was used for hyperparameter tuning. (A sketch of this class-balanced split appears after this table.)
Hardware Specification | No | The paper does not explicitly describe the specific hardware used (e.g., GPU models, CPU models, memory specifications) for running its experiments. It mentions model architectures like ResNet-18, ResNet-50, and ViT-B/16 but not the underlying hardware.
Software Dependencies | No | The paper mentions software like 'PyTorch', the 'Torchvision PyTorch library (Paszke et al., 2019)', and the 'VISSL library's (Goyal et al., 2021) model zoo', but it does not specify concrete version numbers for these software components, which would be required for a fully reproducible description.
Experiment Setup | Yes | In Appendix C, titled 'Experimental Details', the paper provides specific details regarding the training procedure and hyperparameters. For instance, in C.1: 'All networks were trained with SGD for 100 epochs with weight decay 0.0004, momentum 0.9 & a cosine annealed learning rate...Learning rate was 0.32 for SimCLR & 0.25 for Barlow Twins, with ρ = 0.01.' C.4 further details: 'In all linear evaluation schemes...we train the classifier for 100 epochs using SGD & momentum 0.9, with a cosine annealed learning rate starting at 0.1, with weight decay tuned in all cases'. (A sketch of this optimiser configuration appears below.)
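The abstract's description of PostMan-Pat suggests a simple post-hoc recipe: rotate frozen encoder features onto the principal axes of their covariance and rescale the spectrum to a trace-preserving power law with exponent β. The following is a minimal PyTorch sketch of that idea only; it is an assumption-laden re-creation (the function name `postprocess_power_law`, the trace normalisation, and the eps floor are illustrative), not the authors' Algorithm 1.

```python
# Hedged sketch of a PMP-style post-processing step (not the authors' Algorithm 1):
# given features from a frozen pretrained encoder, rotate them into the eigenbasis
# of their covariance and rescale the spectrum to a power law lambda_i ~ i^(-beta).
import torch

def postprocess_power_law(z: torch.Tensor, beta: float = 1.0, eps: float = 1e-6) -> torch.Tensor:
    """z: (N, D) features from a frozen encoder; returns features whose
    covariance eigenvalues follow lambda_i proportional to i^(-beta),
    with the total variance (trace) preserved."""
    z = z - z.mean(dim=0, keepdim=True)                 # centre the features
    cov = z.T @ z / (z.shape[0] - 1)                    # (D, D) empirical covariance
    evals, evecs = torch.linalg.eigh(cov)               # eigenvalues in ascending order
    evals, evecs = evals.flip(0), evecs.flip(1)         # sort descending
    z_rot = z @ evecs                                   # rotate onto principal axes
    target = torch.arange(1, z.shape[1] + 1, dtype=z.dtype, device=z.device) ** (-beta)
    target = target * evals.sum() / target.sum()        # rescale to preserve the trace
    scale = torch.sqrt(target / evals.clamp(min=eps))   # per-axis rescaling factors
    return z_rot * scale
```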
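The dataset-split protocol quoted in the Dataset Splits row (4:1 or 2:1 class-balanced random splits of the labelled subset) can be illustrated with a short PyTorch sketch. The helper name `balanced_split`, the fixed seed, and the per-class count handling are assumptions; the paper does not release its splitting code.

```python
# Hedged sketch of a class-balanced train/validation split, assuming integer labels.
import torch

def balanced_split(labels: torch.Tensor, train_frac: float = 0.8, seed: int = 0):
    """Split indices so the larger (training) part has an equal number of examples
    per class; train_frac=0.8 gives the 4:1 split, train_frac=2/3 the 2:1 split."""
    g = torch.Generator().manual_seed(seed)
    classes = labels.unique()
    per_class_train = min(int((labels == c).sum()) for c in classes)
    per_class_train = int(train_frac * per_class_train)
    train_idx, val_idx = [], []
    for c in classes:
        idx = (labels == c).nonzero(as_tuple=True)[0]
        idx = idx[torch.randperm(len(idx), generator=g)]   # shuffle within class
        train_idx.append(idx[:per_class_train])             # equal count per class
        val_idx.append(idx[per_class_train:])                # remainder for tuning
    return torch.cat(train_idx), torch.cat(val_idx)
```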
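To make the quoted training configuration concrete, here is a minimal PyTorch sketch of the reported optimiser setup (SGD, momentum 0.9, weight decay 0.0004, cosine-annealed learning rate over 100 epochs, base learning rate 0.32 for SimCLR and 0.25 for Barlow Twins). Batch size, any warmup, and per-step versus per-epoch scheduling are not specified in the quoted text, so those choices here are assumptions.

```python
# Hedged sketch of the reported pretraining optimiser setup from Appendix C.1.
import torch

def make_optimizer(model: torch.nn.Module, base_lr: float = 0.25, epochs: int = 100):
    """base_lr = 0.32 for SimCLR, 0.25 for Barlow Twins per the quoted appendix."""
    opt = torch.optim.SGD(model.parameters(), lr=base_lr,
                          momentum=0.9, weight_decay=4e-4)
    # Assumes the cosine schedule is stepped once per epoch (not specified in the quote).
    sched = torch.optim.lr_scheduler.CosineAnnealingLR(opt, T_max=epochs)
    return opt, sched
```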