Understanding Dimensional Collapse in Contrastive Self-supervised Learning

Authors: Li Jing, Pascal Vincent, Yann LeCun, Yuandong Tian

ICLR 2022

| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments show that DirectCLR outperforms SimCLR with a trainable linear projector on ImageNet. We train a SimCLR model (Chen et al., 2020a) with a two-layer MLP projector. We followed the standard recipe and trained the model on ImageNet for 100 epochs. We evaluate the dimensionality by collecting the embedding vectors on the validation set. A sketch of the DirectCLR objective appears after this table. |
| Researcher Affiliation | Industry | Li Jing, Pascal Vincent, Yann LeCun, Yuandong Tian; Facebook AI Research; {ljng, pascal, yann, yuandong}@fb.com |
| Pseudocode | No | No pseudocode or algorithm blocks were found. |
| Open Source Code | Yes | Code (in PyTorch) is available at https://github.com/facebookresearch/directclr |
| Open Datasets | Yes | We train a SimCLR model (Chen et al., 2020a) with a two-layer MLP projector. We followed the standard recipe and trained the model on ImageNet for 100 epochs. |
| Dataset Splits | Yes | We evaluate the dimensionality by collecting the embedding vectors on the validation set. The spectra are computed from the backbone output, using the ImageNet validation set. A sketch of this spectrum computation appears after the table. |
| Hardware Specification | No | The batch size is 4096, which fits into 32 GPUs during training. However, specific GPU models or other detailed hardware specifications are not mentioned. |
| Software Dependencies | No | The paper mentions code in PyTorch but does not provide specific version numbers for PyTorch or any other software dependencies. |
| Experiment Setup | Yes | We use a LARS optimizer and train all models for 100 epochs. The batch size is 4096, which fits into 32 GPUs during training. The learning rate is 4.8 as in SimCLR (Chen et al., 2020a), which goes through 10 epochs of warm-up and then a cosine decay schedule. A sketch of this schedule appears after the table. |
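
The "Research Type" row summarizes the paper's core claim: DirectCLR, which applies the contrastive loss directly to a fixed subvector of the backbone representation rather than to the output of a trainable projector, outperforms SimCLR with a linear projector. The sketch below illustrates that objective; the simplified single-direction InfoNCE, the temperature, and the subvector size `d0` are illustrative assumptions, not values read from the released code.

```python
import torch
import torch.nn.functional as F

def info_nce(z1: torch.Tensor, z2: torch.Tensor, temperature: float = 0.1) -> torch.Tensor:
    """Simplified InfoNCE between two batches of embeddings.

    z1, z2: (batch, dim) embeddings of two augmented views of the same images;
    matching rows are positives, all other rows serve as negatives.
    """
    z1 = F.normalize(z1, dim=1)
    z2 = F.normalize(z2, dim=1)
    logits = z1 @ z2.t() / temperature                   # (batch, batch) similarities
    labels = torch.arange(z1.size(0), device=z1.device)  # positives on the diagonal
    return F.cross_entropy(logits, labels)

def directclr_loss(r1: torch.Tensor, r2: torch.Tensor, d0: int = 360) -> torch.Tensor:
    """DirectCLR-style loss: contrast only the first d0 dimensions of the
    backbone representation, with no trainable projector. d0 is a
    hyperparameter of the method; 360 is used here purely for illustration."""
    return info_nce(r1[:, :d0], r2[:, :d0])
```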
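
The "Dataset Splits" row states that dimensionality is evaluated from embedding vectors collected on the ImageNet validation set, with spectra computed on the backbone output. Below is a minimal sketch of such a spectrum computation, assuming a frozen backbone and a standard validation `DataLoader`; the function name and the covariance-then-SVD formulation are our reading of the paper's description, not code from the repository.

```python
import torch

@torch.no_grad()
def embedding_spectrum(backbone: torch.nn.Module, loader) -> torch.Tensor:
    """Singular-value spectrum of the embedding covariance matrix.

    A rapidly vanishing tail of singular values indicates dimensional
    collapse of the embedding space.
    """
    backbone.eval()
    feats = []
    for images, _ in loader:                # e.g. an ImageNet validation loader
        feats.append(backbone(images))
    z = torch.cat(feats)                    # (num_samples, dim)
    z = z - z.mean(dim=0, keepdim=True)     # center the embeddings
    cov = z.t() @ z / (z.size(0) - 1)       # (dim, dim) covariance matrix
    return torch.linalg.svdvals(cov)        # singular values, descending order
```

Plotting the logarithm of these singular values is how the paper visualizes how many embedding dimensions remain informative.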
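
The "Experiment Setup" row quotes the training recipe: a LARS optimizer, 100 epochs, batch size 4096, base learning rate 4.8, 10 epochs of warm-up, then cosine decay. LARS itself is not part of core PyTorch, so the sketch below covers only the learning-rate schedule; the function name and per-step formulation are assumptions.

```python
import math

def learning_rate(step: int, steps_per_epoch: int, base_lr: float = 4.8,
                  warmup_epochs: int = 10, total_epochs: int = 100) -> float:
    """Linear warm-up to base_lr over warmup_epochs, then cosine decay to 0,
    matching the recipe quoted in the table. Intended to be computed per
    optimization step and fed to an external LARS implementation."""
    warmup_steps = warmup_epochs * steps_per_epoch
    total_steps = total_epochs * steps_per_epoch
    if step < warmup_steps:
        return base_lr * step / warmup_steps
    progress = (step - warmup_steps) / (total_steps - warmup_steps)
    return 0.5 * base_lr * (1.0 + math.cos(math.pi * progress))
```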