Understanding Dimensional Collapse in Contrastive Self-supervised Learning
Authors: Li Jing, Pascal Vincent, Yann LeCun, Yuandong Tian
ICLR 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments show that DirectCLR outperforms SimCLR with a trainable linear projector on ImageNet. We train a SimCLR model (Chen et al. (2020a)) with a two-layer MLP projector. We followed the standard recipe and trained the model on ImageNet for 100 epochs. We evaluate the dimensionality by collecting the embedding vectors on the validation set. (A sketch of this spectrum evaluation appears below the table.) |
| Researcher Affiliation | Industry | Li Jing, Pascal Vincent, Yann LeCun, Yuandong Tian Facebook AI Research {ljng, pascal, yann, yuandong}@fb.com |
| Pseudocode | No | No pseudocode or algorithm blocks were found. |
| Open Source Code | Yes | Code (in PyTorch) is available at https://github.com/facebookresearch/directclr |
| Open Datasets | Yes | We train a SimCLR model (Chen et al. (2020a)) with a two-layer MLP projector. We followed the standard recipe and trained the model on ImageNet for 100 epochs. |
| Dataset Splits | Yes | We evaluate the dimensionality by collecting the embedding vectors on the validation set. The spectra are computed based on the output from the backbone, using the ImageNet validation set. |
| Hardware Specification | No | The batch size is 4096, which fits into 32 GPUs during training. However, specific GPU models or other detailed hardware specifications are not mentioned. |
| Software Dependencies | No | The paper mentions 'Code (in PyTorch)' but does not provide specific version numbers for PyTorch or any other software dependencies. |
| Experiment Setup | Yes | We use a LARS optimizer and train all models for 100 epochs. The batch size is 4096, which fits into 32 GPUs during training. The learning rate is 4.8 as in SimCLR (Chen et al., 2020a), with 10 epochs of warm-up followed by a cosine decay schedule. (See the schedule sketch below the table.) |
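
The dimensionality evaluation quoted above (collecting embedding vectors on the validation set and inspecting their singular value spectrum) can be sketched as follows. This is a minimal illustration, not code from the directclr repository; `embedding_spectrum`, `backbone`, and `val_loader` are hypothetical names.

```python
# Hypothetical sketch of the dimensionality evaluation described in the
# paper: collect embedding vectors over the validation set, then examine
# the singular value spectrum of their covariance matrix.
import torch

@torch.no_grad()
def embedding_spectrum(backbone, val_loader, device="cuda"):
    backbone.eval()
    feats = []
    for images, _ in val_loader:
        feats.append(backbone(images.to(device)).cpu())
    z = torch.cat(feats)                  # (N, d) matrix of embeddings
    z = z - z.mean(dim=0, keepdim=True)   # center the embeddings
    cov = z.T @ z / (z.shape[0] - 1)      # (d, d) covariance matrix
    # Singular values sorted in descending order; trailing values that
    # drop to ~0 indicate dimensional collapse of the embedding space.
    return torch.linalg.svdvals(cov)
```

A spectrum whose trailing singular values vanish means the embeddings occupy a lower-dimensional subspace, which is the dimensional collapse phenomenon the paper analyzes.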
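The reported optimization recipe (base learning rate 4.8, 10 epochs of warm-up, cosine decay over 100 total epochs) corresponds to a schedule like the sketch below. LARS itself is not part of core PyTorch, so only the learning-rate schedule is illustrated here; the function name is hypothetical.

```python
# Minimal sketch of the reported schedule: linear warm-up for 10 epochs
# to a base learning rate of 4.8, then cosine decay over the remaining
# 90 of the 100 training epochs.
import math

def learning_rate(epoch, base_lr=4.8, warmup_epochs=10, total_epochs=100):
    if epoch < warmup_epochs:
        # linear warm-up from 0 to base_lr
        return base_lr * (epoch + 1) / warmup_epochs
    # cosine decay from base_lr down to 0
    progress = (epoch - warmup_epochs) / (total_epochs - warmup_epochs)
    return 0.5 * base_lr * (1 + math.cos(math.pi * progress))
```

In practice this per-epoch value would be assigned to the optimizer's parameter groups at the start of each epoch; the paper does not specify further scheduling details beyond the warm-up and cosine decay.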