Whitening for Self-Supervised Representation Learning

Authors: Aleksandr Ermolov, Aliaksandr Siarohin, Enver Sangineto, Nicu Sebe

ICML 2021 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | 4. Experiments. In our experiments we use the following datasets. CIFAR-10 and CIFAR-100 (Krizhevsky & Hinton, 2009), two small-scale datasets composed of 32 × 32 images with 10 and 100 classes, respectively. ImageNet (Deng et al., 2009), the well-known large-scale dataset with about 1.3M training images and 50K test images, spanning over 1000 classes.
Researcher Affiliation | Academia | Aleksandr Ermolov¹, Aliaksandr Siarohin¹, Enver Sangineto¹, Nicu Sebe¹ ... ¹Department of Information Engineering and Computer Science (DISI), University of Trento, Italy.
Pseudocode | No | The paper includes diagrams illustrating the training procedure (Figure 2) and batch slicing (Figure 3), but it does not contain formal pseudocode or algorithm blocks.
Open Source Code | Yes | The source code of the method and of all the experiments is available at: https://github.com/htdt/self-supervised.
Open Datasets | Yes | CIFAR-10 and CIFAR-100 (Krizhevsky & Hinton, 2009), two small-scale datasets composed of 32 × 32 images with 10 and 100 classes, respectively. ImageNet (Deng et al., 2009), the well-known large-scale dataset with about 1.3M training images and 50K test images, spanning over 1000 classes. Tiny ImageNet (Le & Yang, 2015), a reduced version of ImageNet... STL-10 (Coates et al., 2011)... ImageNet-100 (Tian et al., 2020a)... (See the dataset-loading sketch below the table.)
Dataset Splits | No | The paper uses standard datasets such as CIFAR-10 and ImageNet, but it does not explicitly state the training/validation/test splits (e.g., percentages or sample counts) used for its main self-supervised training.
Hardware Specification | No | The paper does not provide specific hardware details such as GPU models, CPU types, or memory specifications used for running the experiments.
Software Dependencies | No | The paper mentions the Adam optimizer and specific network architectures (ResNet), but it does not provide version numbers for software dependencies such as deep learning frameworks (e.g., TensorFlow, PyTorch) or Python.
Experiment Setup | Yes | For the small and medium size datasets, we use the Adam optimizer (Kingma & Ba, 2014). For all the compared methods (including ours), we use the same number of epochs and the same learning rate schedule. Specifically, for CIFAR-10 and CIFAR-100, we use 1,000 epochs with learning rate 3 × 10⁻³; for Tiny ImageNet, 1,000 epochs with learning rate 2 × 10⁻³; for STL-10, 2,000 epochs with learning rate 2 × 10⁻³. We use learning rate warm-up for the first 500 iterations of the optimizer, and a 0.2 learning rate drop 50 and 25 epochs before the end. We use a mini-batch size of K = 1024 samples. The dimension of the hidden layer of the projection head g(·) is 1024. The weight decay is 10⁻⁶. (See the configuration sketch below the table.)
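
The datasets listed in the Open Datasets row are all publicly available. As a minimal illustration (not taken from the authors' repository), the sketch below loads the ones that have torchvision wrappers; the root paths, the placeholder transform, and the ImageFolder layout assumed for ImageNet-style data are our assumptions.

```python
# Hedged sketch: loading the public datasets cited in the paper via torchvision.
# Tiny ImageNet and ImageNet-100 have no torchvision wrapper and must be
# prepared manually (e.g., as ImageFolder directories); paths are placeholders.
from torchvision import datasets, transforms

to_tensor = transforms.ToTensor()  # placeholder; training uses the paper's augmentation pipeline

cifar10  = datasets.CIFAR10(root="data", train=True, download=True, transform=to_tensor)
cifar100 = datasets.CIFAR100(root="data", train=True, download=True, transform=to_tensor)
stl10    = datasets.STL10(root="data", split="unlabeled", download=True, transform=to_tensor)

# ImageNet / ImageNet-100 / Tiny ImageNet: assumed to be downloaded separately
# into a class-per-folder layout under the given (hypothetical) directory.
imagenet = datasets.ImageFolder(root="data/imagenet/train", transform=to_tensor)
```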
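
The Experiment Setup row quotes concrete hyperparameters. The sketch below is a rough PyTorch reading of that paragraph, not the authors' code (their repository linked above is the authoritative reference): it wires the quoted numbers into an Adam optimizer and a simple warm-up/drop schedule. The backbone choice, the projection-head output width, and the exact form of the "0.2 learning rate drop" are assumptions.

```python
# Rough PyTorch reading of the quoted CIFAR-10/CIFAR-100 setup.
# Backbone, projection-head output width, and the precise drop schedule
# are assumptions, not taken from the paper's released code.
import torch
import torch.nn as nn
import torchvision

backbone = torchvision.models.resnet18(num_classes=512)       # encoder (assumed)
projection_head = nn.Sequential(                               # g(.) with a 1024-unit hidden layer
    nn.Linear(512, 1024), nn.BatchNorm1d(1024), nn.ReLU(),
    nn.Linear(1024, 64),                                       # output width assumed
)
model = nn.Sequential(backbone, projection_head)

epochs = 1000            # CIFAR-10 / CIFAR-100 (2,000 for STL-10)
base_lr = 3e-3           # 2e-3 for Tiny ImageNet and STL-10
batch_size = 1024        # K = 1024
optimizer = torch.optim.Adam(model.parameters(), lr=base_lr, weight_decay=1e-6)

def lr_at(global_step: int, epoch: int) -> float:
    """Warm-up over the first 500 optimizer steps, then 0.2x drops
    50 and 25 epochs before the end (our reading of the schedule)."""
    scale = min(1.0, (global_step + 1) / 500)   # linear warm-up
    if epoch >= epochs - 50:
        scale *= 0.2
    if epoch >= epochs - 25:
        scale *= 0.2
    return base_lr * scale

# Inside the training loop, before each optimizer.step():
#   for group in optimizer.param_groups:
#       group["lr"] = lr_at(global_step, epoch)
```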