An Investigation into Whitening Loss for Self-supervised Learning

Authors: Xi Weng, Lei Huang, Lei Zhao, Rao Anwer, Salman H. Khan, Fahad Shahbaz Khan

NeurIPS 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Experimental results on ImageNet classification and COCO object detection reveal that the proposed CW-RGP possesses a promising potential for learning good representations."
Researcher Affiliation | Academia | "Xi Weng 1, Lei Huang 1,2, Lei Zhao 1, Rao Muhammad Anwer 2, Salman Khan 2, Fahad Shahbaz Khan 2; 1 SKLSDE, Institute of Artificial Intelligence, Beihang University, Beijing, China; 2 Mohamed bin Zayed University of Artificial Intelligence, UAE"
Pseudocode | Yes | "We call our method channel whitening with random group partition (CW-RGP), and provide the full algorithm and PyTorch-style code in supplementary materials." (An illustrative sketch of the core operation appears after this table.)
Open Source Code | Yes | "The code is available at https://github.com/winci-ai/CW-RGP."
Open Datasets | Yes | "Experimental results on ImageNet classification and COCO object detection reveal that the proposed CW-RGP possesses a promising potential for learning good representations." "We first conduct experiments on small and medium size datasets (including CIFAR-10, CIFAR-100 [28], STL-10 [10], Tiny ImageNet [29] and ImageNet [11])."
Dataset Splits | Yes | "Did you specify all the training details (e.g., data splits, hyperparameters, how they were chosen)? [Yes] We specify the training details in supplemental material."
Hardware Specification | No | "We run the experiments on one workstation with 4 GPUs." This statement gives neither the GPU model nor any other specification of the workstation.
Software Dependencies | No | The paper mentions the Adam optimizer but does not pin its software dependencies to version numbers (e.g., Python, PyTorch, CUDA).
Experiment Setup | Yes | "The model is trained on CIFAR-10 for 200 epochs with batch size of 256 and standard data augmentation, using Adam optimizer. For more details of experimental setup please see supplementary materials." "We then conduct experiments on large-scale ImageNet, strictly following the setup of the SimSiam paper [8]. The results of baselines shown in Table 2 are mostly reported in [8], except that the result of W-MSE 4 is from the W-MSE paper [12] and we reproduce BYOL [16], SwAV [4] and W-MSE 4 [12] under a batch size of 512 based on the same training and evaluation settings as in [8] for fairness." (A hedged reconstruction of the CIFAR-10 recipe also follows the table.)
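
The Pseudocode row above notes that the full CW-RGP algorithm and PyTorch-style code ship with the paper's supplementary materials and the linked repository. For orientation only, here is a minimal PyTorch sketch of the operation the method's name describes: randomly partitioning a batch into groups and whitening each group's embeddings along the sample axis. The function names, the ZCA-style whitening, and the choice to group samples are assumptions for illustration, not the authors' exact implementation.

import torch

def channel_whiten(z: torch.Tensor, eps: float = 1e-4) -> torch.Tensor:
    # Hypothetical ZCA-style whitening along the sample axis: after this,
    # the n embeddings in the (n, d) block are approximately decorrelated.
    z = z - z.mean(dim=1, keepdim=True)        # center each embedding
    gram = (z @ z.T) / z.shape[1]              # (n, n) sample-by-sample matrix
    eigval, eigvec = torch.linalg.eigh(gram)   # symmetric eigendecomposition
    w = eigvec @ torch.diag((eigval + eps).rsqrt()) @ eigvec.T
    return w @ z                               # whitened (n, d) block

def cw_rgp(z: torch.Tensor, group_size: int) -> torch.Tensor:
    # Hypothetical grouping step: shuffle the m samples of a batch and
    # whiten each group independently; group_size should stay below the
    # embedding dimension so each group's Gram matrix is full rank.
    perm = torch.randperm(z.shape[0])
    out = torch.empty_like(z)
    for idx in perm.split(group_size):
        out[idx] = channel_whiten(z[idx])
    return out

# Toy usage: a batch of 256 embeddings of dimension 128, groups of 64.
z = torch.randn(256, 128)
z_white = cw_rgp(z, group_size=64)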
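
Similarly, the Experiment Setup row quotes a CIFAR-10 recipe (200 epochs, batch size 256, Adam, standard augmentation) but defers the remaining details to the supplementary materials. The snippet below is a hedged reconstruction of how such a recipe is commonly wired up; the learning rate, the augmentation parameters, and the ResNet-18 backbone are assumptions, not values taken from the paper.

import torch
import torchvision
from torchvision import transforms

# Typical SSL-style augmentation pipeline for CIFAR-10; the exact
# parameters here are assumptions (the paper's values are in its
# supplementary materials).
augment = transforms.Compose([
    transforms.RandomResizedCrop(32, scale=(0.2, 1.0)),
    transforms.RandomHorizontalFlip(),
    transforms.ColorJitter(0.4, 0.4, 0.4, 0.1),
    transforms.RandomGrayscale(p=0.2),
    transforms.ToTensor(),
])

model = torchvision.models.resnet18(num_classes=128)        # placeholder backbone
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)   # lr is a guess

epochs = 200      # as quoted above
batch_size = 256  # as quoted above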