Zero-Mean Regularized Spectral Contrastive Learning: Implicitly Mitigating Wrong Connections in Positive-Pair Graphs

Authors: Xiong Zhou, Xianming Liu, Feilong Zhang, Gang Wu, Deming Zhai, Junjun Jiang, Xiangyang Ji

ICLR 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | In this section, we provide extensive experiments on the tasks of contrastive learning, supervised classification, unsupervised domain adaptation, and learning with noisy labels to verify the effectiveness of zero-mean regularization on several benchmark datasets with ResNets [21, 22]. More experimental results and details can be found in the Appendix B.
Researcher Affiliation | Academia | Xiong Zhou, Xianming Liu, Feilong Zhang, Gang Wu, Deming Zhai & Junjun Jiang, Faculty of Computing, Harbin Institute of Technology {cszx,csxm,flzhang,gwu,zhaideming,jiangjunjun}@hit.edu.cn; Xiangyang Ji, Department of Automation, Tsinghua University, xyji@tsinghua.edu.cn
Pseudocode | Yes | Table 7: PyTorch-like pseudocode of spectral contrastive loss and its supervised version. (A hedged sketch of such a loss is given after this table.)
Open Source Code | No | The paper does not contain an explicit statement offering open-source code for the methodology or a link to a code repository.
Open Datasets | Yes | In our experiments, we consider several commonly-used datasets, namely CIFAR-10/-100 [27], SVHN (S) [37], MNIST (M) [28], USPS (U) [25], and MNIST-M (M-M) [14]. For the tasks of self-supervised contrastive learning and supervised classification, we utilize CIFAR-10/-100 and SVHN datasets. These datasets are also employed for the evaluation of learning with noisy labels. To investigate unsupervised domain adaptation, we consider four domains of digit datasets: SVHN, MNIST, USPS, and MNIST-M.
Dataset Splits | Yes | Experimental Details. We conduct experiments on self-supervised and supervised spectral contrastive learning using PreAct-ResNet-18 models pre-trained on CIFAR-10, CIFAR-100, SVHN, and a subset of ImageNet comprising the first 100 classes. The training process involves training the networks for 200 epochs. To evaluate the performance of these models in linear classification, we employ an independent linear classifier on the fixed representations obtained during contrastive pre-training. This classifier is trained using labeled data. The reported results correspond to the top-1 accuracy achieved by these models. Table 1: Top-1 linear probing (%) and Top-1 validation accuracy (%) of self-supervised learning and supervised learning with the spectral contrastive loss in Equation B.1, respectively. (A linear-probing sketch follows the table.)
Hardware Specification | Yes | All experiments were implemented using PyTorch and executed on NVIDIA GTX 2080Ti GPUs.
Software Dependencies | No | The paper states: "All experiments were implemented using PyTorch" (Appendix B.2), but does not specify the version number of PyTorch or any other software dependencies with their specific version numbers.
Experiment Setup | Yes | The training process utilizes the SGD optimizer with a momentum of 0.9, a learning rate of 0.1, and a weight decay of 5e-4. To dynamically adjust the learning rate throughout the 200 epochs, we use the cosine decay learning rate schedule [35]. For datasets with smaller sizes, we set the batch size to 512 on 1 GPU, while for ImageNet-100 and DomainNet, the batch size was set to 1536 on 8 GPUs. All experiments were implemented using PyTorch and executed on NVIDIA GTX 2080Ti GPUs. (An optimizer/scheduler sketch matching this setup follows the table.)
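
The Pseudocode row points to the paper's Table 7 (PyTorch-like pseudocode of the spectral contrastive loss and its supervised version). That code is not reproduced here; the following is only a minimal sketch of the generic spectral contrastive loss, with an assumed zero-mean penalty (the squared norm of the batch-mean embedding) standing in for the paper's zero-mean regularization, whose exact form may differ.

```python
# Minimal sketch of a spectral contrastive loss with an optional zero-mean
# penalty. Follows the generic formulation -2*E[f(x)^T f(x+)] + E[(f(x)^T f(x'))^2];
# the paper's Table 7 pseudocode and its regularizer may be defined differently.
import torch


def spectral_contrastive_loss(z1: torch.Tensor, z2: torch.Tensor,
                              zero_mean_weight: float = 0.0) -> torch.Tensor:
    """z1, z2: (batch, dim) embeddings of two augmented views of the same images."""
    # Attraction term: inner products of positive pairs.
    pos = -2.0 * (z1 * z2).sum(dim=1).mean()
    # Repulsion term: squared inner products over all cross-view pairs,
    # used here as a simple stand-in for independent negatives.
    neg = (z1 @ z2.T).pow(2).mean()
    loss = pos + neg
    if zero_mean_weight > 0.0:
        # Assumed zero-mean regularizer: penalize the squared norm of the
        # batch-mean embedding so that representations are centered.
        mean_z = torch.cat([z1, z2], dim=0).mean(dim=0)
        loss = loss + zero_mean_weight * mean_z.pow(2).sum()
    return loss
```

In the self-supervised setting, `z1` and `z2` are encoder outputs for two augmentations of the same batch; the supervised version in Table 7 would instead draw positive pairs from same-class samples, which this sketch omits.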
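The Dataset Splits row describes the evaluation protocol: a linear classifier trained on fixed representations from contrastive pre-training, reported as top-1 accuracy. A minimal sketch of that protocol, assuming a frozen `encoder` and placeholder names such as `train_loader`, `val_loader`, and `feat_dim`:

```python
# Sketch of linear probing on frozen features, as described in the
# experimental-details excerpt. `encoder` is any pretrained backbone
# (e.g., PreAct-ResNet-18) whose parameters stay fixed.
import torch
import torch.nn as nn


def linear_probe(encoder, train_loader, val_loader, feat_dim, num_classes,
                 epochs=100, lr=0.1, device="cuda"):
    encoder.eval()                      # freeze the backbone
    for p in encoder.parameters():
        p.requires_grad_(False)

    clf = nn.Linear(feat_dim, num_classes).to(device)
    opt = torch.optim.SGD(clf.parameters(), lr=lr, momentum=0.9)

    for _ in range(epochs):
        for x, y in train_loader:
            x, y = x.to(device), y.to(device)
            with torch.no_grad():
                feats = encoder(x)      # fixed representations
            loss = nn.functional.cross_entropy(clf(feats), y)
            opt.zero_grad()
            loss.backward()
            opt.step()

    # Top-1 accuracy on the validation set.
    correct = total = 0
    with torch.no_grad():
        for x, y in val_loader:
            x, y = x.to(device), y.to(device)
            pred = clf(encoder(x)).argmax(dim=1)
            correct += (pred == y).sum().item()
            total += y.numel()
    return correct / total
```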
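The Experiment Setup row fully specifies the optimizer and learning-rate schedule, so the corresponding PyTorch setup is straightforward; `model` is a placeholder for the encoder being trained.

```python
# Sketch of the optimization setup quoted in the Experiment Setup row:
# SGD (momentum 0.9, lr 0.1, weight decay 5e-4) with cosine decay over 200 epochs.
import torch


def build_optimizer(model, epochs=200, lr=0.1):
    optimizer = torch.optim.SGD(model.parameters(), lr=lr,
                                momentum=0.9, weight_decay=5e-4)
    # Cosine annealing of the learning rate across all epochs.
    scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=epochs)
    return optimizer, scheduler

# Typical usage: call scheduler.step() once per epoch after the training loop.
```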