Your Contrastive Learning Is Secretly Doing Stochastic Neighbor Embedding

Authors: Tianyang Hu, Zhili Liu, Fengwei Zhou, Wenjia Wang, Weiran Huang

ICLR 2023

Reproducibility assessment: each entry below gives the variable, the result, and the LLM response.
Research Type: Experimental. Through comprehensive numerical experiments, we show that the modified t-SimCLR outperforms the baseline with 90% fewer feature dimensions on CIFAR-10, and that t-MoCo-v2 pretrained on ImageNet significantly outperforms it on various domain transfer and OOD tasks.
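As the title indicates, the paper reinterprets the InfoNCE objective as stochastic neighbor embedding and, following t-SNE, swaps the Gaussian-style similarity kernel for a Student-t kernel, yielding t-SimCLR and t-MoCo-v2. The sketch below shows what such a t-kernel contrastive loss could look like in PyTorch; it is a minimal illustration, not the authors' code. The function name `t_simclr_loss` and the exact placement of the temperature are our assumptions; `tdf` (degrees of freedom) and `tau` (temperature) are named after the hyperparameters quoted in the Experiment Setup row below.

```python
import torch
import torch.nn.functional as F

def t_simclr_loss(z1, z2, tdf=5.0, tau=10.0):
    """Hypothetical t-kernel InfoNCE loss (illustration only).

    z1, z2: (N, d) embeddings of two augmented views of the same batch.
    """
    z = torch.cat([z1, z2], dim=0)                      # (2N, d)
    # Pairwise squared Euclidean distances, scaled by the temperature
    # (where the temperature enters is an assumption on our part).
    d2 = torch.cdist(z, z).pow(2) / tau
    # Student-t kernel as in t-SNE, (1 + d^2/df)^(-(df+1)/2),
    # kept in log space so it can feed cross_entropy directly.
    logits = -(tdf + 1.0) / 2.0 * torch.log1p(d2 / tdf)
    n = z1.shape[0]
    # Row i's positive is its other view: i+n for the first half, i-n after.
    targets = torch.cat([torch.arange(n, 2 * n), torch.arange(0, n)]).to(z.device)
    # Exclude self-pairs from the softmax.
    logits.fill_diagonal_(float("-inf"))
    return F.cross_entropy(logits, targets)
```

The heavy tail of the t-kernel penalizes moderately distant pairs less than an exponential kernel does, the same crowding-relief intuition that motivates t-SNE over SNE.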
Researcher Affiliation: Collaboration. Tianyang Hu (1), Zhili Liu (1,2), Fengwei Zhou (1), Wenjia Wang (2,3), Weiran Huang (1,4); (1) Huawei Noah's Ark Lab, (2) Hong Kong University of Science and Technology, (3) Hong Kong University of Science and Technology (Guangzhou), (4) Qing Yuan Research Institute, Shanghai Jiao Tong University.
Pseudocode: No. The paper describes its methods and derivations mathematically and in prose, but includes no structured pseudocode or algorithm blocks.
Open Source Code: No. The paper mentions using the official implementation of MoCo-v2 but does not provide a statement or link releasing code for its own proposed methods (t-SimCLR or t-MoCo-v2).
Open Datasets: Yes. CIFAR-10 (Krizhevsky, 2009) is a colorful image dataset with 50,000 training samples and 10,000 test samples from 10 categories.
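Since the CIFAR-10 split quoted above is the standard one that ships with the dataset, a torchvision loader reproduces it directly. The paper does not say which loader was used, so this snippet is purely illustrative:

```python
from torchvision import datasets, transforms

to_tensor = transforms.ToTensor()
# train=True/False selects the standard 50,000/10,000 split cited above.
train_set = datasets.CIFAR10(root="./data", train=True, download=True, transform=to_tensor)
test_set = datasets.CIFAR10(root="./data", train=False, download=True, transform=to_tensor)
assert len(train_set) == 50_000 and len(test_set) == 10_000
```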
Dataset Splits: Yes. We follow transfer settings in Ericsson et al. (2021) to finetune the pre-trained models. ... We also follow Ericsson et al. (2021) to split each dataset into training, validation, and test sets.
Hardware Specification: No. The paper does not provide specific details about the hardware (e.g., GPU models, CPU types, or memory) used to run the experiments.
Software Dependencies: No. The paper mentions frameworks such as SimCLR and MoCo-v2 and models such as ResNet-18, but does not list specific software dependencies with version numbers (e.g., Python, PyTorch, or CUDA versions).
Experiment Setup: Yes. We train each model with a batch size of 256 for 200 epochs for quicker evaluation. For t-SimCLR, unless specified otherwise, we grid search tdf and τ over the ranges {1, 2, 5, 10} and {1, 2, 5, 10}, respectively. ... For SimCLR, we use a batch size of 512, a learning rate of 0.3, a temperature of 0.7, and a weight decay of 0.0001. For t-SimCLR, we use a batch size of 512, a learning rate of 0.8, a temperature of 10, a weight decay of 0.0002, and tdf = 5.
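Gathered in one place, the quoted hyperparameters read as follows. The dictionary keys are our own labels; the values are verbatim from the setup above (note that the 256-batch/200-epoch setting and the two 512-batch settings come from different passages, separated by the elision in the quote):

```python
# Hyperparameters as quoted in the paper; key names are ours.
QUICK_EVAL = dict(batch_size=256, epochs=200)
GRID_SEARCH = dict(tdf=[1, 2, 5, 10], tau=[1, 2, 5, 10])   # t-SimCLR search space
SIMCLR = dict(batch_size=512, lr=0.3, temperature=0.7, weight_decay=1e-4)
T_SIMCLR = dict(batch_size=512, lr=0.8, temperature=10, weight_decay=2e-4, tdf=5)
```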