Your Contrastive Learning Is Secretly Doing Stochastic Neighbor Embedding
Authors: Tianyang Hu, Zhili Liu, Fengwei Zhou, Wenjia Wang, Weiran Huang
ICLR 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Through comprehensive numerical experiments, we show that the modified t-SimCLR outperforms the baseline with 90% fewer feature dimensions on CIFAR-10, and t-MoCo-v2 pretrained on ImageNet significantly outperforms the baseline in various domain transfer and OOD tasks. (A hedged sketch of the Student's-t contrastive kernel this points to appears after the table.) |
| Researcher Affiliation | Collaboration | Tianyang Hu¹, Zhili Liu¹,², Fengwei Zhou¹, Wenjia Wang²,³, Weiran Huang¹,⁴. ¹Huawei Noah's Ark Lab; ²Hong Kong University of Science and Technology; ³Hong Kong University of Science and Technology (Guangzhou); ⁴Qing Yuan Research Institute, Shanghai Jiao Tong University |
| Pseudocode | No | The paper describes its methods and derivations mathematically and textually, but does not include any structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper mentions using the official implementation of MoCo-v2 but does not provide a statement or link for the open-source code of their own proposed methods (t-SimCLR or t-MoCo-v2). |
| Open Datasets | Yes | CIFAR-10 (Krizhevsky, 2009) is a color image dataset with 50,000 training samples and 10,000 test samples from 10 categories. |
| Dataset Splits | Yes | We follow transfer settings in Ericsson et al. (2021) to finetune the pre-trained models. ... We also follow Ericsson et al. (2021) to split each dataset into training, validation, and test sets. |
| Hardware Specification | No | The paper does not provide specific details about the hardware (e.g., GPU models, CPU types, or memory) used to run the experiments. |
| Software Dependencies | No | The paper mentions using frameworks like SimCLR and MoCo-v2, and models like ResNet-18, but does not list specific software dependencies with version numbers (e.g., Python, PyTorch, or CUDA versions). |
| Experiment Setup | Yes | We train each model with a batch size of 256 for 200 epochs for quicker evaluation. For t-SimCLR, unless specified otherwise, we grid search tdf and τ over the ranges {1, 2, 5, 10} and {1, 2, 5, 10}, respectively. ... For SimCLR, we use a batch size of 512, learning rate of 0.3, temperature of 0.7, and weight decay of 0.0001. For t-SimCLR, we use a batch size of 512, learning rate of 0.8, temperature of 10, weight decay of 0.0002, and tdf = 5. (A hedged configuration sketch based on these values appears after the table.) |
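
The title and the extracted result suggest that the paper's t-SimCLR and t-MoCo-v2 variants swap the usual Gaussian (exponentiated cosine) similarity in the contrastive objective for a heavy-tailed Student's-t kernel, mirroring the step from SNE to t-SNE. The sketch below contrasts the two kernels inside a toy InfoNCE-style loss. The function `info_nce_loss`, the `tau`/`tdf` parameterization, and the exact kernel form are our own illustration under that assumption and may differ from the paper's formulation.

```python
import torch
import torch.nn.functional as F

def info_nce_loss(z1, z2, kernel="gaussian", tau=0.5, tdf=5.0):
    """Toy InfoNCE-style loss over a batch of paired embeddings.

    kernel="gaussian" mimics the standard SimCLR similarity (cosine / temperature);
    kernel="student_t" swaps in a heavy-tailed Student's-t kernel with `tdf`
    degrees of freedom, in the spirit of the t-SNE analogy. Illustrative only;
    the paper's actual loss may differ.
    """
    z1, z2 = F.normalize(z1, dim=1), F.normalize(z2, dim=1)
    z = torch.cat([z1, z2], dim=0)                        # (2N, d) stacked views
    if kernel == "gaussian":
        logits = z @ z.t() / tau                          # cosine similarity scaled by temperature
    else:
        sq_dist = torch.cdist(z, z).pow(2)                # pairwise squared Euclidean distances
        logits = -0.5 * (tdf + 1.0) * torch.log1p(sq_dist / (tau * tdf))
    n = z1.size(0)
    logits.fill_diagonal_(float("-inf"))                  # exclude self-similarity
    targets = torch.cat([torch.arange(n, 2 * n), torch.arange(0, n)])  # positive pair indices
    return F.cross_entropy(logits, targets)

# Usage (hypothetical encoder and augmentations):
# z1, z2 = encoder(aug1(x)), encoder(aug2(x))
# loss = info_nce_loss(z1, z2, kernel="student_t", tau=10.0, tdf=5.0)
```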
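
For the experiment-setup row, a minimal configuration and grid-search sketch is given below, using only the hyperparameters quoted above (batch size, learning rate, temperature, weight decay, tdf, and the {1, 2, 5, 10} search ranges). The dictionaries and the `train_and_eval` callback are hypothetical names, not the authors' code.

```python
from itertools import product

# Hyperparameters quoted in the paper's setup; variable names here are our own.
simclr_cfg = dict(batch_size=512, lr=0.3, temperature=0.7, weight_decay=1e-4)
t_simclr_cfg = dict(batch_size=512, lr=0.8, temperature=10.0, weight_decay=2e-4, tdf=5)

# Grid-search ranges reported for t-SimCLR: tdf and temperature both in {1, 2, 5, 10}.
tdf_grid = [1, 2, 5, 10]
tau_grid = [1, 2, 5, 10]

def run_grid_search(train_and_eval):
    """`train_and_eval(cfg) -> metric` is a hypothetical user-supplied callback,
    e.g. returning linear-probe accuracy after the 200-epoch pretraining run."""
    best_cfg, best_metric = None, float("-inf")
    for tdf, tau in product(tdf_grid, tau_grid):
        cfg = {**t_simclr_cfg, "tdf": tdf, "temperature": tau}
        metric = train_and_eval(cfg)
        if metric > best_metric:
            best_cfg, best_metric = cfg, metric
    return best_cfg, best_metric
```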