Supervised Contrastive Learning
Authors: Prannay Khosla, Piotr Teterwak, Chen Wang, Aaron Sarna, Yonglong Tian, Phillip Isola, Aaron Maschinot, Ce Liu, Dilip Krishnan
NeurIPS 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | On ResNet-200, we achieve top-1 accuracy of 81.4% on the ImageNet dataset, which is 0.8% above the best number reported for this architecture. We show consistent outperformance over cross-entropy on other datasets and two ResNet variants. |
| Researcher Affiliation | Collaboration | Prannay Khosla (Google Research), Piotr Teterwak (Boston University), Chen Wang (Snap Inc.), Aaron Sarna (Google Research), Yonglong Tian (MIT), Phillip Isola (MIT), Aaron Maschinot (Google Research), Ce Liu (Google Research), Dilip Krishnan (Google Research) |
| Pseudocode | No | The paper describes the contrastive loss functions with mathematical equations and describes the components of the representation-learning framework, but it does not contain structured pseudocode or algorithm blocks. (A hedged code sketch of the loss appears below the table.) |
| Open Source Code | Yes | Our loss function is simple to implement and reference TensorFlow code is released at https://t.ly/supcon. PyTorch implementation: https://github.com/HobbitLong/SupContrast |
| Open Datasets | Yes | We evaluate our SupCon loss (L_out^sup, Eq. 2) by measuring classification accuracy on a number of common image classification benchmarks including CIFAR-10 and CIFAR-100 [27] and ImageNet [7]. |
| Dataset Splits | Yes | We evaluate our SupCon loss (L_out^sup, Eq. 2) by measuring classification accuracy on a number of common image classification benchmarks including CIFAR-10 and CIFAR-100 [27] and ImageNet [7]. |
| Hardware Specification | Yes | On ImageNet, with a memory size of 8192 (requiring only the storage of 128-dimensional vectors), a batch size of 256, and SGD optimizer, running on 8 Nvidia V100 GPUs, SupCon is able to achieve 79.1% top-1 accuracy on ResNet-50. |
| Software Dependencies | No | The paper mentions 'reference TensorFlow code' and a 'PyTorch implementation' but does not specify version numbers for these software dependencies or any other libraries. |
| Experiment Setup | Yes | The SupCon loss was trained for 700 epochs during pretraining for ResNet-200 and 350 epochs for smaller models. We trained our models with batch sizes of up to 6144... All our results used a temperature of τ = 0.1. We experimented with standard optimizers such as LARS [58], RMSProp [20] and SGD with momentum [39]... (A hedged sketch of a matching training step appears below the table.) |
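Although the paper gives the loss only as equations, L_out^sup (Eq. 2) is short enough to state in code. Below is a minimal PyTorch sketch; the function name `supcon_loss` and its single-tensor interface are our own choices, and the released TensorFlow reference code and PyTorch implementation linked above remain the authoritative versions.

```python
import torch

def supcon_loss(features, labels, temperature=0.1):
    """Sketch of the supervised contrastive loss L_out^sup (Eq. 2).

    features: (N, D) L2-normalized projection embeddings.
    labels:   (N,) integer class labels.
    """
    n = features.shape[0]
    device = features.device

    # Pairwise similarity logits z_i . z_a / tau.
    logits = features @ features.T / temperature
    # Subtract the per-row max for numerical stability (standard softmax trick).
    logits = logits - logits.max(dim=1, keepdim=True).values.detach()

    # positive_mask[i, j] = 1 iff j shares i's label and j != i, i.e. j in P(i).
    same_label = torch.eq(labels.view(-1, 1), labels.view(1, -1)).float()
    not_self = 1.0 - torch.eye(n, device=device)
    positive_mask = same_label * not_self

    # Denominator sums over all a != i, i.e. a in A(i).
    exp_logits = torch.exp(logits) * not_self
    log_prob = logits - torch.log(exp_logits.sum(dim=1, keepdim=True))

    # Average log-probability over the positives P(i), then mean over anchors;
    # anchors with no positives contribute zero (clamp avoids division by zero).
    pos_counts = positive_mask.sum(dim=1).clamp(min=1.0)
    mean_log_prob_pos = (positive_mask * log_prob).sum(dim=1) / pos_counts
    return -mean_log_prob_pos.mean()
```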
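The Experiment Setup row quotes epochs, batch sizes, and τ = 0.1 but not the loop that ties them together. Here is a hedged sketch of one pretraining step under the usual two-view recipe, in which each image contributes two augmented views that share its label; `encoder`, `proj_head`, and `train_step` are hypothetical names, not the authors' API.

```python
import torch
import torch.nn.functional as F

def train_step(encoder, proj_head, optimizer, view1, view2, labels,
               temperature=0.1):
    """One pretraining step: two augmented views per image, shared labels."""
    x = torch.cat([view1, view2], dim=0)       # (2N, C, H, W)
    y = torch.cat([labels, labels], dim=0)     # (2N,)

    # Encode, project, and L2-normalize, as assumed by supcon_loss above.
    z = F.normalize(proj_head(encoder(x)), dim=1)

    loss = supcon_loss(z, y, temperature=temperature)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

After pretraining, the projection head is discarded and a linear classifier is trained on top of the frozen encoder, per the evaluation protocol described in the paper.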