CompRess: Self-Supervised Learning by Compressing Representations
Authors: Soroush Abbasi Koohpayegani, Ajinkya Tejankar, Hamed Pirsiavash
NeurIPS 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our extensive experiments show that our compressed SSL models outperform state-of-the-art compression methods as well as state-of-the-art SSL counterparts using the same architecture on most downstream tasks. Our AlexNet model, compressed from ResNet-50x4 trained with the SimCLR method, outperforms the standard supervised AlexNet model on linear evaluation (by 2 points), in nearest neighbor (by 9 points), and in cluster alignment evaluation (by 4 points). This is interesting as all parameters of the supervised model are already trained on the downstream task itself, while the SSL model and its teacher have seen only ImageNet without labels. To the best of our knowledge, this is the first time an SSL model performs better than the supervised one on the ImageNet task itself rather than in transfer learning settings. |
| Researcher Affiliation | Academia | Soroush Abbasi Koohpayegani Ajinkya Tejankar Hamed Pirsiavash University of Maryland, Baltimore County {soroush,at6,hpirsiav}@umbc.edu |
| Pseudocode | No | The paper describes the method using mathematical equations and textual explanations, but no explicit pseudocode or algorithm blocks are provided. |
| Open Source Code | Yes | Our code is available here: https://github.com/UMBCvision/CompRess |
| Open Datasets | Yes | We use ImageNet (ILSVRC2012) [46] without labels for all self-supervised and compression methods, and use various datasets (ImageNet, PASCAL-VOC [16], Places [66], CUB200 [55], and Cars196 [32]) for evaluation. |
| Dataset Splits | Yes | We treat the student as a frozen feature extractor and train a linear classifier on the labeled training set of ImageNet and evaluate it on the validation set with Top-1 accuracy. ... We evaluate our representations by a NN classifier using only 1%, 10%, and only 1 sample per category of ImageNet. The results are shown in Table 3. For 1-shot, the Ours-2q model achieves an accuracy close to the supervised model, which has seen all labels of ImageNet in learning the features. |
| Hardware Specification | Yes | Compressing from ResNet-50x4 to ResNet-50 takes ~100 hours on four Titan-RTX GPUs, while compressing from ResNet-50 to ResNet-18 takes ~90 hours on two 2080 Ti GPUs. |
| Software Dependencies | No | The paper mentions using 'PyTorch' and the 'FAISS GPU library [1]' but does not provide specific version numbers for these software components. |
| Experiment Setup | Yes | We use PyTorch along with SGD (weight decay=1e-4, learning rate=0.01, momentum=0.9, epochs=130, and batch size=256). We multiply the learning rate by 0.2 at epochs 90 and 120. ... We use a memory bank size of 128,000 and set the moving average weight for the key encoder to 0.999. We use a temperature of 0.04 for all experiments involving SimCLR ResNet-50x4 and MoCo ResNet-50 teachers. |
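
The nearest-neighbor evaluation quoted in the Dataset Splits row treats the student as a frozen feature extractor and assigns each test sample the label of its most similar training feature. A minimal sketch of that protocol follows, assuming L2-normalized features compared by cosine similarity; the function name `nn_classify` and the toy arrays are illustrative (the paper uses the FAISS GPU library for this at ImageNet scale):

```python
import numpy as np

def nn_classify(train_feats, train_labels, test_feats):
    """1-NN classification on frozen features.

    Features are L2-normalized so the dot product equals cosine similarity;
    each test sample receives the label of its nearest training feature.
    """
    train = train_feats / np.linalg.norm(train_feats, axis=1, keepdims=True)
    test = test_feats / np.linalg.norm(test_feats, axis=1, keepdims=True)
    sims = test @ train.T                       # cosine similarity matrix
    return train_labels[np.argmax(sims, axis=1)]

# Toy example: the test feature is closest to the class-0 training feature.
train_feats = np.array([[1.0, 0.0], [0.0, 1.0]])
train_labels = np.array([0, 1])
test_feats = np.array([[0.9, 0.1]])
print(nn_classify(train_feats, train_labels, test_feats))  # → [0]
```

Restricting `train_feats` to 1%, 10%, or one sample per category reproduces the limited-label evaluations reported in Table 3.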
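
The schedule and momentum settings in the Experiment Setup row can be sketched in plain Python. The helper names `step_lr` and `ema_update` are hypothetical; the constants (base learning rate 0.01, decay factor 0.2 at epochs 90 and 120, key-encoder momentum 0.999) come directly from the quoted setup:

```python
def step_lr(epoch, base_lr=0.01, milestones=(90, 120), gamma=0.2):
    """Stepwise schedule: multiply the learning rate by gamma at each milestone."""
    lr = base_lr
    for m in milestones:
        if epoch >= m:
            lr *= gamma
    return lr

def ema_update(key_param, query_param, m=0.999):
    """Moving-average update for one key-encoder parameter (MoCo-style)."""
    return m * key_param + (1.0 - m) * query_param

print(step_lr(0))    # → 0.01
print(step_lr(100))  # → 0.002
```

In a PyTorch training loop the same schedule would typically be expressed with `torch.optim.lr_scheduler.MultiStepLR(optimizer, milestones=[90, 120], gamma=0.2)`, with `ema_update` applied to every key-encoder parameter after each optimizer step.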