Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
Slimmed Asymmetrical Contrastive Learning and Cross Distillation for Lightweight Model Training
Authors: Jian Meng, Li Yang, Kyungmin Lee, Jinwoo Shin, Deliang Fan, Jae-sun Seo
NeurIPS 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Compared to the SoTA lightweight CL training (distillation) algorithms, SACL-XD achieves 1.79% ImageNet-1K accuracy improvement on MobileNet-V3 with 64× training FLOPs reduction. Code is available at https://github.com/mengjian0502/SACL-XD. Table 1: ImageNet-1K test accuracy with linear evaluation protocol based on MobileNet-V3 [22] trained by different contrastive learning/distillation methods. |
| Researcher Affiliation | Academia | Jian Meng, Li Yang, Kyungmin Lee, Jinwoo Shin, Deliang Fan, and Jae-sun Seo. Cornell Tech, USA; University of North Carolina at Charlotte, USA; KAIST, South Korea; Johns Hopkins University, USA. |
| Pseudocode | Yes | Algorithm 1: PyTorch-style pseudocode for the proposed algorithm |
| Open Source Code | Yes | Code is available at https://github.com/mengjian0502/SACL-XD. |
| Open Datasets | Yes | We evaluate the performance of the proposed algorithm based on CNN encoders (MobileNet [23, 22], EfficientNet [29], ResNet [20]) and ViT [13] models on the ImageNet-1K and ImageNet-100 datasets. We also demonstrate the capability of the proposed method with tiny-sized ResNet on the small CIFAR dataset. (Tables 8, 9, and 10 detail the augmentation for ImageNet-1K, ImageNet-100, and CIFAR-10, respectively.) |
| Dataset Splits | Yes | We follow the linear evaluation protocol on ImageNet to evaluate the performance of the backbone trained by the proposed SACL and cross-distillation (XD) algorithm. We follow the data augmentation setup in [12] for the CIFAR-10 dataset. |
| Hardware Specification | Yes | Table 11: Training time comparison between the proposed method and the distillation-based CL... GPU Type A100 (80G) |
| Software Dependencies | No | The paper mentions 'PyTorch-style pseudocode' and uses the 'LARS optimizer' but does not specify version numbers for PyTorch, Python, or other key software libraries. |
| Experiment Setup | Yes | Appendix A.3 Detailed Experimental Setup of Pre-training: The encoders (MobileNet, EfficientNet, ResNet-50) are trained on ImageNet-1K with 100/200/300 epochs from scratch with the proposed method. We set the batch to 256 with a learning rate = 0.8. We employ the LARS optimizer with weight decay set to 1.5e-6. We set the correlation weights λ to 0.005. The hidden layer dimension of the projector is 4096. The detailed data augmentation is summarized in Table 8. |
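
The pre-training hyperparameters quoted in the Experiment Setup row can be collected into a single configuration object. The following is a minimal sketch and is not taken from the authors' repository: the class name `PretrainConfig`, its field names, and the helper `epoch_variants` are hypothetical, and only the numeric values come from the quoted Appendix A.3 text.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class PretrainConfig:
    """Values quoted from Appendix A.3; all field names are hypothetical."""

    dataset: str = "ImageNet-1K"
    epochs: int = 100                  # the paper reports 100/200/300-epoch runs
    batch_size: int = 256
    learning_rate: float = 0.8
    optimizer: str = "LARS"
    weight_decay: float = 1.5e-6
    correlation_weight: float = 0.005  # the "correlation weights" lambda
    projector_hidden_dim: int = 4096


def epoch_variants(base, epochs=(100, 200, 300)):
    """Return one config per training length, leaving other fields unchanged."""
    return [replace(base, epochs=e) for e in epochs]


if __name__ == "__main__":
    for cfg in epoch_variants(PretrainConfig()):
        print(cfg)
```

Keeping the values in one frozen dataclass makes it easy to compare a rerun against the reported setup; anything the row does not state (for example, the learning-rate schedule) is deliberately left out.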