Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
Expediting Contrastive Language-Image Pretraining via Self-Distilled Encoders
Authors: Bumsoo Kim, Jinhyung Kim, Yeonsik Jo, Seung Hwan Kim
AAAI 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Through our extensive experiments, we validate that there is a sweet spot between expedition and distillation where the partial view from the expedited online image encoder interacts complementarily with the momentum teacher. As a result, ECLIPSE outperforms its counterparts while achieving substantial acceleration in inference speed. |
| Researcher Affiliation | Industry | Bumsoo Kim*, Jinhyung Kim, Yeonsik Jo, Seung Hwan Kim LG AI Research *correspondence to: EMAIL |
| Pseudocode | No | No pseudocode or algorithm blocks were found in the paper. |
| Open Source Code | No | For implementation details, our work is built on top of the open-source SLIP codebase (Mu et al. 2021)1. For DeCLIP (Li et al. 2022), we follow the implementation details of the official code release2. Footnotes link to 'https://github.com/facebookresearch/SLIP' and 'https://github.com/Sense-GVT/DeCLIP', which are external codebases, not the authors' own code for ECLIPSE. |
| Open Datasets | Yes | we pretrain ECLIPSE on large-scale open-source datasets, CC (Conceptual Captions) 3M (Sharma et al. 2018) and YFCC (Yahoo Flickr Creative Commons) 15M (Thomee et al. 2016). |
| Dataset Splits | No | The paper mentions pretraining on CC3M and YFCC15M datasets and evaluating on downstream datasets, but it does not explicitly state the training, validation, and test splits for the pretraining datasets. |
| Hardware Specification | Yes | All of our models are pretrained in 16 A100 GPUs. |
| Software Dependencies | No | The paper mentions building on 'open-source SLIP codebase' and following 'official code release' for DeCLIP, but it does not specify version numbers for Python, PyTorch, CUDA, or other specific software libraries. |
| Experiment Setup | Yes | "All models are pretrained on the CC3M dataset with a learning rate 5e-4 for 40 epochs. We use κ=0.7 for EViT with a ViT-B/16 backbone." and "We use m = 0.994 in our experiments." |
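
The Experiment Setup row quotes a momentum value m = 0.994 for the momentum teacher but does not state the update rule. As a rough illustration only, the sketch below assumes the conventional exponential moving average, teacher ← m · teacher + (1 − m) · student, applied to plain lists of floats. The names `SETUP` and `ema_update` and the dictionary keys are hypothetical and do not come from the paper or the SLIP codebase; only the numeric values are taken from the table.

```python
# Hyperparameters as quoted in the table; key names are hypothetical.
SETUP = {
    "dataset": "CC3M",
    "learning_rate": 5e-4,
    "epochs": 40,
    "kappa": 0.7,          # value reported for EViT
    "backbone": "ViT-B/16",
    "momentum_m": 0.994,   # momentum teacher coefficient
}


def ema_update(teacher, student, m):
    """Return new teacher weights under an assumed standard EMA rule.

    teacher, student: equal-length lists of floats standing in for parameters.
    m: momentum in [0, 1]; larger values make the teacher change more slowly.
    """
    if len(teacher) != len(student):
        raise ValueError("teacher and student must have the same length")
    if not 0.0 <= m <= 1.0:
        raise ValueError("m must lie in [0, 1]")
    return [m * t + (1.0 - m) * s for t, s in zip(teacher, student)]


if __name__ == "__main__":
    teacher = [1.0, 0.0]
    student = [0.0, 1.0]
    # One update step with the momentum value from the table.
    print(ema_update(teacher, student, SETUP["momentum_m"]))
```

With m this close to 1, each step moves the teacher only 0.6% of the way toward the student, which is the usual motivation for a slowly varying teacher in self-distillation setups.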