OT-CLIP: Understanding and Generalizing CLIP via Optimal Transport

Authors: Liangliang Shi, Jack Fan, Junchi Yan

Venue: ICML 2024

Reproducibility assessment: each variable, its result, and the supporting LLM response.
Research Type: Experimental
Evidence: "We show their superior performance on public datasets for downstream tasks in both image and text domain. ... The code is available at https://github.com/fan23j/ICML2024-OT-CLIP. ... 3.3. Experiments on CLIP Training ... 4.4. Experiments on CLIP Inference"

Researcher Affiliation: Academia
Evidence: "1 School of Artificial Intelligence & Department of Computer Science and Engineering & MoE Lab of AI, Shanghai Jiao Tong University, Shanghai, China; 2 Department of Computer Science, University of North Carolina at Chapel Hill."

Pseudocode: Yes
Evidence: "Algorithm 1: PyTorch-style pseudocode for Entropic OT"

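For reference, below is a minimal PyTorch sketch of entropic OT computed via Sinkhorn iterations, the algorithm the pseudocode row refers to. It is an illustration under standard assumptions (uniform marginals, a dense cost matrix), not a reproduction of the paper's Algorithm 1; the function name and the defaults `eps` and `n_iters` are ours.

```python
import torch

def sinkhorn(cost: torch.Tensor, eps: float = 0.1, n_iters: int = 50) -> torch.Tensor:
    """Entropic OT via Sinkhorn iterations (sketch; not the paper's Algorithm 1).

    cost: (n, m) cost matrix, e.g. 1 - cosine similarity between
          a batch's image and text embeddings.
    Returns an (n, m) transport plan with uniform marginals.
    """
    n, m = cost.shape
    a = torch.full((n,), 1.0 / n, device=cost.device)  # source marginal
    b = torch.full((m,), 1.0 / m, device=cost.device)  # target marginal
    K = torch.exp(-cost / eps)                         # Gibbs kernel
    u = torch.ones_like(a)
    v = torch.ones_like(b)
    for _ in range(n_iters):
        u = a / (K @ v)
        v = b / (K.t() @ u)
    # Transport plan P = diag(u) K diag(v).
    return u[:, None] * K * v[None, :]

# Illustrative usage: sim = torch.randn(32, 32); plan = sinkhorn(1.0 - sim)
```

In a CLIP-style batch, the resulting plan can serve as a soft matching between images and texts; `eps` controls the strength of the entropic regularization (smaller values give a sharper, more permutation-like plan).
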
Open Source Code: Yes
Evidence: "The code is available at https://github.com/fan23j/ICML2024-OT-CLIP."

Open Datasets: Yes
Evidence: "Our models are pretrained on the popular Conceptual Captions 3M (CC3M) (Sharma et al., 2018) image-text pairs and primarily evaluated on ImageNet1K (Deng et al., 2009) zero-shot classification."

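As context for the evaluation protocol quoted above, here is a generic sketch of CLIP-style zero-shot scoring. The feature tensors, temperature, and prompt convention are placeholder assumptions; the snippet is not taken from the OT-CLIP repository.

```python
import torch
import torch.nn.functional as F

@torch.no_grad()
def zero_shot_logits(image_features: torch.Tensor,
                     text_features: torch.Tensor,
                     temperature: float = 0.01) -> torch.Tensor:
    """Generic CLIP-style zero-shot scoring (sketch; not the paper's code).

    image_features: (B, D) embeddings from the vision encoder.
    text_features:  (C, D) embeddings of one prompt per class,
                    e.g. "a photo of a {class name}" for ImageNet classes.
    Returns (B, C) logits; argmax over classes gives the prediction.
    """
    image_features = F.normalize(image_features, dim=-1)
    text_features = F.normalize(text_features, dim=-1)
    return image_features @ text_features.t() / temperature

# Illustrative usage: logits = zero_shot_logits(torch.randn(8, 512), torch.randn(1000, 512))
```
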
Dataset Splits: No
Evidence: The paper mentions "validation sets" for generating long-tailed distributions but does not provide specific train/validation/test splits (e.g., percentages or counts) for its experiments, nor does it cite a predefined split used for validation.

Hardware Specification: Yes
Evidence: "For each experiment, we train for 30 epochs with a batch size of 256 across 4 32GB V100 GPUs for an effective batch size of 1024."

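The quoted hardware line implies standard data-parallel training: four processes, one per V100, each with a per-GPU batch of 256, yielding the effective batch of 1024. A hedged sketch using torch.distributed; the dataset and model below are dummy stand-ins, not the paper's pipeline.

```python
import os
import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP
from torch.utils.data import DataLoader, DistributedSampler, TensorDataset

# Hypothetical launch: `torchrun --nproc_per_node=4 train.py` (one process per GPU).
dist.init_process_group(backend="nccl")
local_rank = int(os.environ["LOCAL_RANK"])
torch.cuda.set_device(local_rank)

# Dummy stand-ins; the real pipeline would load CC3M image-text pairs and a CLIP model.
dataset = TensorDataset(torch.randn(4096, 512))
model = torch.nn.Linear(512, 512).cuda(local_rank)

per_gpu_batch = 256                    # batch size per GPU, as reported
sampler = DistributedSampler(dataset)  # shards the data across the 4 ranks
loader = DataLoader(dataset, batch_size=per_gpu_batch, sampler=sampler)
ddp_model = DDP(model, device_ids=[local_rank])
# Effective batch size = per_gpu_batch * world_size = 256 * 4 = 1024.
```
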
Software Dependencies: No
Evidence: The paper provides "PyTorch-style pseudocode" but does not specify version numbers for PyTorch, Python, or any other software libraries used.

Experiment Setup: Yes
Evidence: "For each experiment, we train for 30 epochs with a batch size of 256 across 4 32GB V100 GPUs for an effective batch size of 1024. We use learning rate lr = 5e-4 with the AdamW optimizer and weight decay of 0.1 in all our experiments."
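The quoted hyperparameters map directly onto torch.optim.AdamW; a minimal sketch with a placeholder model (the quote does not specify a learning-rate schedule, so none is shown):

```python
import torch

model = torch.nn.Linear(512, 512)  # placeholder; the paper trains a CLIP model

# As quoted: AdamW with lr = 5e-4 and weight decay 0.1, trained for 30 epochs.
optimizer = torch.optim.AdamW(model.parameters(), lr=5e-4, weight_decay=0.1)
num_epochs = 30
```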