Continual Vision-Language Representation Learning with Off-Diagonal Information

Authors: Zixuan Ni, Longhui Wei, Siliang Tang, Yueting Zhuang, Qi Tian

ICML 2023

| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | "Experiments on commonly used datasets with different scales and scopes have demonstrated the effectiveness of our method." |
| Researcher Affiliation | Collaboration | Zhejiang University and Huawei Cloud. |
| Pseudocode | No | The paper describes its method using text and mathematical equations (e.g., Equations 1-5) but does not include any clearly labeled "Pseudocode" or "Algorithm" blocks or figures. |
| Open Source Code | No | The paper mentions the OpenAI source code in Section 3 for the CLIP baseline setup, but it does not include an explicit statement from the authors about releasing their own code for the Mod-X framework or a direct link to their repository. |
| Open Datasets | Yes | MS COCO Captions (Lin et al., 2014) is a widely used image caption dataset ... Flickr30K (Young et al., 2014) contains 30K training images ... ECommerce-T2I (Yang et al., 2021) is a text-to-image e-commerce dataset ... |
| Dataset Splits | Yes | MS COCO Captions (Lin et al., 2014) contains 80K training images, 30K validation images, and 5K testing images (COCO(5K)). |
| Hardware Specification | Yes | "All of the experiments are conducted on 8 NVIDIA V100 GPUs." |
| Software Dependencies | No | The paper mentions software components such as the AdamW optimizer and refers to the OpenAI source code for CLIP, but it does not specify version numbers for programming languages, libraries, or frameworks (e.g., "PyTorch 1.9", "Python 3.8"). |
| Experiment Setup | Yes | In the exploration experiments (Section 3) and Experiment 5.2, the hyper-parameters shown in Table 3(a) are used. Since Experiment 5.3 starts from the pre-trained ViT-B/32 model (OpenAI), the learning rate is lowered from 5e-4 to 1e-6; the other hyper-parameters are consistent with Experiment 5.2 and CLIP (OpenAI). |
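For readers reconstructing the Experiment 5.3 setup described in the last row, the snippet below is a minimal sketch, assuming PyTorch and the OpenAI `clip` package (https://github.com/openai/CLIP). Only the AdamW optimizer, the ViT-B/32 starting checkpoint, and the 5e-4 / 1e-6 learning rates are taken from the report; the betas and weight decay are illustrative assumptions, not values stated in the paper.

```python
# Hedged sketch of the reported optimizer configuration for Experiment 5.3.
# Assumes PyTorch and the OpenAI `clip` package; values marked "assumption"
# are not reported in the paper.
import torch
import clip

device = "cuda" if torch.cuda.is_available() else "cpu"

# Start from the released ViT-B/32 CLIP checkpoint, as in Experiment 5.3.
model, preprocess = clip.load("ViT-B/32", device=device)

# The report states the learning rate is lowered from 5e-4 (exploration
# experiments / Experiment 5.2) to 1e-6 when continuing from the pre-trained model.
lr = 1e-6  # use 5e-4 when training from scratch, per the report

optimizer = torch.optim.AdamW(
    model.parameters(),
    lr=lr,
    betas=(0.9, 0.98),   # assumption: CLIP-style betas, not stated in the report
    weight_decay=0.2,    # assumption: CLIP-style weight decay, not stated
)
```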