FILIP: Fine-grained Interactive Language-Image Pre-Training

Authors: Lewei Yao, Runhui Huang, Lu Hou, Guansong Lu, Minzhe Niu, Hang Xu, Xiaodan Liang, Zhenguo Li, Xin Jiang, Chunjing Xu

ICLR 2022

| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Section heading '4 EXPERIMENTS' and the statement 'Experiments show that FILIP achieves state-of-the-art performance on multiple downstream vision-language tasks including zero-shot image classification and image-text retrieval.' |
| Researcher Affiliation | Collaboration | 1 Huawei Noah's Ark Lab, 2 Hong Kong University of Science and Technology, 3 Sun Yat-sen University |
| Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper mentions using 'the LAMB optimizer implemented by the cybertronai's open-source repository (https://github.com/cybertronai/pytorch-lamb)' but does not state that the code for FILIP itself is open source or provide a link to it. (A usage sketch for this optimizer follows the table.) |
| Open Datasets | Yes | 'We also use 3 public datasets, including Conceptual Captions 3M (CC3M) (Sharma et al., 2018), Conceptual 12M (CC12M) (Changpinyo et al., 2021) and Yahoo Flickr Creative Commons 100M (YFCC100M) (Thomee et al., 2016).' |
| Dataset Splits | No | The paper describes training and test sets for evaluation but does not explicitly describe a dedicated validation split for hyperparameter tuning or early stopping. |
| Hardware Specification | Yes | 'The training is mainly conducted on Nvidia V100 GPUs and Ascend cards.' |
| Software Dependencies | No | The paper mentions software such as the LAMB optimizer, scikit-learn, and a PyTorch-based codebase, but does not provide specific version numbers for these dependencies. |
| Experiment Setup | Yes | 'Table 8 summarizes the common hyperparameters and Table 9 shows the model- and dataset-specific hyperparameters for FILIP pre-training. Table 10 shows the hyperparameters for image-text retrieval fine-tuning. Table 13 shows the hyperparameters used in linear probe on ImageNet.' |
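
To make the optimizer dependency from the Open Source Code row concrete, the sketch below shows one plausible way to use the cited cybertronai pytorch-lamb repository. This is a minimal sketch, not the FILIP authors' code: the import path and constructor arguments are assumed from that repository, and the hyperparameter values are placeholders rather than the settings reported in the paper's Tables 8 and 9.

```python
# Minimal sketch (not from the FILIP paper): using the LAMB optimizer from
# https://github.com/cybertronai/pytorch-lamb, the repository the paper cites.
# Assumed install: pip install git+https://github.com/cybertronai/pytorch-lamb
import torch
from pytorch_lamb import Lamb  # assumed import path from that repository

# Placeholder model; FILIP's dual-encoder architecture is not reproduced here.
model = torch.nn.Linear(512, 512)

# Hyperparameter values are illustrative placeholders, NOT the paper's settings.
optimizer = Lamb(model.parameters(), lr=1e-3, weight_decay=0.01, betas=(0.9, 0.999))

# One standard training step with a dummy loss.
loss = model(torch.randn(8, 512)).pow(2).mean()
loss.backward()
optimizer.step()
optimizer.zero_grad()
```

Reproducing FILIP's training would additionally require the hyperparameters from its appendix tables and the model code, which the paper does not release.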