K-LITE: Learning Transferable Visual Models with External Knowledge
Authors: Sheng Shen, Chunyuan Li, Xiaowei Hu, Yujia Xie, Jianwei Yang, Pengchuan Zhang, Zhe Gan, Lijuan Wang, Lu Yuan, Ce Liu, Kurt Keutzer, Trevor Darrell, Anna Rohrbach, Jianfeng Gao
NeurIPS 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We study the performance of K-LITE on two important computer vision problems, image classification and object detection, benchmarking on 20 and 13 different existing datasets, respectively. The proposed knowledge-augmented models show significant improvement in transfer learning performance over existing methods. |
| Researcher Affiliation | Collaboration | Microsoft; University of California, Berkeley |
| Pseudocode | No | The paper describes methods and processes but does not include any formally structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | Our code is available at https://github.com/microsoft/klite. |
| Open Datasets | Yes | We pre-train on ImageNet-21K [15] and GCC [76, 11]/YFCC [84] datasets... Following GLIP [50], we pre-train on Object365 [75]... |
| Dataset Splits | No | The paper mentions training and testing datasets (Table 1) and evaluates performance metrics but does not explicitly specify validation dataset splits or their sizes in the main text. |
| Hardware Specification | No | The main text states that hardware specifications are provided in the Appendix, which is not available for analysis. |
| Software Dependencies | No | The paper mentions 'spaCy [32]' without a version number, and refers to other models/frameworks like CLIP, ALIGN, UniCL, and GLIP, but does not provide specific version numbers for software dependencies or libraries used for implementation. |
| Experiment Setup | No | The main text states that training details including hyperparameters are provided in the Appendix, which is not available for analysis. |
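For readers unfamiliar with the method behind the "knowledge-augmented models" noted in the Research Type row: K-LITE enriches the natural-language prompts of CLIP/UniCL-style models with external knowledge, such as WordNet and Wiktionary definitions of the class names. The snippet below is a minimal sketch of that prompt-augmentation idea, not the authors' implementation; the function name and prompt template are illustrative assumptions, and the actual pipeline lives in the repository linked above (https://github.com/microsoft/klite).

```python
# Sketch: enrich each class name with an external-knowledge gloss
# (here WordNet via NLTK) before building the text prompt for a
# CLIP-style zero-shot classifier. Illustrative only; K-LITE's real
# pipeline also draws on Wiktionary and uses its own templates.
from nltk.corpus import wordnet as wn  # requires: nltk.download("wordnet")


def knowledge_augmented_prompt(class_name: str) -> str:
    """Append a WordNet definition to a plain CLIP-style prompt, if one exists."""
    prompt = f"a photo of a {class_name}"
    synsets = wn.synsets(class_name.replace(" ", "_"))
    if synsets:  # fall back to the plain prompt for out-of-vocabulary names
        prompt += f", {synsets[0].definition()}"
    return prompt


if __name__ == "__main__":
    for name in ["marimba", "rose hip", "boathouse"]:
        print(knowledge_augmented_prompt(name))
```

The design point this illustrates is that rare or ambiguous class names (e.g., fine-grained categories) carry little signal on their own, so appending a dictionary-style gloss gives the text encoder more to match against, which is the source of the transfer gains the paper reports.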