Self-Calibrated Tuning of Vision-Language Models for Out-of-Distribution Detection

Authors: Geng Yu, Jianing Zhu, Jiangchao Yao, Bo Han

NeurIPS 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments and analyses have been conducted to characterize and demonstrate the effectiveness of the proposed SCT.
Researcher Affiliation | Academia | 1 CMIC, Shanghai Jiao Tong University; 2 TMLR Group, Hong Kong Baptist University; 3 Shanghai AI Laboratory; 4 RIKEN Center for Advanced Intelligence Project
Pseudocode | Yes | Algorithm 1 Self-Calibrated Tuning (SCT)
Open Source Code | Yes | The code is publicly available at: https://github.com/tmlr-group/SCT.
Open Datasets | Yes | We adopt the ImageNet-1K dataset [Deng et al., 2009] as the ID data. For OOD datasets, we adopt the same ones as in [Huang and Li, 2021], including subsets of iNaturalist [Van Horn et al., 2018], SUN [Xiao et al., 2010], Places [Zhou et al., 2017], and TEXTURE [Cimpoi et al., 2014].
Dataset Splits | Yes | For the few-shot training, we use 1, 2, 4, and 16 shots of ID data for training, respectively, and evaluate models on the full test set. ... The evaluation is performed on the original validation set of ImageNet-1K. (A hedged sketch of this few-shot sampling protocol is given after the table.)
Hardware Specification | Yes | All experiments are conducted with multiple runs on NVIDIA GeForce RTX 3090 GPUs with Python 3.8 and PyTorch 1.12.
Software Dependencies | Yes | All experiments are conducted with multiple runs on NVIDIA GeForce RTX 3090 GPUs with Python 3.8 and PyTorch 1.12.
Experiment Setup | Yes | For the hyperparameter K in the surrogate OOD features extraction, we use 200 in all experiments... For SCT, we adopt λ = 0.4 under the 1-shot setting and λ = 0.2 under the 16-shot setting. We train the CLIP for 25 epochs with a learning rate of 0.002, and the other hyperparameters (e.g., batch size = 32, SGD optimizer, and token length N = 16) are the same as those of CoOp [Zhou et al., 2022a]. (A hedged configuration and loss sketch based on these values follows the table.)
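
The Dataset Splits row describes a standard per-class few-shot protocol (1, 2, 4, or 16 shots per ImageNet-1K class for training, full original validation set for evaluation). The following is a minimal sketch of that sampling step only; the function and variable names are illustrative and not taken from the released SCT code.

```python
# Hedged sketch of the few-shot split protocol: pick `shots` training examples
# per class from the full ID training set; evaluation still uses the original
# validation set. Names here are illustrative placeholders.
import random
from collections import defaultdict
from typing import List, Sequence


def sample_few_shot_indices(labels: Sequence[int], shots: int, seed: int = 0) -> List[int]:
    """Return indices selecting `shots` training examples per class."""
    rng = random.Random(seed)
    per_class = defaultdict(list)
    for idx, y in enumerate(labels):
        per_class[y].append(idx)
    picked = []
    for y in sorted(per_class):
        picked.extend(rng.sample(per_class[y], min(shots, len(per_class[y]))))
    return picked


# Usage (hypothetical): build the 1-, 2-, 4-, and 16-shot subsets.
# train_labels = [...]  # labels of the full ImageNet-1K training set
# subsets = {k: sample_few_shot_indices(train_labels, k) for k in (1, 2, 4, 16)}
```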
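
The Experiment Setup row quotes concrete hyperparameters. The sketch below only wires those quoted values into a CoOp-style prompt-tuning configuration; the λ-weighted sum is a simplified, LoCoOp-style stand-in for the calibrated objective of Algorithm 1 (SCT), and `prompt_learner`, `id_logits`, and `ood_logits` are hypothetical placeholders rather than names from the released code.

```python
# Hedged sketch: quoted hyperparameters plus a simplified placeholder for the
# SCT training objective. The real method additionally applies self-calibrated
# modulating factors to both loss terms (see Algorithm 1 in the paper).
import torch
import torch.nn.functional as F

CONFIG = {
    "epochs": 25,               # quoted: 25 training epochs
    "lr": 0.002,                # quoted learning rate (SGD optimizer)
    "batch_size": 32,           # quoted batch size
    "n_ctx": 16,                # quoted prompt token length N
    "top_k": 200,               # quoted K for surrogate OOD feature extraction
    "lam": {1: 0.4, 16: 0.2},   # quoted λ per shot setting (1-shot / 16-shot)
}


def combined_loss(id_logits: torch.Tensor,
                  labels: torch.Tensor,
                  ood_logits: torch.Tensor,
                  lam: float) -> torch.Tensor:
    """ID cross-entropy plus λ-weighted entropy maximization on surrogate-OOD
    predictions (LoCoOp-style regularization); SCT further modulates both
    terms with sample-level calibration factors, omitted here."""
    id_loss = F.cross_entropy(id_logits, labels)
    probs = F.softmax(ood_logits, dim=-1)
    entropy = -(probs * probs.clamp_min(1e-12).log()).sum(dim=-1).mean()
    # Minimizing -entropy pushes surrogate-OOD predictions toward uniform.
    return id_loss - lam * entropy


# Usage (hypothetical): optimize only the learnable prompt context tokens.
# optimizer = torch.optim.SGD(prompt_learner.parameters(), lr=CONFIG["lr"])
# loss = combined_loss(id_logits, labels, ood_logits, CONFIG["lam"][1])
```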