LG-CAV: Train Any Concept Activation Vector with Language Guidance
Authors: Qihan Huang, Jie Song, Mengqi Xue, Haofei Zhang, Bingde Hu, Huiqiong Wang, Hao Jiang, Xingen Wang, Mingli Song
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments on four datasets across nine architectures demonstrate that LG-CAV achieves significantly superior quality to previous CAV methods given any concept, and our model correction method achieves state-of-the-art performance compared to existing concept-based methods. |
| Researcher Affiliation | Collaboration | Qihan Huang1, Jie Song1, Mengqi Xue2, Haofei Zhang1, Bingde Hu1, Huiqiong Wang3, Hao Jiang4, Xingen Wang1,5, Mingli Song1 (1 Zhejiang University, 2 Hangzhou City University, 3 Ningbo Innovation Center, Zhejiang University, 4 Alibaba Group, 5 Bangsheng Technology Co., Ltd.) {qh.huang,sjie,haofeizhang,tonyhu,huiqiong_wang,newroot,brooksong}@zju.edu.cn, mqxue@zucc.edu.cn, aoshu.jh@alibaba-inc.com |
| Pseudocode | No | The paper describes its methods through text and diagrams (e.g., Figure 3) but does not include formal pseudocode blocks or algorithms. |
| Open Source Code | Yes | Our code is available at https://github.com/hqh QAQ/LG-CAV. |
| Open Datasets | Yes | We estimate the quality of LG-CAV on the Broden dataset [2] (a popular concept-based dataset with 63,305 images for 1197 visual concepts). ... We employ our model correction method on three representative datasets: ImageNet [5] (large-scale dataset), CUB-200-2011 [43] (a popular dataset used by many concept-based methods), and CIFAR-100 [18] (small-scale dataset). |
| Dataset Splits | No | The paper mentions using training and test sets but does not explicitly detail a separate validation set split or its use for hyperparameter tuning. |
| Hardware Specification | No | The paper states in the NeurIPS checklist that computer resources are provided in the appendix, but the appendix (B.6) only discusses standard deviations and does not specify hardware such as CPU/GPU models or memory. |
| Software Dependencies | No | We follow the original CAV work [16] to train CAVs for the target models pre-trained on ImageNet (from the open-sourced PyTorch package [29]). |
| Experiment Setup | Yes | Parameters. To simulate the absence of images for training CAVs in reality, we set the number of positive samples (Pc) and negative samples (Nc) to be 10, and the remaining images will be used as the test set. The threshold ϵ for determining a positively-related concept-class pair is 0.6. For each CAV method, we use the SGD optimizer [34] to train the CAV for 10 epochs with a learning rate of 1e-3. R (the number of probe images) is set to 1000. The loss function adopted here is Ltotal since Pc and Nc are available. ... We use the SGD optimizer to train the final classification layer for 20 epochs with a learning rate of 1e-3. (A minimal training sketch follows the table.) |
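For context, the quoted setup amounts to fitting a linear probe on intermediate activations of the target model: 10 positive and 10 negative samples, SGD at lr 1e-3 for 10 epochs, with the CAV taken as the probe's weight direction, following the original CAV formulation [16]. The sketch below is a hypothetical reconstruction of that loop, not the authors' code: the names `train_cav`, `pos_acts`, and `neg_acts` are our own, extracting activations from the target model is not shown, and the paper's full loss Ltotal (which includes language-guidance terms) is replaced here by a plain binary cross-entropy loss.

```python
import torch
import torch.nn as nn

def train_cav(pos_acts: torch.Tensor, neg_acts: torch.Tensor,
              epochs: int = 10, lr: float = 1e-3) -> torch.Tensor:
    """Train a CAV as a linear probe on activations (hypothetical sketch).

    pos_acts / neg_acts: tensors of shape (n, d) holding activations of the
    target model at the chosen layer for positive / negative concept samples
    (n = 10 each in the quoted setup).
    """
    d = pos_acts.shape[1]
    probe = nn.Linear(d, 1)  # the CAV is the weight of this linear probe
    opt = torch.optim.SGD(probe.parameters(), lr=lr)
    loss_fn = nn.BCEWithLogitsLoss()  # stand-in for the paper's Ltotal

    x = torch.cat([pos_acts, neg_acts])  # (2n, d)
    y = torch.cat([torch.ones(len(pos_acts)), torch.zeros(len(neg_acts))])

    for _ in range(epochs):
        opt.zero_grad()
        loss = loss_fn(probe(x).squeeze(1), y)
        loss.backward()
        opt.step()

    # Return the normalized probe weight as the concept activation vector.
    with torch.no_grad():
        return probe.weight.squeeze(0) / probe.weight.norm()
```

The model-correction step quoted above follows the same loop shape, except that the trained parameters are the target model's final classification layer and training runs for 20 epochs at the same learning rate.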