Interpreting CLIP with Sparse Linear Concept Embeddings (SpLiCE)

Authors: Usha Bhalla, Alex Oesterling, Suraj Srinivas, Flavio Calmon, Himabindu Lakkaraju

NeurIPS 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our extensive experiments in Section 5 reveal that SpLiCE recovers highly sparse, interpretable representations with high performance on downstream tasks, while accurately capturing the semantics of the underlying inputs.
Researcher Affiliation | Academia | Usha Bhalla (Harvard University: Kempner Institute for the Study of Natural & Artificial Intelligence; School of Engineering and Applied Sciences), Alex Oesterling (Harvard University: School of Engineering and Applied Sciences), Suraj Srinivas (Harvard University: School of Engineering and Applied Sciences), Flavio P. Calmon (Harvard University: School of Engineering and Applied Sciences), Himabindu Lakkaraju (Harvard University: School of Engineering and Applied Sciences; Harvard Business School)
Pseudocode | No | The paper describes the ADMM algorithm and its iterates using mathematical equations (e.g., $w^{k+1} = \arg\min_w \big( f(w) + \frac{\rho}{2}\|w - z^k + u^k\|_2^2 \big)$), but it does not present them in a clearly labeled or formatted pseudocode or algorithm block. (A minimal sketch of these iterates appears below the table.)
Open Source Code | Yes | Code is provided at https://github.com/AI4LIFE-GROUP/SpLiCE.
Open Datasets | Yes | Datasets. We use CIFAR100 [56], MIT States [57], CelebA [58], MSCOCO [59], and ImageNet Val [60] for our experiments, with results for additional datasets in the Appendix (Section B.4).
Dataset Splits | Yes | To further verify that SpLiCE allows for identification and tracking of distribution shift, we study the Waterbirds dataset, which is known to have differently balanced train, validation, and test splits.
Hardware Specification | Yes | All experiments can be performed on a single A100 GPU to run fast inference with CLIP.
Software Dependencies | No | The paper mentions using 'sklearn' and 'PyTorch' for implementations (e.g., 'we use sklearn's [61] Lasso solver' and 'we implement the Alternating Direction Method of Multipliers (ADMM) algorithm in PyTorch'), but it does not specify exact version numbers for these software dependencies.
Experiment Setup | Yes | For all experiments involving concept decomposition, we use sklearn's [61] Lasso solver with a non-negativity flag and an $\ell_1$ penalty that results in solutions with $\ell_0$ norms of 5-20 (around 0.2-0.3 for most datasets)... In our experiments we set $\rho = 5$, and stop when the tolerances $\epsilon_{\mathrm{prim}} = \|x^{k+1} - z^{k+1}\|_2$ and $\epsilon_{\mathrm{dual}} = \|\rho(z^{k+1} - z^k)\|_2$ are less than $10^{-4}$. (A usage sketch of this Lasso setup also appears below the table.)
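
The paper reports implementing ADMM in PyTorch but, as noted in the Pseudocode row, gives no algorithm block. Below is a minimal sketch of the standard scaled-form ADMM iterates for a non-negative Lasso decomposition, using the paper's stated $\rho = 5$ and $10^{-4}$ primal/dual tolerances. The data term $f(w) = \frac{1}{2}\|Cw - x\|_2^2$ over a concept dictionary C, the function name `admm_nonneg_lasso`, and the interface are our assumptions for illustration, not the authors' released code.

```python
import torch

def admm_nonneg_lasso(C, x, lam, rho=5.0, tol=1e-4, max_iter=1000):
    """Scaled-form ADMM sketch for: min_w 0.5*||C w - x||^2 + lam*||w||_1, w >= 0.

    C: (d, c) hypothetical concept dictionary; x: (d,) embedding to decompose.
    rho=5 and the 1e-4 residual tolerances follow the paper's stated setup.
    """
    d, c = C.shape
    # Cache the Cholesky factor used by every w-update.
    A = C.T @ C + rho * torch.eye(c, dtype=C.dtype)
    A_chol = torch.linalg.cholesky(A)
    Ctx = C.T @ x
    z = torch.zeros(c, dtype=C.dtype)
    u = torch.zeros(c, dtype=C.dtype)
    for _ in range(max_iter):
        # w-update: ridge-regularized least squares, solved in closed form.
        rhs = (Ctx + rho * (z - u)).unsqueeze(1)
        w = torch.cholesky_solve(rhs, A_chol).squeeze(1)
        # z-update: non-negative soft-thresholding (prox of lam*||.||_1 + I{z>=0}).
        z_old = z
        z = torch.clamp(w + u - lam / rho, min=0.0)
        # u-update: scaled dual ascent.
        u = u + w - z
        # Primal/dual residual stopping rule from the paper.
        eps_prim = torch.linalg.norm(w - z)
        eps_dual = torch.linalg.norm(rho * (z - z_old))
        if eps_prim < tol and eps_dual < tol:
            break
    return z  # sparse, non-negative concept weights
```

Caching the Cholesky factor of $C^\top C + \rho I$ turns every w-update into a cheap triangular solve, which is what makes ADMM attractive here relative to re-running a generic solver from scratch at each iterate.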
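
The Experiment Setup row mentions sklearn's Lasso solver with a non-negativity flag; in scikit-learn this corresponds to `Lasso(positive=True)`. The snippet below is a usage sketch only: the dictionary size, the random data, and the exact alpha are illustrative placeholders, with alpha chosen in the 0.2-0.3 penalty range the paper reports for most datasets.

```python
import numpy as np
from sklearn.linear_model import Lasso

# Hypothetical shapes: 10,000 concept vectors in a 512-dim CLIP embedding space.
rng = np.random.default_rng(0)
concepts = rng.standard_normal((10_000, 512))   # concept dictionary (rows = concepts)
embedding = rng.standard_normal(512)            # a CLIP embedding to decompose

# Non-negative Lasso via sklearn's positive=True flag. The paper tunes the
# l1 penalty (around 0.2-0.3 for most datasets) so solutions have l0 norms
# of roughly 5-20; alpha=0.25 here is an illustrative value in that range.
solver = Lasso(alpha=0.25, positive=True, fit_intercept=False)
solver.fit(concepts.T, embedding)               # design matrix: (512, 10_000)

weights = solver.coef_                          # sparse, non-negative concept weights
print("l0 norm of decomposition:", np.count_nonzero(weights))
```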