FairerCLIP: Debiasing CLIP's Zero-Shot Predictions using Functions in RKHSs

Authors: Sepehr Dehdashtian, Lan Wang, Vishnu Boddeti

ICLR 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We evaluate FairerCLIP on datasets with spurious correlations and intrinsic dependencies and compare it to several existing baselines.
Researcher Affiliation | Academia | Michigan State University, {sepehr, wanglan3, vishnu}@msu.edu
Pseudocode | Yes | Details of training FairerCLIP are presented in Algorithm 1.
Open Source Code | No | The paper does not provide an explicit statement about open-sourcing code for the described methodology, nor does it include a link to a code repository.
Open Datasets | Yes | We evaluate FairerCLIP on an assortment of classification tasks across many datasets: Waterbirds (Sagawa et al., 2019), which contains spurious correlations between the type of bird and the background of the image; several settings of CelebA (Liu et al., 2015), which contains more than 200,000 face images of celebrities in the wild annotated with 40 binary attributes and exhibits both spurious correlations and intrinsic dependencies among its attributes; the FairFace dataset (Karkkainen & Joo, 2021), which contains more than 108,000 face images from seven race groups (White, Black, Indian, East Asian, Southeast Asian, Middle Eastern, and Latino) collected from the YFCC-100M Flickr dataset and labeled with race, sex, and age group; and the Chicago Face Database (CFD) (Ma et al., 2015), which includes face images with annotations such as facial attributes, ethnicity, age, and sex.
Dataset Splits | Yes | For CelebA and Waterbirds, we follow their official train/val/test splits and only use ground-truth labels from the val split for hyperparameter tuning. For CFD, since there is no official dataset split, we randomly split it with a ratio of 0.5/0.1/0.4 for train/val/test. (A minimal split sketch follows the table.)
Hardware Specification | No | The paper states: 'The underlying model for this experiment is CLIP ViT-L/14, and all the numbers are measured on the same machine.' (Section 4.3). However, it does not specify the particular model of GPU, CPU, or any other specific hardware component used.
Software Dependencies | No | The paper does not provide specific version numbers for any software dependencies, programming languages, or libraries used in the experiments.
Experiment Setup | Yes | Following the standard setting of Zhang & Ré (2022), we use the val split to decide the optimal τ, τz, and the dimensionality of the random Fourier features (RFF). For CelebA, the optimal τ, τz, and RFF dimension are 0.8, 0.5, and 8000; for Waterbirds, 0.7, 0.7, and 3000; and for CFD, 0.6, 0.3, and 1000. In the scenario where group labels are not available, we follow the same setup as the scenario where group labels of the val split are available. For all the above experiments under different settings, we set the representation dimensionality r to c − 1, where c is the number of classes of the downstream target task. (A hedged configuration sketch also follows the table.)
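As referenced in the Dataset Splits row, below is a minimal sketch of the 0.5/0.1/0.4 train/val/test split described for CFD. The dataframe name `cfd_df`, the use of pandas, and the fixed random seed are illustrative assumptions; the paper does not specify how the shuffle is implemented.

```python
# Sketch of a 0.5/0.1/0.4 random train/val/test split (CFD has no official split).
import numpy as np
import pandas as pd

def random_split(df: pd.DataFrame, ratios=(0.5, 0.1, 0.4), seed=0):
    """Shuffle rows once and cut them into train/val/test partitions."""
    rng = np.random.default_rng(seed)       # seed is an assumption for reproducibility
    idx = rng.permutation(len(df))
    n_train = int(ratios[0] * len(df))
    n_val = int(ratios[1] * len(df))
    train = df.iloc[idx[:n_train]]
    val = df.iloc[idx[n_train:n_train + n_val]]
    test = df.iloc[idx[n_train + n_val:]]   # remaining ~0.4 of the rows
    return train, val, test

# Usage with a hypothetical dataframe of CFD image paths and labels:
# train_df, val_df, test_df = random_split(cfd_df)
```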
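As referenced in the Experiment Setup row, the sketch below collects the reported per-dataset hyperparameters and pairs them with a generic random Fourier feature map for a Gaussian kernel. Only τ, τz, the RFF dimensionality, and r = c − 1 come from the paper; the kernel bandwidth `sigma`, the seed, and the cosine-feature construction are standard RFF conventions assumed here, not the authors' exact implementation.

```python
# Reported hyperparameters (from the paper) plus an assumed Gaussian-kernel RFF map.
import numpy as np

HPARAMS = {
    "CelebA":     {"tau": 0.8, "tau_z": 0.5, "rff_dim": 8000},
    "Waterbirds": {"tau": 0.7, "tau_z": 0.7, "rff_dim": 3000},
    "CFD":        {"tau": 0.6, "tau_z": 0.3, "rff_dim": 1000},
}

def rff_map(x: np.ndarray, rff_dim: int, sigma: float = 1.0, seed: int = 0) -> np.ndarray:
    """Map features x of shape (n, d) to random Fourier features of shape (n, rff_dim)."""
    rng = np.random.default_rng(seed)
    d = x.shape[1]
    w = rng.normal(0.0, 1.0 / sigma, size=(d, rff_dim))   # random frequencies (assumed bandwidth)
    b = rng.uniform(0.0, 2 * np.pi, size=rff_dim)          # random phases
    return np.sqrt(2.0 / rff_dim) * np.cos(x @ w + b)

def repr_dim(num_classes: int) -> int:
    """Representation dimensionality r = c - 1, as stated in the paper."""
    return num_classes - 1
```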