Auxiliary Losses for Learning Generalizable Concept-based Models

Authors: Ivaxi Sheth, Samira Ebrahimi Kahou

NeurIPS 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | This paper presents extensive experiments on real-world datasets for image classification tasks, namely CUB, AwA2, CelebA and TIL. We also study the performance of coop-CBM models under various distributional shift settings. We show that our proposed method achieves higher accuracy in all distributional shift settings even compared to the black-box models with the highest concept accuracy.
Researcher Affiliation | Academia | Ivaxi Sheth, CISPA Helmholtz Center for Information Security, ivaxi.sheth@cispa.de; Samira Ebrahimi Kahou, École de technologie supérieure, Mila, CIFAR AI Chair, samira.ebrahimi-kahou@etsmtl.ca
Pseudocode | Yes | Algorithm 1: Intervention Selector pseudocode
Open Source Code | Yes | Our codebase is available at https://github.com/ivaxi0s/coop-cbm and is built upon open-source repos [27, 41].
Open Datasets | Yes | We use the Caltech-UCSD Birds-200-2011 (CUB) [55] dataset for the task of bird identification. We additionally use the Animals with Attributes 2 (AwA2) [57] dataset for the task of animal classification. We use all of the subsets of the Tumor-Infiltrating Lymphocytes (TIL) [42] dataset for cancer cell classification. For the m-CelebA [31] dataset, we train using a batch size of 64 with the Adam optimizer with 0.9 momentum and a learning rate of 5 × 10^-3 for 500 epochs. The feature extractor was Inception V3 [50] as the concept encoder model.
Dataset Splits | Yes | We use a traditional 70%-10%-20% random split for training, validation, and testing datasets. (A minimal split sketch follows the table.)
Hardware Specification | Yes | We trained on Linux-based clusters, mainly on V100 GPUs and partially on A100 GPUs.
Software Dependencies | No | The paper mentions specific models (e.g., Inception V3, ViT) and optimizers (SGD, Adam) but does not provide version numbers for the software frameworks or libraries used for implementation (e.g., PyTorch or TensorFlow versions).
Experiment Setup | Yes | For the CUB [55] dataset, we trained using a batch size of 128 with the SGD optimizer with 0.9 momentum and a learning rate of 10^-2. The feature extractor was Inception V3 [50] as the concept encoder model. ... Across all of the models and tasks, we use a weight decay of 5 × 10^-5 and scale the learning rate by a factor of 0.1 if no improvement has been seen in the validation loss for the last 15 epochs during training. We also train with an early stopping mechanism, i.e., if the validation loss does not improve for 200 epochs, we stop training. (A hedged training-setup sketch follows the table.)
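
The quoted 70%-10%-20% random split can be reproduced in a few lines; the sketch below is an assumption rather than the authors' code (their repo may split differently), and the `split_dataset` helper and fixed seed are illustrative names introduced here.

```python
import torch
from torch.utils.data import random_split

def split_dataset(dataset, seed=0):
    """70%/10%/20% random split into train/val/test, as quoted above.
    Using random_split and a fixed seed is an assumption, not the authors' code."""
    n = len(dataset)
    n_train, n_val = int(0.7 * n), int(0.1 * n)
    n_test = n - n_train - n_val  # remainder goes to the test split
    generator = torch.Generator().manual_seed(seed)  # reproducible shuffling
    return random_split(dataset, [n_train, n_val, n_test], generator=generator)
```

Read together, the Open Datasets and Experiment Setup rows pin down most of the optimizer and scheduling choices. Below is a minimal sketch of the CUB configuration, assuming a PyTorch implementation (the paper does not state the framework, per the Software Dependencies row); `build_training_setup`, `fit`, and the caller-supplied `train_one_epoch`/`evaluate` callables are hypothetical names, and the torchvision Inception V3 backbone is an assumption. For m-CelebA the quoted settings swap in Adam with a learning rate of 5 × 10^-3 and a batch size of 64.

```python
from torch import nn, optim
from torchvision import models

def build_training_setup(num_outputs):
    """Optimizer and scheduler matching the quoted CUB hyperparameters:
    SGD, momentum 0.9, lr 1e-2, weight decay 5e-5, LR x0.1 after 15 epochs
    without validation improvement. The torchvision backbone is an assumption."""
    model = models.inception_v3(weights="IMAGENET1K_V1")
    model.fc = nn.Linear(model.fc.in_features, num_outputs)  # replace the head (assumed)
    optimizer = optim.SGD(model.parameters(), lr=1e-2, momentum=0.9, weight_decay=5e-5)
    scheduler = optim.lr_scheduler.ReduceLROnPlateau(
        optimizer, mode="min", factor=0.1, patience=15
    )
    return model, optimizer, scheduler

def fit(model, optimizer, scheduler, train_one_epoch, evaluate,
        max_epochs=1000, patience=200):
    """Early stopping on validation loss, as described in the paper.
    `train_one_epoch(model, optimizer)` and `evaluate(model) -> val_loss`
    are caller-supplied callables (hypothetical, not from the paper)."""
    best_val, stale = float("inf"), 0
    for epoch in range(max_epochs):          # max_epochs is an assumed upper bound
        train_one_epoch(model, optimizer)
        val_loss = evaluate(model)
        scheduler.step(val_loss)             # decays LR when validation loss plateaus
        if val_loss < best_val:
            best_val, stale = val_loss, 0
        else:
            stale += 1
            if stale >= patience:            # stop after 200 epochs without improvement
                break
    return best_val
```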
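Both sketches only restate the hyperparameters quoted in the table; for the exact training loop and loss terms, the authors' repository at https://github.com/ivaxi0s/coop-cbm remains the authoritative reference.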