Auxiliary Losses for Learning Generalizable Concept-based Models
Authors: Ivaxi Sheth, Samira Ebrahimi Kahou
NeurIPS 2023 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | This paper presents extensive experiments on real-world datasets for image classification tasks, namely CUB, AwA2, CelebA and TIL. We also study the performance of coop-CBM models under various distributional shift settings. We show that our proposed method achieves higher accuracy in all distributional shift settings even compared to the black-box models with the highest concept accuracy. |
| Researcher Affiliation | Academia | Ivaxi Sheth, CISPA Helmholtz Center for Information Security, ivaxi.sheth@cispa.de; Samira Ebrahimi Kahou, École de technologie supérieure, Mila, CIFAR AI Chair, samira.ebrahimi-kahou@etsmtl.ca |
| Pseudocode | Yes | Algorithm 1 Intervention selector Pseudocode |
| Open Source Code | Yes | Our codebase is available at https://github.com/ivaxi0s/coop-cbm and is built upon open-source repos [27, 41]. |
| Open Datasets | Yes | We use the Caltech-UCSD Birds-200-2011 (CUB) [55] dataset for the task of bird identification. We additionally use the Animals with Attributes 2 (AwA2) [57] dataset for the task of animal classification. We use all of the subsets of the Tumor-Infiltrating Lymphocytes (TIL) [42] dataset for cancer cell classification. For the m-CelebA [31] dataset, we train using a batch size of 64 with the Adam optimizer with 0.9 momentum and a learning rate of 5 × 10⁻³ for 500 epochs. The feature extractor was Inception V3 [50] as a concept encoder model. |
| Dataset Splits | Yes | We use a traditional 70%-10%-20% random split for training, validation, and testing datasets. |
| Hardware Specification | Yes | We trained on Linux-based clusters mainly on V100 GPUs and partially on A100 GPU. |
| Software Dependencies | No | The paper mentions using specific models (e.g., Inception V3, ViT) and optimizers (SGD, Adam) but does not provide specific version numbers for the software frameworks or libraries used for implementation (e.g., PyTorch, TensorFlow versions). |
| Experiment Setup | Yes | For the CUB [55] dataset, we trained using a batch size of 128 with the SGD optimizer with 0.9 momentum and a learning rate of 10⁻². The feature extractor was Inception V3 [50] as a concept encoder model. ... Across all of the models for tasks, we use a weight decay factor of 5 × 10⁻⁵ and scale the learning rate by a factor of 0.1 if no improvement has been seen in validation loss for the last 15 epochs during training. We also train using an early stopping mechanism, i.e., if the validation loss does not improve for 200 epochs, we stop training. |
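
The "Experiment Setup" row above amounts to a concrete training schedule (SGD with momentum 0.9, weight decay 5 × 10⁻⁵, learning rate reduced by 0.1 after 15 epochs without validation improvement, early stopping after 200 stale epochs). The sketch below illustrates that schedule. It is a hedged reconstruction, not the authors' code: the framework (PyTorch) is an assumption since the paper names no library versions, and a tiny stand-in model and synthetic tensors replace the Inception V3 concept encoder and the CUB data so the example stays self-contained.

```python
import torch
import torch.nn as nn

# Minimal sketch of the reported schedule: SGD (momentum 0.9, weight decay 5e-5),
# LR scaled by 0.1 after 15 epochs without validation-loss improvement,
# early stopping after 200 stale epochs. Model and data are stand-ins.
torch.manual_seed(0)
model = nn.Linear(16, 4)                                  # stand-in for the concept encoder
x_tr, y_tr = torch.randn(128, 16), torch.randint(0, 4, (128,))   # one 128-sample "batch"
x_val, y_val = torch.randn(32, 16), torch.randint(0, 4, (32,))
criterion = nn.CrossEntropyLoss()

optimizer = torch.optim.SGD(model.parameters(), lr=1e-2,
                            momentum=0.9, weight_decay=5e-5)
# Scale the learning rate by 0.1 when validation loss plateaus for 15 epochs.
scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(optimizer, mode="min",
                                                       factor=0.1, patience=15)

best_val, stale = float("inf"), 0
for epoch in range(1000):
    model.train()
    optimizer.zero_grad()
    loss = criterion(model(x_tr), y_tr)
    loss.backward()
    optimizer.step()

    model.eval()
    with torch.no_grad():
        val_loss = criterion(model(x_val), y_val).item()
    scheduler.step(val_loss)

    # Early stopping: halt if validation loss has not improved for 200 epochs.
    if val_loss < best_val:
        best_val, stale = val_loss, 0
    else:
        stale += 1
        if stale >= 200:
            break
```

In a full reproduction the stand-in model would be replaced by the Inception V3 concept encoder and the synthetic tensors by the 70%-10%-20% CUB splits described in the "Dataset Splits" row.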