Post-hoc Concept Bottleneck Models

Authors: Mert Yuksekgonul, Maggie Wang, James Zou

ICLR 2023

Reproducibility assessment (variable, assessed result, and the LLM's supporting response):

Research Type: Experimental
  "We evaluate the PCBM and PCBM-h in challenging image classification and medical settings, demonstrating several use cases for PCBMs. We further address practical concerns and show that PCBMs can be used without a loss in the original model performance. We used the following datasets to systematically evaluate the PCBM and PCBM-h: CIFAR10, CIFAR100 (Krizhevsky et al., 2009) ... In Table 1, we report results over these five datasets."

Researcher Affiliation: Academia
  "Mert Yuksekgonul, Maggie Wang, James Zou (Stanford University; {merty,maggiewang,jamesz}@stanford.edu)"

Pseudocode: No
  No explicit pseudocode or algorithm blocks are provided; the methodology is described in prose and mathematical equations (e.g., Equations 1 and 2). (A hedged reconstruction of those equations appears after the table.)

Open Source Code: Yes
  "The code for our paper can be found in https://github.com/mertyg/post-hoc-cbm."

Open Datasets: Yes
  "We used the following datasets to systematically evaluate the PCBM and PCBM-h: CIFAR10, CIFAR100 (Krizhevsky et al., 2009)... CUB (Wah et al., 2011)... HAM10000 (Tschandl et al., 2018)... SIIM-ISIC (Rotemberg et al., 2021)."

Dataset Splits: Yes
  "We tune the regularization strength on a subset of the training set that is kept as a validation set." (A sketch of this tuning protocol appears after the table.)

Hardware Specification: Yes
  "We trained all our models on a single NVIDIA Titan Xp GPU."

Software Dependencies: No
  "PCBMs are fitted using scikit-learn's SGDClassifier class, with 5000 maximum steps. Hybrid parts are trained with PyTorch, where we used Adam as the optimizer with 0.01 learning rate, with 0.01 L2 regularization on the residual classifier weights, and trained for 10 epochs." (Libraries are named but versions are not pinned; sketches of both fitting stages appear after the table.)

Experiment Setup: Yes
  "Hyperparameters: In all of our experiments the Elastic Net sparsity ratio parameter was α = 0.99. We trained all our models on a single NVIDIA Titan Xp GPU. All of the models were trained for a total number of 10 epochs. We tune the regularization strength on a subset of the training set that is kept as a validation set. PCBMs are fitted using scikit-learn's SGDClassifier class, with 5000 maximum steps. Hybrid parts are trained with PyTorch, where we used Adam as the optimizer with 0.01 learning rate, with 0.01 L2 regularization on the residual classifier weights, and trained for 10 epochs."
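
The Pseudocode row points to Equations 1 and 2 rather than an algorithm block. As a hedged reconstruction of the structure those equations describe (notation paraphrased, not quoted from the paper): concept scores come from projecting the backbone embedding onto each learned concept vector, the interpretable predictor is a sparse linear head over those scores, and PCBM-h adds a residual predictor on the raw embedding.

```latex
% Hedged paraphrase of the model structure; notation reconstructed, not verbatim.
\[
  f_C^{(i)}(x) \;=\; \frac{\langle f(x),\, c_i\rangle}{\lVert c_i\rVert_2^2}
  \qquad \text{(score of concept $i$: projection of the embedding $f(x)$ onto concept vector $c_i$)}
\]
\[
  \min_{g}\; \sum_{(x,y)} \mathcal{L}\big(g(f_C(x)),\, y\big)
  \;+\; \lambda\Big(\alpha\,\lVert w\rVert_1 + (1-\alpha)\,\lVert w\rVert_2^2\Big)
  \qquad \text{(sparse linear head $g$ with elastic-net penalty on its weights $w$)}
\]
\[
  \hat{y}_{\mathrm{PCBM\text{-}h}}(x) \;=\; g\big(f_C(x)\big) + r\big(f(x)\big)
  \qquad \text{(PCBM-h: residual predictor $r$ on the raw embedding)}
\]
```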
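The Dataset Splits and Experiment Setup rows together describe the interpretable-layer fit: hold out part of the training set, tune the regularization strength on it, and fit an elastic-net SGDClassifier with 5000 maximum steps. Below is a minimal sketch of that protocol; the synthetic arrays, the candidate grid, the logistic loss, and the mapping of the paper's sparsity ratio α = 0.99 to scikit-learn's l1_ratio are all assumptions, not taken from the released code.

```python
import numpy as np
from sklearn.linear_model import SGDClassifier
from sklearn.model_selection import train_test_split

# Stand-in data: projected concept scores and class labels (shapes are illustrative).
rng = np.random.default_rng(0)
concept_scores = rng.normal(size=(500, 20))
labels = rng.integers(0, 10, size=500)

# "We tune the regularization strength on a subset of the training set
#  that is kept as a validation set."
X_tr, X_val, y_tr, y_val = train_test_split(concept_scores, labels,
                                            test_size=0.1, random_state=0)

best_strength, best_acc = None, -1.0
for reg_strength in (1e-5, 1e-4, 1e-3, 1e-2):  # hypothetical candidate grid
    clf = SGDClassifier(
        loss="log_loss",      # assumption: logistic loss; the text does not name the loss
        penalty="elasticnet",
        l1_ratio=0.99,        # assumed mapping of the paper's sparsity ratio alpha = 0.99
        alpha=reg_strength,   # sklearn's alpha is the regularization strength being tuned
        max_iter=5000,        # "5000 maximum steps"
        random_state=0,
    )
    clf.fit(X_tr, y_tr)
    acc = clf.score(X_val, y_val)
    if acc > best_acc:
        best_strength, best_acc = reg_strength, acc

# Refit on the full training data with the selected regularization strength.
pcbm = SGDClassifier(loss="log_loss", penalty="elasticnet", l1_ratio=0.99,
                     alpha=best_strength, max_iter=5000, random_state=0)
pcbm.fit(concept_scores, labels)
```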
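The hybrid (PCBM-h) details translate to a short PyTorch loop. This sketch assumes the residual predictor is a single linear layer on the backbone embeddings whose output is added to the frozen interpretable logits, and that the stated 0.01 L2 regularization corresponds to Adam's weight_decay; the tensors and shapes are stand-ins.

```python
import torch

torch.manual_seed(0)
n, d, k = 500, 512, 10                    # samples, embedding dim, classes (stand-ins)
emb = torch.randn(n, d)                   # backbone embeddings f(x)
pcbm_logits = torch.randn(n, k)           # frozen logits from the interpretable layer g(f_C(x))
labels = torch.randint(0, k, (n,))

residual = torch.nn.Linear(d, k)          # assumed form of the residual predictor r
# "Adam as the optimizer with 0.01 learning rate, with 0.01 L2 regularization
#  on the residual classifier weights" (weight_decay taken as that L2 term).
opt = torch.optim.Adam(residual.parameters(), lr=0.01, weight_decay=0.01)
loss_fn = torch.nn.CrossEntropyLoss()

for epoch in range(10):                   # "trained for 10 epochs"
    opt.zero_grad()
    logits = pcbm_logits + residual(emb)  # hybrid prediction: g(f_C(x)) + r(f(x))
    loss = loss_fn(logits, labels)
    loss.backward()
    opt.step()
```

Training only the residual leaves the interpretable layer untouched, which is consistent with the paper's claim that PCBMs can be used without a loss in the original model's performance.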