Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Concept Embedding Models: Beyond the Accuracy-Explainability Trade-Off

Authors: Mateo Espinosa Zarlenga, Pietro Barbiero, Gabriele Ciravegna, Giuseppe Marra, Francesco Giannini, Michelangelo Diligenti, Zohreh Shams, Frederic Precioso, Stefano Melacci, Adrian Weller, Pietro Liò, Mateja Jamnik

NeurIPS 2022 | Venue PDF | LLM Run Details

Reproducibility Variable Result LLM Response
Research Type Experimental Our experiments demonstrate that Concept Embedding Models (1) attain better or competitive task accuracy w.r.t. standard neural models without concepts, (2) provide concept representations capturing meaningful semantics including and beyond their ground truth labels, (3) support test-time concept interventions whose effect in test accuracy surpasses that in standard concept bottleneck models, and (4) scale to real-world conditions where complete concept supervisions are scarce.
Researcher Affiliation Collaboration Mateo Espinosa Zarlenga (University of Cambridge); Pietro Barbiero (University of Cambridge); Gabriele Ciravegna (Université Côte d'Azur, Inria, CNRS, I3S, Maasai, Nice, France); Giuseppe Marra (KU Leuven); Francesco Giannini (University of Siena); Michelangelo Diligenti (University of Siena); Zohreh Shams (Babylon Health; University of Cambridge); Frederic Precioso (Université Côte d'Azur, Inria, CNRS, I3S, Maasai, Nice, France); Stefano Melacci (University of Siena); Adrian Weller (University of Cambridge; Alan Turing Institute); Pietro Liò (University of Cambridge); Mateja Jamnik (University of Cambridge)
Pseudocode No The paper describes the architecture and methods in prose and diagrams (e.g., Figure 2) but does not include formal pseudocode or an algorithm block.
Open Source Code Yes We uploaded a zip file with our code and documentation in the supplemental material and made our code available in a public repository: https://github.com/mateoespinosa/cem/
Open Datasets Yes Furthermore, we evaluate our methods on two real-world image tasks: the Caltech-UCSD Birds-200-2011 dataset (CUB, [16]), preprocessed as in [9], and the Large-scale CelebFaces Attributes dataset (CelebA, [25]).
Dataset Splits Yes For CUB, we use a 70/15/15 train/val/test split and for CelebA, an 80/10/10 train/val/test split.
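Splits like these can be reproduced by shuffling the example indices and cutting them at the stated fractions. The sketch below is a framework-agnostic illustration, not the authors' code; the `split_indices` helper name and the fixed seed are assumptions:

```python
import random

def split_indices(n, fracs=(0.70, 0.15, 0.15), seed=42):
    """Shuffle the indices 0..n-1 and partition them into
    train/val/test chunks according to the given fractions."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)  # fixed seed for a reproducible split
    n_train = round(fracs[0] * n)
    n_val = round(fracs[1] * n)
    train = idx[:n_train]
    val = idx[n_train:n_train + n_val]
    test = idx[n_train + n_val:]
    return train, val, test

# e.g. a 70/15/15 split over a dataset with 100 examples
train, val, test = split_indices(100)
```

In practice one would pass `len(dataset)` and feed the resulting index lists to the data loader of choice (e.g. via `torch.utils.data.Subset`).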
Hardware Specification Yes Computational resources: We used an NVIDIA DGX Station with 8 NVIDIA V100 GPUs and 256 GB of RAM.
Software Dependencies Yes We implemented our models in PyTorch [39] (version 1.10.0), using scikit-learn [40] (version 1.0.2) for the k-Medoids clustering algorithm.
Experiment Setup Yes All models were trained for 250 epochs using Adam [38] with a learning rate of 1e-3 and a batch size of 256.
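For reference, the Adam update rule (Kingma & Ba, [38]) used with the paper's learning rate of 1e-3 can be written out as below. This is an illustrative from-scratch re-implementation of the optimizer step, not the authors' training code; in practice one would simply use `torch.optim.Adam(model.parameters(), lr=1e-3)`:

```python
def adam_step(params, grads, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update over flat lists of scalar parameters and gradients."""
    out_p, out_m, out_v = [], [], []
    for p, g, mi, vi in zip(params, grads, m, v):
        mi = beta1 * mi + (1 - beta1) * g       # first-moment (mean) estimate
        vi = beta2 * vi + (1 - beta2) * g * g   # second-moment estimate
        m_hat = mi / (1 - beta1 ** t)           # bias-corrected moments
        v_hat = vi / (1 - beta2 ** t)
        out_p.append(p - lr * m_hat / (v_hat ** 0.5 + eps))
        out_m.append(mi)
        out_v.append(vi)
    return out_p, out_m, out_v

# Illustration: minimize f(x) = x**2 starting from x = 1.0,
# taking 250 steps to mirror the paper's 250 epochs.
params, m, v = [1.0], [0.0], [0.0]
for t in range(1, 251):
    grads = [2.0 * p for p in params]           # df/dx = 2x
    params, m, v = adam_step(params, grads, m, v, t)
```

Because Adam normalizes each step by the gradient's running second moment, every step here moves roughly `lr` toward the minimum, so 250 steps shrink `x` from 1.0 to about 0.75.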