Concept-level Debugging of Part-Prototype Networks

Authors: Andrea Bontempelli, Stefano Teso, Katya Tentori, Fausto Giunchiglia, Andrea Passerini

ICLR 2023 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our experimental evaluation shows that ProtoPDebug outperforms state-of-the-art debuggers for a fraction of the annotation cost. An online experiment with laypeople confirms the simplicity of the feedback requested from users and the effectiveness of the collected feedback for learning confounder-free part-prototypes.
Researcher Affiliation | Academia | Andrea Bontempelli (1), Stefano Teso (2,1), Katya Tentori (2,3), Fausto Giunchiglia (1), Andrea Passerini (1); (1) DISI, (2) CIMeC, (3) DIPSCO, University of Trento; name.surname@unitn.it
Pseudocode | Yes | Algorithm 1: A ProtoPDebug debugging session; f is a ProtoPNet trained on data set D. (A minimal sketch of such a session appears after this table.)
Open Source Code | Yes | The full experimental setup is published on GitHub at https://github.com/abonte/protopdebug and in the Supplementary Material, together with additional implementation details.
Open Datasets | Yes | We modified the CUB200 data set (Wah et al., 2011)... The COVID images are processed using (cod, 2021a). The images are downloaded from the GitHub-COVID repository (Cohen et al., 2020), the ChestX-ray14 repository (Wang et al., 2017), PadChest (Bustos et al., 2020) and BIMCV-COVID19+ (Vayá et al., 2020).
Dataset Splits | No | Table 1 reports statistics for all data sets used in our experiments, including the number of training and test examples, training examples used for the visualization step, and classes. A separate validation split is not explicitly provided.
Hardware Specification | Yes | All experiments were implemented in Python 3 using PyTorch (Paszke et al., 2019) and run on a machine with two Quadro RTX 5000 GPUs.
Software Dependencies | No | The paper mentions 'Python 3' and 'PyTorch (Paszke et al., 2019)' but does not provide specific version numbers for PyTorch or other key software libraries.
Experiment Setup | Yes | The values of λ_for and a were set to 100 and 5, respectively, and λ_IAIA to 0.001... The embedding layers were implemented using a pre-trained VGG-16, allocating two prototypes for each class. The training batch size is set to 20 for the experiments on the CUB5box and COVID data sets, and 128 on CUB5nat. The learning rate of the prototype layer was scaled by 0.15 every 4 epochs. The width of the activation function (Eq. 1 in the main paper) used in ℓ_for was set to ϵ = 10^-8 in all experiments. For the COVID data set, λ_cls, λ_sep, λ_for and λ_rem were set to 0.5, 0.08, 200, and 0, respectively, for the first two debugging rounds. In the last round, λ_cls and λ_sep were decreased to 25% of the above values and λ_rem increased to 0.01. (These values are gathered into a reference config in the second sketch after this table.)
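To give the shape of the Algorithm 1 session referenced above without opening the paper, here is a minimal, self-contained sketch of one ProtoPDebug-style debugging round. It is not the authors' implementation (that lives in the linked repository): the toy tensor sizes, the log_activation helper, and the representation of user feedback as a list of forbidden concept embeddings are illustrative assumptions; only the forgetting-loss idea and the λ_for = 100 weight come from the text above.

```python
import torch

def log_activation(z, p, eps=1e-8):
    # ProtoPNet-style similarity between a patch embedding z and a prototype p;
    # eps plays the role of the activation "width" from the setup row (assumed form).
    d2 = ((z - p) ** 2).sum(-1)
    return torch.log((d2 + 1.0) / (d2 + eps))

torch.manual_seed(0)
# Toy stand-ins: 2 classes x 2 prototypes each, 16-dim embeddings (made-up sizes).
prototypes = torch.randn(4, 16, requires_grad=True)
# "Forbidden concepts": embeddings of the confounders the user flagged while
# inspecting the prototype visualizations during the debugging session.
forbidden = [torch.randn(16), torch.randn(16)]
opt = torch.optim.Adam([prototypes], lr=1e-3)

for step in range(200):
    # Forgetting loss: the largest activation of any prototype on any forbidden
    # concept; driving it down makes the model "forget" the confounders.
    l_for = torch.stack([log_activation(c, p)
                         for c in forbidden for p in prototypes]).max()
    # The full objective would also include ProtoPNet's cross-entropy, cluster
    # and separation terms, plus the remembering loss on user-validated concepts.
    loss = 100.0 * l_for  # lambda_for = 100, as reported in the setup row
    opt.zero_grad()
    loss.backward()
    opt.step()
```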
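Since the setup row scatters its hyperparameters across several sentences, the following gathers them into one plain-Python reference dict. The key names are our own shorthand, not the configuration schema of the protopdebug repository; the values are exactly those quoted above.

```python
# Hyperparameters from the "Experiment Setup" row, collected in one place.
# Keys are illustrative shorthand, not the repository's actual config schema.
SETUP = {
    "backbone": "vgg16-pretrained",        # embedding layers
    "prototypes_per_class": 2,
    "batch_size": {"CUB5box": 20, "COVID": 20, "CUB5nat": 128},
    "prototype_lr_schedule": {"gamma": 0.15, "every_epochs": 4},
    "activation_width_eps": 1e-8,          # width of Eq. 1's activation in l_for
    "lambda_for": 100,
    "a": 5,
    "lambda_iaia": 0.001,                  # weight of the IAIA-BL baseline loss
    "covid_rounds_1_2": {"lambda_cls": 0.5, "lambda_sep": 0.08,
                         "lambda_for": 200, "lambda_rem": 0.0},
    # Last round: lambda_cls and lambda_sep cut to 25% of the values above.
    "covid_round_3": {"lambda_cls": 0.125, "lambda_sep": 0.02,
                      "lambda_for": 200, "lambda_rem": 0.01},
}
```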