Human-Driven FOL Explanations of Deep Learning
Authors: Gabriele Ciravegna, Francesco Giannini, Marco Gori, Marco Maggini, Stefano Melacci
IJCAI 2020 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Different typologies of explanations are evaluated in distinct experiments, showing that the proposed approach discovers new knowledge and can improve the classifier performance. Experiments are collected in Section 4 and Section 5 concludes the paper. |
| Researcher Affiliation | Academia | Gabriele Ciravegna (1,2), Francesco Giannini (2), Marco Gori (2,3), Marco Maggini (2) and Stefano Melacci (2). (1) Department of Information Engineering, University of Florence, Florence, Italy; (2) SAILab, Department of Information Engineering and Mathematics, University of Siena, Siena, Italy; (3) Maasai, Université Côte d'Azur, Nice, France |
| Pseudocode | No | The paper describes its procedure in text but does not include any explicitly labeled pseudocode or algorithm blocks. |
| Open Source Code | No | The paper provides links to the datasets used (PASCAL-Part and CelebA) but does not provide a link or explicit statement about releasing the source code for its proposed methodology. |
| Open Datasets | Yes | We considered two different tasks, the joint recognition of objects and object parts in the PASCAL-Part dataset, and the recognition of face attributes in portrait images of the CelebA data. PASCAL-Part: https://www.cs.stanford.edu/~roozbeh/pascal-parts/pascal-parts.html. CelebA: http://mmlab.ie.cuhk.edu.hk/projects/CelebA.html |
| Dataset Splits | Yes | Each dataset was divided into training, validation, test sets, and we report the (macro) F1 scores measured on the test data. (...) We divided them into three splits, composed of 9,092 training images, 505 validation images, 506 test images, respectively (keeping the original class distribution). (...) The dataset is composed of over 200k images of celebrity faces, out of which 45% are used as training data, 5% as validation data and 100k are used for testing. (A stratified-split sketch in this spirit is given below the table.) |
| Hardware Specification | No | The paper describes the datasets, models, and training procedures but does not specify any particular hardware (e.g., GPU models, CPU types) used for running the experiments. |
| Software Dependencies | No | The paper mentions using an 'Adam optimizer' and 'ResNet50 backbone network' but does not specify version numbers for any software libraries, frameworks, or dependencies used in the experiments. |
| Experiment Setup | Yes | According to Section 3.3, we set E = 25, and then 4 learning stages (D = 4) are performed, each of them composed of Nf = 25 epochs for the f-network (stage > 1) and Nψ = 10 epochs for the ψ-network. For a fair comparison, the baseline classifier is trained for 100 epochs. Each neuron is forced to keep only q = 2 input connections in the ψ-network. All the main hyperparameters (weights of the terms composing the learning criteria of Section 3.1, initial learning rate (Adam optimizer, mini-batch-based stochastic gradient), contribution of the weight decay) have been chosen through a grid search procedure, with values ranging in [10^-1, 10^-4], selecting the model that returned the best accuracy on a held-out validation set. (A sketch of this schedule and grid search follows the table.) |
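
As a reading aid for the dataset-splits row above, here is a minimal sketch of a stratified three-way split that reproduces the reported PASCAL-Part sizes (9,092 / 505 / 506). The image identifiers and single per-image labels are hypothetical stand-ins (the actual task is multi-label), and scikit-learn's `train_test_split` is used only for illustration; the paper does not state how its splits were generated beyond keeping the original class distribution.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Hypothetical image ids and single per-image labels standing in for the real
# PASCAL-Part annotations (the actual task is multi-label).
rng = np.random.default_rng(0)
image_ids = np.arange(10_103)                      # 9,092 + 505 + 506 images
labels = rng.integers(0, 20, size=image_ids.size)  # 20 dummy classes

# Split off the 506 test images first, then 505 validation images,
# stratifying on the labels to preserve the class distribution.
trainval_ids, test_ids, trainval_labels, _ = train_test_split(
    image_ids, labels, test_size=506, stratify=labels, random_state=0)
train_ids, val_ids = train_test_split(
    trainval_ids, test_size=505, stratify=trainval_labels, random_state=0)

print(len(train_ids), len(val_ids), len(test_ids))  # 9092 505 506
```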
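
And a minimal sketch of how the staged schedule and grid search described in the experiment-setup row could be organized. E, D, Nf, Nψ and q are the values quoted from the paper; the training routines are empty placeholders, the exact interleaving of f-network and ψ-network epochs within a stage is an assumed reading, and the four-value grid is one interpretation of "values ranging in [10^-1, 10^-4]".

```python
import itertools

# Values quoted from the paper (Section 4).
E = 25       # initial f-network epochs before the first stage (assumed reading of E)
D = 4        # number of learning stages
N_F = 25     # f-network epochs per stage, for stages > 1
N_PSI = 10   # psi-network epochs per stage
Q = 2        # input connections kept by each psi-network neuron

# Hypothetical placeholders for the actual training/evaluation routines.
def train_f_network(epochs, lr, weight_decay, term_weight):
    """Placeholder: train the classifier (f-network) for `epochs` epochs."""

def train_psi_network(epochs, lr, weight_decay, term_weight, max_fan_in=Q):
    """Placeholder: train the explanation (psi-network) for `epochs` epochs."""

def validation_accuracy():
    """Placeholder: accuracy of the current model on the held-out validation set."""
    return 0.0

# Grid search over the hyperparameters named in the paper; the exact grid of
# four decades is an assumption.
grid = [10.0 ** -k for k in range(1, 5)]                 # 1e-1, 1e-2, 1e-3, 1e-4
best_score, best_config = -1.0, None
for lr, weight_decay, term_weight in itertools.product(grid, repeat=3):
    train_f_network(E, lr, weight_decay, term_weight)    # warm-up before stage 1
    for stage in range(1, D + 1):
        train_psi_network(N_PSI, lr, weight_decay, term_weight)
        if stage > 1:                                    # E + 3 * N_F = 100 f-network
            train_f_network(N_F, lr, weight_decay, term_weight)  # epochs, matching the
    score = validation_accuracy()                        # 100-epoch baseline
    if score > best_score:
        best_score, best_config = score, (lr, weight_decay, term_weight)

print("Best configuration on the validation set:", best_config)
```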