NeSyFOLD: A Framework for Interpretable Image Classification

Authors: Parth Padalkar, Huaduo Wang, Gopal Gupta

AAAI 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Evaluation is done on datasets with varied complexity and sizes. We conducted experiments to address the following questions: Table 1: Comparison of ERIC vs NeSyFOLD (NF) vs NeSyFOLD-EBP (NF-E).
Researcher Affiliation | Academia | Parth Padalkar, Huaduo Wang, Gopal Gupta; The University of Texas at Dallas; {parth.padalkar, huaduo.wang, gupta}@utdallas.edu
Pseudocode | Yes | Algorithm 1: Semantic labeling
Open Source Code | No | The paper does not provide concrete access to source code for the methodology described.
Open Datasets | Yes | We selected subsets of 2, 3, 5 and 10 classes from the Places (Zhou et al. 2017a) dataset which has images of various scenes and the German Traffic Sign Recognition Benchmark (GTSRB) (Stallkamp et al. 2012) dataset which consists of images of various traffic signposts. We use the ADE20k dataset (Zhou et al. 2017b) in our experiments.
Dataset Splits | Yes | Each class has 5k images, of which we made a 4k/1k train-test split for each class, and we used the given validation set as it is. The GTSRB (GT43) dataset has 43 classes of signposts. We used the given test set of 12.6k images as it is and did an 80:20 train-validation split, which gave roughly 21k images for the train set and 5k for the validation set. (A minimal split sketch follows the table.)
Hardware Specification | No | The paper does not provide specific hardware details used for running its experiments.
Software Dependencies | No | The paper mentions software components like "Adam optimizer", "FOLD-SE-M rule interpreter", and "s(CASP) ASP solver" but does not provide specific version numbers for these or other software dependencies.
Experiment Setup | Yes | We employed a VGG16 CNN pretrained on ImageNet (Deng et al. 2009), training over 100 epochs with batch size 32. The Adam (Kingma and Ba 2014) optimizer was used, accompanied by class weights to address data imbalance. L2 regularization of 0.005 spanning all layers and a learning rate of 5 × 10^-7 were adopted. A decay factor of 0.5 with a 10-epoch patience was implemented. Images were resized to 224 × 224, and the hyperparameters α and γ (eq. (3)), used to calculate the threshold for binarization of kernels, were set at 0.6 and 0.7 respectively. We re-trained each of the trained CNN models for each dataset for 50 epochs using EBP. We used K = 20 for P10 and GT43 because of their larger size and K = 5 for all the other datasets. We used λ = 0.005 for all datasets. (A configuration sketch follows the table.)
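
The dataset splits quoted above can be reproduced mechanically. The sketch below is a minimal illustration, assuming a scikit-learn based pipeline; the file paths and image counts are hypothetical placeholders (the paper's actual data-loading code is not released), and only the split sizes come from the quoted text.

```python
# Minimal sketch of the reported splits; paths, counts, and the use of
# scikit-learn are assumptions, not the authors' pipeline.
from sklearn.model_selection import train_test_split

# Places subsets: 5k images per class, split 4k train / 1k test per class.
class_images = [f"places/forest_path/{i}.jpg" for i in range(5000)]  # hypothetical paths
train_imgs, test_imgs = train_test_split(
    class_images, train_size=4000, test_size=1000, random_state=0)

# GTSRB (GT43): keep the provided 12.6k-image test set as-is, and split the
# remaining training images 80:20, giving roughly 21k train / 5k validation.
gtsrb_train_pool = [f"gtsrb/train/{i}.png" for i in range(26000)]  # hypothetical count (~21k + ~5k)
gt_train, gt_val = train_test_split(gtsrb_train_pool, test_size=0.20, random_state=0)
print(len(gt_train), len(gt_val))  # roughly 21k and 5k
```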
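
For concreteness, here is a minimal sketch of the experiment setup quoted above. The paper does not name its deep-learning framework, so the use of PyTorch/torchvision, the placeholder data loader, and the uniform class-weight tensor are assumptions; only the hyperparameter values (VGG16 pretrained on ImageNet, 100 epochs, batch size 32, Adam with learning rate 5 × 10^-7, L2 of 0.005, decay factor 0.5 with 10-epoch patience, 224 × 224 inputs) come from the quoted text.

```python
# Minimal sketch of the reported training configuration (framework choice is an
# assumption; hyperparameter values are taken from the quoted setup).
import torch
import torch.nn as nn
from torchvision import models, transforms

NUM_CLASSES = 10  # e.g. the 10-class Places subset (P10); adjust per dataset

# Images resized to 224 x 224 before training.
preprocess = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
])

# VGG16 pretrained on ImageNet, with the classifier head replaced for NUM_CLASSES.
model = models.vgg16(weights="IMAGENET1K_V1")
model.classifier[6] = nn.Linear(4096, NUM_CLASSES)

# Adam with learning rate 5e-7; weight_decay stands in for the reported
# L2 regularization of 0.005 spanning all layers.
optimizer = torch.optim.Adam(model.parameters(), lr=5e-7, weight_decay=0.005)

# Learning-rate decay factor 0.5 with a 10-epoch patience.
scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(
    optimizer, factor=0.5, patience=10)

# Class weights address the reported data imbalance (uniform placeholder here).
class_weights = torch.ones(NUM_CLASSES)
criterion = nn.CrossEntropyLoss(weight=class_weights)

# Training skeleton: 100 epochs, batch size 32 (set in the DataLoader);
# `train_loader` and `val_loss` are placeholders for the user's own pipeline.
# for epoch in range(100):
#     for images, labels in train_loader:
#         optimizer.zero_grad()
#         loss = criterion(model(images), labels)
#         loss.backward()
#         optimizer.step()
#     scheduler.step(val_loss)
```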