Trusted Fine-Grained Image Classification through Hierarchical Evidence Fusion
Authors: Zhikang Xu, Xiaodong Yue, Ying Lv, Wei Liu, Zihao Li
AAAI 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this section, we perform experiments on three FGIC benchmark datasets: CUB-200-2011 (CUB), FGVC-Aircraft (AIR) and Stanford Cars (CAR). For CUB, we follow the setting in (Chen et al. 2018) to organize the bird label hierarchy into 200 species, 122 genera, 37 families, and 13 orders. For AIR and CAR, we follow the setting in (Chang et al. 2021) to organize the airplane label hierarchy into 100 models, 70 families and 30 makers, and the car label hierarchy into 196 car models and 9 car types. For each dataset, we use the bottom two layers of the class label hierarchy to implement the evidence fusion for fine-grained classification. The experiments consist of three parts. The first is an ablation experiment to verify the improvements brought by the hierarchical evidence fusion for fine-grained classification. The second is conducted to validate the superiority of the proposed method by comparing it with other state-of-the-art FGIC methods. In the final experiment, we provide representative cases to interpret how the proposed method enhances the trustworthiness of fine-grained classification. |
| Researcher Affiliation | Collaboration | Zhikang Xu1, Xiaodong Yue1,2,3, Ying Lv1, Wei Liu4, Zihao Li1. 1 School of Computer Engineering and Science, Shanghai University, Shanghai, China; 2 Artificial Intelligence Institute of Shanghai University, Shanghai, China; 3 VLN Lab, NAVI Med Tech Co., Ltd., Shanghai, China; 4 College of Electronics and Information Engineering, Tongji University, Shanghai, China |
| Pseudocode | Yes | Algorithm 1: Trusted fine-grained image classification |
| Open Source Code | No | The paper does not provide any concrete access to source code (e.g., specific repository link, explicit code release statement, or code in supplementary materials) for the methodology described. |
| Open Datasets | Yes | In this section, we perform experiments on three FGIC benchmark datasets, CUB-200-2011 (CUB), FGVC-Aircraft (AIR) and Stanford Cars (CAR). |
| Dataset Splits | No | The paper mentions a "training stage" but does not provide specific details about training, validation, or test dataset splits (e.g., exact percentages, sample counts, or citations to predefined splits). |
| Hardware Specification | No | The paper does not provide any specific hardware details (e.g., exact GPU/CPU models, processor types, or memory amounts) used for running its experiments. |
| Software Dependencies | No | The paper mentions using "ResNet50 (He et al. 2016) as the backbone network" and "Stochastic Gradient Descent (SGD) with momentum=0.9 as optimizer", but it does not specify any software versions for these or other libraries/frameworks (e.g., PyTorch, TensorFlow, Python version). |
| Experiment Setup | Yes | Implementation Details. We use ResNet50 (He et al. 2016) as the backbone network and add an activation layer on top of the last fully connected layer as evidence extractor to extract non-negative evidence. To reduce complexity, the three cascade experts of each evidence extractor share parameters. The hyperparameter ε is set to 0.5 for CUB and AIR. Since the CAR dataset has only 9 car types, we do not use ε to reduce experts when training the evidence extractor for car types. For fair comparison, we use only regular data augmentation, enabling the fine-tuned ResNet50 to achieve 84.6% accuracy on CUB. We use Stochastic Gradient Descent (SGD) with momentum=0.9 as optimizer. The learning rate is 0.001 and multiplied by 0.1 after 15, 30 and 45 epochs. The batch size is set to 6. |
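The evidence fusion that Algorithm 1 refers to can be sketched with the standard subjective-logic combination rule used in trusted evidential classification. This is a generic illustration, not the paper's exact hierarchical rule (which additionally maps coarse-level evidence onto fine-grained classes); the function names and the three-class toy evidence are hypothetical.

```python
def opinion_from_evidence(e):
    # Subjective-logic opinion from non-negative evidence over K classes:
    # belief b_k = e_k / S and uncertainty u = K / S, with S = sum(e) + K.
    K = len(e)
    S = sum(e) + K
    return [ek / S for ek in e], K / S

def fuse(b1, u1, b2, u2):
    # Reduced Dempster-style combination of two opinions, as commonly used
    # in evidential (trusted) classification; the conflict between the two
    # opinions rescales the agreeing beliefs.
    conflict = sum(b1[i] * b2[j]
                   for i in range(len(b1))
                   for j in range(len(b2)) if i != j)
    scale = 1.0 - conflict
    b = [(b1[k] * b2[k] + b1[k] * u2 + b2[k] * u1) / scale
         for k in range(len(b1))]
    u = u1 * u2 / scale
    return b, u

# Two experts whose evidence agrees on class 0: fusion sharpens the
# belief in class 0 and lowers the overall uncertainty.
b1, u1 = opinion_from_evidence([8.0, 1.0, 1.0])
b2, u2 = opinion_from_evidence([6.0, 2.0, 1.0])
b, u = fuse(b1, u1, b2, u2)
```

The combination preserves normalization (the fused beliefs plus the fused uncertainty still sum to one), which is why the result can again be read as an opinion over the same classes.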
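The step learning-rate schedule quoted in the setup row (base rate 0.001, multiplied by 0.1 after epochs 15, 30 and 45) can be written out directly. `learning_rate` is a hypothetical helper, and the convention that the decay applies from the listed epoch onward is an assumption, since the paper does not pin down the exact boundary.

```python
def learning_rate(epoch, base_lr=0.001, milestones=(15, 30, 45), gamma=0.1):
    # Apply one multiplicative decay for every milestone the current
    # epoch has reached (assumed: the decayed rate is used from the
    # milestone epoch onward).
    lr = base_lr
    for m in milestones:
        if epoch >= m:
            lr *= gamma
    return lr
```

For example, epochs 0-14 train at 0.001, epochs in the 15-29 range at 1e-4, and epochs past all three milestones at 1e-6.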