ViTree: Single-Path Neural Tree for Step-Wise Interpretable Fine-Grained Visual Categorization

Authors: Danning Lao, Qi Liu, Jiazi Bu, Junchi Yan, Wei Shen

AAAI 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Remarkably, extensive experimentation validates that this streamlined approach surpasses various strong competitors and achieves state-of-the-art performance while maintaining exceptional interpretability which is proved by multi-perspective methods.
Researcher Affiliation | Academia | Danning Lao, Qi Liu, Jiazi Bu, Junchi Yan, Wei Shen*; MoE Key Lab of Artificial Intelligence, AI Institute, Shanghai Jiao Tong University; {laodanning, purewhite, bujiazi001, yanjunchi, wei.shen}@sjtu.edu.cn
Pseudocode | No | The paper does not contain any clearly labeled pseudocode or algorithm blocks.
Open Source Code | Yes | Code can be found at https://github.com/SJTU-DeepVisionLab/ViTree.
Open Datasets | Yes | We evaluated ViTree on two benchmark datasets: CUB-200-2011 (Wah et al. 2011) and Stanford Cars (Krause et al. 2013).
Dataset Splits | No | The paper uses the benchmark datasets CUB-200-2011 and Stanford Cars but does not state the training, validation, and test splits (e.g., percentages or sample counts) in the text. It refers instead to methodologies adopted from other papers.
Hardware Specification | No | The paper does not provide specific hardware details such as GPU models, CPU types, or memory amounts used for running the experiments.
Software Dependencies | No | The paper mentions using components like Swin T, multi-head attention, and the Faiss library, but does not provide version numbers for any of the software dependencies required to replicate the experiments.
Experiment Setup | No | The paper describes some architectural details, such as a binary tree with depth d=6 and the components of Node Layers, and states that hyperparameters such as learning rate, weight decay, and dropout are determined via heuristic experience and grid search, but it does not provide the concrete values needed for reproduction.