Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

I2DFormer: Learning Image to Document Attention for Zero-Shot Image Classification

Authors: Muhammad Ferjad Naeem, Yongqin Xian, Luc Van Gool, Federico Tombari

NeurIPS 2022

| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | "Quantitatively, we demonstrate that our I2DFormer significantly outperforms previous unsupervised semantic embeddings under both zero-shot and generalized zero-shot learning settings on three public datasets." "We conduct extensive experiments on Animals with Attributes2 (AWA2) [57], Caltech-UCSD Birds (CUB) [51] and Oxford Flowers (FLO) [32], which are widely used datasets in ZSL." |
| Researcher Affiliation | Collaboration | Muhammad Ferjad Naeem (1), Yongqin Xian (1), Luc Van Gool (1), Federico Tombari (2,3); 1: ETH Zürich, 2: TUM, 3: Google |
| Pseudocode | No | The paper presents architectural diagrams and describes methods in text, but does not include any structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | Code available at https://github.com/ferjad/I2DFormer. |
| Open Datasets | Yes | "We conduct extensive experiments on Animals with Attributes2 (AWA2) [57], Caltech-UCSD Birds (CUB) [51] and Oxford Flowers (FLO) [32], which are widely used datasets in ZSL." |
| Dataset Splits | Yes | "We follow the evaluation protocol and data splits proposed by Xian et al. [57]." |
| Hardware Specification | Yes | "We implement our model in PyTorch and train on an Nvidia A100 GPU." |
| Software Dependencies | No | The paper mentions PyTorch but does not specify its version number or any other software dependencies with their versions. |
| Experiment Setup | Yes | "The model is trained with Adam optimizer with a learning rate of 1e-3 and takes 24 hours to converge. L_CLS and L_local relative weights are chosen by ablation. More details are available in the supplementary." |
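For readers attempting reproduction, the paper's stated optimizer configuration (Adam, learning rate 1e-3) can be sketched as follows. This is a minimal illustration only: the `Linear` module stands in for the actual I2DFormer architecture, and the relative weights of L_CLS and L_local are not given in the main text (the paper says they were chosen by ablation, with details in the supplementary).

```python
import torch

# Hypothetical stand-in module; the real I2DFormer architecture
# (image-to-document attention) is described in the paper itself.
model = torch.nn.Linear(512, 300)

# The paper states training uses the Adam optimizer with lr = 1e-3.
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

# Loss weights for L_CLS and L_local are chosen by ablation in the
# paper; the values below are placeholders, not reported numbers.
w_cls, w_local = 1.0, 1.0
```

Any reproduction should take the actual loss weights and remaining hyperparameters from the paper's supplementary material rather than from this sketch.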