I2DFormer: Learning Image to Document Attention for Zero-Shot Image Classification

Authors: Muhammad Ferjad Naeem, Yongqin Xian, Luc Van Gool, Federico Tombari

NeurIPS 2022 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Quantitatively, we demonstrate that our I2DFormer significantly outperforms previous unsupervised semantic embeddings under both zero-shot and generalized zero-shot learning settings on three public datasets. We conduct extensive experiments on Animals with Attributes2 (AWA2) [57], Caltech-UCSD Birds (CUB) [51] and Oxford Flowers (FLO) [32], which are widely used datasets in ZSL.
Researcher Affiliation | Collaboration | Muhammad Ferjad Naeem (1), Yongqin Xian (1), Luc Van Gool (1), Federico Tombari (2,3); 1 ETH Zürich, 2 TUM, 3 Google
Pseudocode | No | The paper presents architectural diagrams and describes methods in text, but does not include any structured pseudocode or algorithm blocks.
Open Source Code | Yes | Code available at https://github.com/ferjad/I2DFormer.
Open Datasets | Yes | We conduct extensive experiments on Animals with Attributes2 (AWA2) [57], Caltech-UCSD Birds (CUB) [51] and Oxford Flowers (FLO) [32], which are widely used datasets in ZSL.
Dataset Splits | Yes | We follow the evaluation protocol and data splits proposed by Xian et al. [57].
Hardware Specification | Yes | We implement our model in PyTorch and train on an Nvidia A100 GPU.
Software Dependencies | No | The paper mentions 'PyTorch' but does not specify its version number or any other software dependencies with their versions.
Experiment Setup | Yes | The model is trained with the Adam optimizer with a learning rate of 1e-3 and takes 24 hours to converge. The relative weights of L_CLS and L_local are chosen by ablation. More details are available in the supplementary.
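
The optimization setup quoted above (Adam at lr 1e-3 with a weighted sum of L_CLS and L_local) can be sketched as follows. This is a minimal illustration, not the authors' released code: the encoders, feature dimensions, the placeholder local loss, and the weight lambda_local are all assumptions, since the paper selects the relative loss weights by ablation.

```python
import torch
import torch.nn.functional as F

# Hypothetical stand-ins for the image and document encoders projecting into
# a shared embedding space (dimensions are assumptions, not from the paper).
img_encoder = torch.nn.Linear(2048, 300)
doc_encoder = torch.nn.Linear(768, 300)

params = list(img_encoder.parameters()) + list(doc_encoder.parameters())
optimizer = torch.optim.Adam(params, lr=1e-3)  # learning rate quoted from the paper

def training_step(img_feats, doc_feats, labels, lambda_local=0.1):
    """One step minimizing L_CLS + lambda_local * L_local.

    img_feats: (B, 2048) image features, doc_feats: (C, 768) class document
    features, labels: (B,) class indices. lambda_local is a placeholder for
    the ablated relative weight.
    """
    optimizer.zero_grad()
    img_emb = F.normalize(img_encoder(img_feats), dim=-1)
    doc_emb = F.normalize(doc_encoder(doc_feats), dim=-1)
    logits = img_emb @ doc_emb.t()                   # image-to-class scores
    l_cls = F.cross_entropy(logits, labels)          # global (CLS-level) loss
    l_local = torch.zeros((), device=logits.device)  # placeholder for the local
                                                     # image-to-document attention loss
    loss = l_cls + lambda_local * l_local
    loss.backward()
    optimizer.step()
    return loss.item()
```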