Improved Visual-Semantic Alignment for Zero-Shot Object Detection
Authors: Shafin Rahman, Salman Khan, Nick Barnes (pp. 11932-11939)
AAAI 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our extensive results on MS-COCO and Pascal VOC datasets show significant improvements over state of the art. |
| Researcher Affiliation | Collaboration | Shafin Rahman,1,2 Salman Khan,3,1 Nick Barnes1,2 1College of Engineering and Computer Science, Australian National University 2Data61, Commonwealth Scientific and Industrial Research Organisation 3Inception Institute of Artificial Intelligence, Abu Dhabi, UAE |
| Pseudocode | No | The paper describes mathematical formulations and network architectures but does not include any pseudocode or algorithm blocks. |
| Open Source Code | Yes | Code and evaluation protocols available at: https://github.com/salman-h-khan/PL-ZSD_Release |
| Open Datasets | Yes | We evaluate our method with MS-COCO (2014) (Lin et al. 2014) and Pascal VOC (2007/12) (Everingham et al. 2010). |
| Dataset Splits | Yes | With 80 object classes, MS-COCO includes 82,783 training and 40,504 validation images. For the ZSD task, only unseen class performance is of interest. As the test data labels are not known, the ZSD evaluation is done on a subset of validation data. |
| Hardware Specification | No | The paper does not provide specific hardware details such as GPU or CPU models used for running the experiments. |
| Software Dependencies | No | The paper mentions using RetinaNet (Lin et al. 2018) as the base architecture but does not specify any software dependencies with version numbers, such as programming languages, libraries, or frameworks. |
| Experiment Setup | Yes | We train the classification subnet branch with our proposed loss defined in Eq. 6. Similar to (Lin et al. 2018), to address the imbalance between hard and easy examples, we normalize the total classification loss (calculated from ~100k anchors) by the total number of object/positive anchor boxes rather than the total number of anchors. We use standard smooth L1 loss for the box-regression subnet branch. The total loss is the sum of the loss of both branches. Hyper-parameters are set on the validation set: β=5, IoU=0.5. Our model works best with α=0.25 and γ=2.0, which are also the recommended values in FL. Empirically, we found t_s=0.3 and t_u=0.1 generally work well. |
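The setup quoted above can be sketched as a short loss computation. This is a hedged illustration only: a standard focal loss stands in for the paper's Eq. 6 (whose exact form is not reproduced in this report), all array names and shapes are assumptions, and the per-positive-anchor normalization follows the RetinaNet convention the row describes.

```python
import numpy as np

# Hyper-parameters reported in the paper (focal-loss values from Lin et al. 2018).
ALPHA, GAMMA = 0.25, 2.0

def focal_loss(probs, targets, alpha=ALPHA, gamma=GAMMA):
    """Binary focal loss summed over all anchors.

    Stand-in for the paper's proposed classification loss (Eq. 6),
    which this report does not reproduce.
    """
    p_t = np.where(targets == 1, probs, 1.0 - probs)
    alpha_t = np.where(targets == 1, alpha, 1.0 - alpha)
    return np.sum(-alpha_t * (1.0 - p_t) ** gamma
                  * np.log(np.clip(p_t, 1e-12, 1.0)))

def smooth_l1(pred, target, beta=1.0):
    """Standard smooth L1 (Huber) loss for the box-regression branch."""
    diff = np.abs(pred - target)
    return np.sum(np.where(diff < beta, 0.5 * diff ** 2 / beta, diff - 0.5 * beta))

def total_loss(cls_probs, cls_targets, box_pred, box_target):
    """Sum of both branches; classification loss is normalized by the
    number of positive anchors rather than the total anchor count."""
    num_pos = max(1, int(np.sum(cls_targets == 1)))  # avoid divide-by-zero
    cls_loss = focal_loss(cls_probs, cls_targets) / num_pos
    reg_loss = smooth_l1(box_pred, box_target)
    return cls_loss + reg_loss
```

In a real detector the arrays would cover roughly 100k anchors per image; the normalization by positive anchors keeps the many easy negative anchors from dominating the classification term.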