Zero-Shot Object Detection with Textual Descriptions
Authors: Zhihui Li, Lina Yao, Xiaoqin Zhang, Xianzhi Wang, Salil Kanhere, Huaxiang Zhang (pp. 8690–8697)
AAAI 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We conduct extensive experiments on three challenging benchmark datasets. The extensive experimental results confirm the superiority of the proposed model. |
| Researcher Affiliation | Academia | ¹School of Computer Science and Engineering, University of New South Wales; ²College of Mathematics and Information Science, Wenzhou University; ³School of Software, University of Technology Sydney; ⁴School of Information Science and Engineering, Shandong Normal University |
| Pseudocode | No | The paper describes the model architecture and training procedure in text and with a diagram, but does not include any pseudocode or algorithm blocks. |
| Open Source Code | No | The code and models will be released. |
| Open Datasets | Yes | The ILSVRC-2017 detection dataset Russakovsky et al. (2015) consists of 200 basic-level object categories. ... MSCOCO Lin et al. (2014) was collected for object detection and semantic segmentation tasks. ... Visual Genome (VG) Krishna et al. (2017) was designed primarily for visual relationship understanding. |
| Dataset Splits | No | For the ILSVRC-2017 detection dataset, we use the same train/test split as in Russakovsky et al. (2015). ... For the MSCOCO and Visual Genome datasets, we follow the same procedure as in Bansal et al. (2018). Specifically, we use 48 training classes and 17 test classes for MSCOCO, and 478 training classes and 130 test classes for Visual Genome. ... we randomly select images from the validation set for testing. |
| Hardware Specification | No | The paper does not provide any specific hardware details used for running the experiments. |
| Software Dependencies | No | We implement the proposed model based on the open-source package PyTorch. |
| Experiment Setup | Yes | Following Rahman, Khan, and Porikli (2018), we rescale the shorter side of images to 600 pixels. To reduce redundancy, we employ non-maximum suppression (NMS) on proposal class probabilities with an IoU threshold of 0.7. During training, we use the Adam optimizer with learning rate 10⁻⁵. |
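The NMS step quoted above (suppressing overlapping proposals at an IoU threshold of 0.7) can be sketched as follows. This is a generic greedy NMS implementation for illustration, not the paper's actual code; the function name `nms` and the `[x1, y1, x2, y2]` box convention are assumptions.

```python
import numpy as np

def nms(boxes, scores, iou_threshold=0.7):
    """Greedy non-maximum suppression over [x1, y1, x2, y2] boxes.

    Illustrative sketch only; the paper's released code may differ.
    """
    x1, y1, x2, y2 = boxes[:, 0], boxes[:, 1], boxes[:, 2], boxes[:, 3]
    areas = (x2 - x1) * (y2 - y1)
    order = scores.argsort()[::-1]  # highest-scoring proposal first
    keep = []
    while order.size > 0:
        i = order[0]
        keep.append(int(i))
        # intersection of the top box with all remaining boxes
        xx1 = np.maximum(x1[i], x1[order[1:]])
        yy1 = np.maximum(y1[i], y1[order[1:]])
        xx2 = np.minimum(x2[i], x2[order[1:]])
        yy2 = np.minimum(y2[i], y2[order[1:]])
        inter = np.maximum(0.0, xx2 - xx1) * np.maximum(0.0, yy2 - yy1)
        iou = inter / (areas[i] + areas[order[1:]] - inter)
        # discard boxes that overlap the kept box above the threshold
        order = order[1:][iou <= iou_threshold]
    return keep
```

With the paper's threshold of 0.7, two proposals are merged only when they overlap heavily, which matches the stated goal of reducing redundancy among region proposals.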