Zero-Shot Object Detection with Textual Descriptions
Authors: Zhihui Li, Lina Yao, Xiaoqin Zhang, Xianzhi Wang, Salil Kanhere, Huaxiang Zhang (pp. 8690–8697)
AAAI 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We conduct extensive experiments on three challenging benchmark datasets. The extensive experimental results confirm the superiority of the proposed model. |
| Researcher Affiliation | Academia | ¹School of Computer Science and Engineering, University of New South Wales; ²College of Mathematics and Information Science, Wenzhou University; ³School of Software, University of Technology Sydney; ⁴School of Information Science and Engineering, Shandong Normal University |
| Pseudocode | No | The paper describes the model architecture and training procedure in text and with a diagram, but does not include any pseudocode or algorithm blocks. |
| Open Source Code | No | The code and models will be released. |
| Open Datasets | Yes | The ILSVRC-2017 detection dataset Russakovsky et al. (2015) consists of 200 basic-level object categories. ... MSCOCO Lin et al. (2014) was collected for object detection and semantic segmentation tasks. ... Visual Genome (VG) Krishna et al. (2017) was designed primarily for visual relationship understanding. |
| Dataset Splits | No | For the ILSVRC-2017 detection dataset, we use the same train/test split as in Russakovsky et al. (2015). ... For the MSCOCO and Visual Genome datasets, we follow the same procedure as in Bansal et al. (2018). Specifically, we use 48 training classes and 17 test classes for MSCOCO, and 478 training classes and 130 test classes for Visual Genome. ... we randomly select images from the validation set for testing. |
| Hardware Specification | No | The paper does not provide any specific hardware details used for running the experiments. |
| Software Dependencies | No | We implement the proposed model based on the open-source package PyTorch. |
| Experiment Setup | Yes | Following Rahman, Khan, and Porikli (2018), we rescale the shorter side of images to 600 pixels. To reduce redundancy, we employ non-maximum suppression (NMS) on proposal class probabilities with an IoU threshold of 0.7. During training, we use the Adam optimizer with learning rate 10⁻⁵. |
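The NMS step quoted above (suppressing overlapping proposals at an IoU threshold of 0.7) can be sketched as follows. This is a generic greedy NMS implementation for illustration, not the paper's actual code; the function name `nms` and the `[x1, y1, x2, y2]` box convention are assumptions.

```python
import numpy as np

def nms(boxes, scores, iou_threshold=0.7):
    """Greedy non-maximum suppression over [x1, y1, x2, y2] boxes.

    Illustrative sketch only; the paper's released code may differ.
    """
    x1, y1, x2, y2 = boxes[:, 0], boxes[:, 1], boxes[:, 2], boxes[:, 3]
    areas = (x2 - x1) * (y2 - y1)
    order = scores.argsort()[::-1]  # highest-scoring proposal first
    keep = []
    while order.size > 0:
        i = order[0]
        keep.append(int(i))
        # intersection of the top box with all remaining boxes
        xx1 = np.maximum(x1[i], x1[order[1:]])
        yy1 = np.maximum(y1[i], y1[order[1:]])
        xx2 = np.minimum(x2[i], x2[order[1:]])
        yy2 = np.minimum(y2[i], y2[order[1:]])
        inter = np.maximum(0.0, xx2 - xx1) * np.maximum(0.0, yy2 - yy1)
        iou = inter / (areas[i] + areas[order[1:]] - inter)
        # discard boxes that overlap the kept box above the threshold
        order = order[1:][iou <= iou_threshold]
    return keep
```

With the paper's threshold of 0.7, two proposals are merged only when they overlap heavily, which matches the stated goal of reducing redundancy among region proposals.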