reproducibilityindex.ai

Zero-Shot Aerial Object Detection with Visual Description Regularization

Authors: Zhengqing Zang, Chenyu Lin, Chenwei Tang, Tao Wang, Jiancheng Lv

AAAI 2024

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	We conduct extensive experiments with three challenging aerial object detection datasets, including DIOR, xView, and DOTA. The results demonstrate that DescReg significantly outperforms the state-of-the-art ZSD methods with complex projection designs and generative frameworks, e.g., DescReg outperforms best reported ZSD method on DIOR by 4.5 mAP on unseen classes and 8.1 in HM. We further show the generalizability of DescReg by integrating it into generative ZSD methods as well as varying the detection architecture.
Researcher Affiliation	Academia	(1) College of Computer Science, Sichuan University, Chengdu, 610065, P. R. China; (2) Engineering Research Center of Machine Learning and Industry Intelligence, Ministry of Education, Chengdu, 610065, P. R. China. {2022223045158, 2022223040017}@stu.scu.edu.cn, tangchenwei@scu.edu.cn, twangnh@gmail.com, lvjiancheng@scu.edu.cn
Pseudocode	No	The paper does not contain any pseudocode or algorithm blocks. It provides mathematical formulations and a system diagram, but no step-by-step code-like description of the algorithms.
Open Source Code	Yes	Codes will be released at https://github.com/zq-zang/DescReg.
Open Datasets	Yes	We evaluate the proposed method on three challenging remote sensing image object detection datasets: DIOR (Li et al. 2019a), xView (Lam et al. 2018), and DOTA (Xia et al. 2017).
Dataset Splits	Yes	For DIOR, we follow the setting in prior work (Huang et al. 2022). For xView and DOTA, we conduct semantic clustering and sample classes within clusters to ensure unseen class diversity and semantic relatedness (Rahman, Khan, and Porikli 2018; Huang et al. 2022). The resulting xView contains 48 seen classes and 12 unseen classes, and the resulting DOTA contains 11 seen classes and 4 unseen classes.
Hardware Specification	No	The paper does not explicitly state the specific hardware used for running experiments, such as GPU models, CPU types, or memory specifications. It only mentions general detection architectures like Faster R-CNN and YOLOv8.
Software Dependencies	No	The paper mentions using Faster R-CNN, ResNet101, and YOLOv8 models but does not provide specific version numbers for software dependencies like Python, PyTorch, or other libraries used in the implementation.
Experiment Setup	Yes	We also observe the temperature value of 0.03 achieves the best performance, which is slightly higher than 0.01 and 0.05.
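
The table above refers to two quantities that a short sketch may make concrete: HM, which in zero-shot detection is conventionally the harmonic mean of seen-class and unseen-class mAP, and a temperature value (0.03). The Python below is an illustrative sketch only, not the authors' code. The function names are hypothetical, and the assumption that the temperature divides similarity scores before a softmax is ours; the excerpt does not say where the temperature enters the method.

```python
import math


def harmonic_mean(seen_map: float, unseen_map: float) -> float:
    """Harmonic mean of seen and unseen mAP (the usual HM definition)."""
    total = seen_map + unseen_map
    if total == 0:
        return 0.0  # avoid division by zero when both scores are zero
    return 2.0 * seen_map * unseen_map / total


def temperature_softmax(similarities, temperature=0.03):
    """Softmax over similarity scores divided by a temperature.

    Assumption: the temperature scales similarities before normalization.
    Smaller temperatures produce a sharper distribution.
    """
    scaled = [s / temperature for s in similarities]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(v - peak) for v in scaled]
    norm = sum(exps)
    return [e / norm for e in exps]


if __name__ == "__main__":
    # Hypothetical inputs, chosen only to illustrate the functions.
    print(harmonic_mean(60.0, 20.0))
    print(temperature_softmax([0.6, 0.5, 0.1], temperature=0.03))
```

Because the harmonic mean is dominated by the smaller of its two inputs, a detector cannot score well on HM by doing well on seen classes alone.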