UMB: Understanding Model Behavior for Open-World Object Detection
Authors: Xing Xi, Yangyang Huang, Zhijie Zhong, Ronghua Luo
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | The evaluation results on the Real-World Object Detection (RWD) benchmark, which consists of five real-world application datasets, show that we surpassed the previous state-of-the-art (SOTA) with an absolute gain of 5.3 mAP for unknown classes, reaching 20.5 mAP. |
| Researcher Affiliation | Academia | Xing Xi, Yangyang Huang, Zhijie Zhong, Ronghua Luo, School of Computer Science and Engineering, South China University of Technology, Guangzhou, China 510006. Corresponding author: rhluo@scut.edu.cn. |
| Pseudocode | Yes | Algorithm 1: Textual Attribute Generation and Known Class Prediction |
| Open Source Code | Yes | Our code is available at https://github.com/xxyzll/UMB. |
| Open Datasets | Yes | The OWOD benchmark is established on the VOC[31] and COCO[30] datasets. ... The RWD benchmark consists of five typical application scenarios for object detection, including underwater scenes, representing visual blurring caused by the environment (Aquatic[32]); aerial scenes, where the targets are small and difficult to distinguish (Aerial[33]); scenarios using synthetic data when data is lacking (Game[34]); medical X-ray scenes, where it is difficult to distinguish between categories and professional knowledge is required (Medical[35]); and human surgery scenes, where the field of view is blurred by blood (Surgery[36]). |
| Dataset Splits | No | We divide RWD into two subtasks according to a 50% category ratio. When training in Task 1, all categories in the test set that belong to Task 2 are treated as unknown classes, and when training in Task 2, the categories of Task 1 are considered as previously seen classes. (See the split sketch below the table.) |
| Hardware Specification | Yes | All experiments were conducted using a single NVIDIA GeForce RTX 4090 GPU. |
| Software Dependencies | No | The large language model used for attribute generation is GPT-3.5. All optimizers used AdamW. |
| Experiment Setup | Yes | During the attribute selection phase, BCE was the loss function, and the learning rate remained constant without decreasing with iterations. ... This phase used MSE as the loss function, with a maximum of 1000 iterations. ... the learning rate and maximum number of iterations for training were set to three values ([1e-5, 5e-5, 1e-4], [1, 10, 100]). ... In the distribution optimization phase, we set the window value to 10. ... During training, Adam was used as the optimizer, the learning rate was set to 0.01, the maximum number of iterations was 10000, and the maximum number of probability models was set to 5. (See the configuration sketch below the table.) |
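
The Dataset Splits row describes partitioning the RWD categories 50/50 into two tasks, with Task 2 categories acting as unknowns while training Task 1. A minimal sketch of that split, using invented placeholder class names rather than the actual RWD categories:

```python
# Hypothetical illustration of the 50% category split between RWD tasks.
# The class names below are placeholders, not the actual RWD categories.
categories = sorted(["fish", "coral", "diver", "boat", "buoy", "net"])

half = len(categories) // 2
task1_known = set(categories[:half])    # trained on during Task 1
task2_classes = set(categories[half:])  # treated as unknown while evaluating Task 1

# When training Task 2, the Task 1 categories count as previously seen classes.
previously_seen = task1_known
print(task1_known, task2_classes)
```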
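The Experiment Setup row reports losses, learning rates, and iteration budgets for the paper's training phases. Below is a minimal sketch of how that configuration might be wired up in PyTorch, under stated assumptions: the module shapes, dummy tensors, and phase structure are placeholders for illustration, not the authors' implementation (see the UMB repository for that).

```python
# Sketch of the reported training configuration; all layer sizes and
# dummy data are assumptions, only the hyperparameters come from the table.
import torch
import torch.nn as nn

torch.manual_seed(0)

# Attribute selection phase: BCE loss, constant (non-decayed) learning rate,
# with grids of [1e-5, 5e-5, 1e-4] learning rates and [1, 10, 100] iterations.
selector = nn.Linear(512, 64)                      # placeholder sizes
bce = nn.BCEWithLogitsLoss()
selector_opt = torch.optim.AdamW(selector.parameters(), lr=1e-4)
for _ in range(100):                               # one budget from the grid
    feats = torch.randn(32, 512)                   # stand-in features
    labels = torch.randint(0, 2, (32, 64)).float()
    selector_opt.zero_grad()
    bce(selector(feats), labels).backward()
    selector_opt.step()

# Known-class prediction phase: MSE loss, at most 1000 iterations.
predictor = nn.Linear(64, 1)
mse = nn.MSELoss()
predictor_opt = torch.optim.AdamW(predictor.parameters(), lr=1e-4)
for _ in range(1000):
    attrs = torch.randn(32, 64)
    scores = torch.rand(32, 1)
    predictor_opt.zero_grad()
    mse(predictor(attrs), scores).backward()
    predictor_opt.step()

# Distribution optimization phase, as reported: Adam optimizer, lr 0.01,
# up to 10000 iterations, window value 10, at most 5 probability models.
WINDOW, MAX_PROB_MODELS, MAX_ITERS = 10, 5, 10_000
dist_model = nn.Linear(64, MAX_PROB_MODELS)        # placeholder head
dist_opt = torch.optim.Adam(dist_model.parameters(), lr=0.01)
```

Note the table itself is slightly inconsistent on the optimizer (AdamW under Software Dependencies, Adam for the distribution phase under Experiment Setup); the sketch reproduces both statements as reported.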