DiPEx: Dispersing Prompt Expansion for Class-Agnostic Object Detection
Authors: Jia Syuen Lim, Zhuoxiao Chen, Zhi Chen, Mahsa Baktashmotlagh, Xin Yu, Zi Huang, Yadan Luo
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We demonstrate the effectiveness of DiPEx through extensive class-agnostic OD and OOD-OD experiments on MS-COCO and LVIS, surpassing other prompting methods by up to 20.1% in AR and achieving a 21.3% AP improvement over SAM. |
| Researcher Affiliation | Academia | The University of Queensland {jiasyuen.lim, zhuoxiao.chen, m.baktashmotlagh}@uq.edu.au {zhi.chen, xin.yu, helen.huang, y.luo}@uq.edu.au |
| Pseudocode | Yes | Algorithm 1 The Proposed DiPEx for Class-Agnostic Object Detection |
| Open Source Code | Yes | The code is available at https://github.com/jason-lim26/DiPEx. |
| Open Datasets | Yes | Datasets. We conduct our experiments using two detection datasets: 1). MS-COCO [32], a large-scale object detection and instance segmentation dataset, comprising approximately 115K training images and 5K validation images across 80 classes. 2). LVIS [15] includes 2.2 million high-quality instance segmentation masks covering 1,000 class labels, resulting in a long-tailed data distribution. It consists of around 100K training images and 19.8K validation images. |
| Dataset Splits | Yes | Datasets. We conduct our experiments using two detection datasets: 1). MS-COCO [32], a large-scale object detection and instance segmentation dataset, comprising approximately 115K training images and 5K validation images across 80 classes. 2). LVIS [15]... It consists of around 100K training images and 19.8K validation images. (A split-verification sketch follows the table.) |
| Hardware Specification | Yes | Our code is developed on the Open Grounding-DINO framework [63], and operates on a single NVIDIA RTX A6000 GPU with 48 GB of memory. |
| Software Dependencies | No | The paper states: “Our code is developed on the Open Grounding-DINO framework [63],” but does not provide specific version numbers for this framework or any other software dependencies like Python, PyTorch, etc. |
| Experiment Setup | Yes | For our experiments, we choose a batch size of 8 for training, and set hyperparameters γ = 0.1, τp = 0.1, τc = 0.1, θ = 15, K = 9, and L = 3, while adopting all remaining hyperparameters from the Open Grounding-DINO codebase. (A config sketch follows the table.) |
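
The MS-COCO split sizes quoted in the Dataset Splits row can be checked directly against the annotation files. Below is a minimal sketch using the pycocotools API, assuming a standard local COCO 2017 layout; the annotation paths are placeholders. LVIS exposes an analogous API via the `lvis` package.

```python
from pycocotools.coco import COCO

# Placeholder paths; point these at your local MS-COCO annotation files.
train = COCO("annotations/instances_train2017.json")
val = COCO("annotations/instances_val2017.json")

# The paper reports ~115K training and 5K validation images across 80 classes;
# raw train2017 holds ~118K images, so the 115K figure likely reflects filtering.
print(f"train images: {len(train.getImgIds()):,}")
print(f"val images:   {len(val.getImgIds()):,}")
print(f"categories:   {len(train.getCatIds())}")
```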
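
For reference, the hyperparameters quoted in the Experiment Setup row can be collected into a single config. This is a hypothetical sketch: the key names (gamma, tau_p, tau_c, theta, K, L) stand in for the paper's symbols and are not the actual option names in the DiPEx or Open Grounding-DINO codebases.

```python
# Hypothetical DiPEx training config; values are taken from the paper, but the
# key names are illustrative stand-ins, not the real codebase options.
dipex_cfg = {
    "batch_size": 8,   # training batch size
    "gamma": 0.1,      # γ
    "tau_p": 0.1,      # τ_p
    "tau_c": 0.1,      # τ_c
    "theta": 15,       # θ
    "K": 9,            # K
    "L": 3,            # L
}
# All remaining hyperparameters follow the Open Grounding-DINO defaults.
```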