Bridging the Gap between Object and Image-level Representations for Open-Vocabulary Detection
Authors: Hanoona Bangalath, Muhammad Maaz, Muhammad Uzair Khattak, Salman H. Khan, Fahad Shahbaz Khan
NeurIPS 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our extensive experiments demonstrate the improved OVD capability of the proposed approach. On COCO and LVIS benchmarks, our method achieves absolute gains of 8.2 and 5.0 AP on novel and rare classes over the current SOTA methods. |
| Researcher Affiliation | Academia | Mohamed bin Zayed University of AI, UAE; Australian National University, Australia; Linköping University, Sweden |
| Pseudocode | No | The paper contains mathematical equations and figures, but does not include any explicitly labeled 'Pseudocode' or 'Algorithm' blocks. |
| Open Source Code | Yes | Code: https://github.com/hanoonaR/object-centric-ovd |
| Open Datasets | Yes | We conduct our experiments on COCO [42] and LVIS v1.0 [43] under OVD setting. ... Table 1 summarizes all the datasets used in our work. |
| Dataset Splits | Yes | We use COCO-2017 dataset for training and validation. We follow the ZS splits proposed in [10], in which 48 categories are selected as base and 17 are selected as novel classes. |
| Hardware Specification | Yes | All of our models are trained using 8 A100 GPUs with an approximate training time of 9 and 6 hours for 1x schedule of COCO and LVIS respectively. |
| Software Dependencies | No | The paper mentions specific deep learning models (e.g., Mask R-CNN, CLIP ViT-B/32) but does not provide software dependencies with version numbers for reproducibility (e.g., Python, PyTorch, or CUDA versions). |
| Experiment Setup | Yes | In our experiments, we use SGD optimizer with a weight decay of 1e-4 and a momentum of 0.9. We train for 1x schedule with batch size of 16 and an initial learning rate of 0.02 which drops by a factor of 10 at the 8th and 11th epoch. We set temperature τ to 50. Our longer schedules experiments use 100-1280 LSJ [47]. We use α of 0.1 to weight L_pms. |
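The learning-rate recipe quoted in the last row (initial LR 0.02, dropped by 10x at epochs 8 and 11 over a 1x schedule) corresponds to a standard step-decay policy. A minimal plain-Python sketch of that schedule, assuming a 12-epoch 1x schedule (the function name and epoch count are illustrative, not from the paper):

```python
def lr_at_epoch(epoch, base_lr=0.02, milestones=(8, 11), gamma=0.1):
    """Step-decay learning rate as described in the experiment setup:
    starts at `base_lr` and is multiplied by `gamma` (0.1, i.e. a 10x
    drop) once at each milestone epoch (the 8th and 11th)."""
    drops = sum(1 for m in milestones if epoch >= m)
    return base_lr * gamma ** drops

# Illustrative 1x schedule of 12 epochs:
schedule = [lr_at_epoch(e) for e in range(12)]
```

In a PyTorch training loop this would typically be realized with `torch.optim.SGD(..., lr=0.02, momentum=0.9, weight_decay=1e-4)` plus `MultiStepLR(optimizer, milestones=[8, 11], gamma=0.1)`, though the paper does not state its exact framework configuration.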