Learning to Discover and Detect Objects
Authors: Vladimir Fomenko, Ismail Elezi, Deva Ramanan, Laura Leal-Taixé, Aljoša Ošep
NeurIPS 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our experiments conducted using COCO and LVIS datasets reveal that our method is significantly more effective than multi-stage pipelines that rely on traditional clustering algorithms. Furthermore, we demonstrate the generality of our approach by applying our method to a large-scale Visual Genome dataset, where our network successfully learns to detect various semantic classes without direct supervision. |
| Researcher Affiliation | Collaboration | Vladimir Fomenko (1), Ismail Elezi (2), Deva Ramanan (3), Laura Leal-Taixé (2), Aljoša Ošep (2,3) — (1) Microsoft Azure AI, (2) Technical University of Munich, (3) Carnegie Mellon University |
| Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | Our code is available at https://github.com/vlfom/RNCDL. |
| Open Datasets | Yes | We re-purpose COCO 2017 [43] and LVIS v1 [24] datasets for running ablations and comparisons with baselines and prior art. |
| Dataset Splits | Yes | We follow LVIS training and validation splits for our experiments, resulting in 100K training images and 20K validation images. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., exact GPU/CPU models, processor types with speeds, memory amounts, or detailed computer specifications) used for running its experiments. |
| Software Dependencies | No | The paper mentions Detectron2 [69] as the base for its model implementation but does not provide specific version numbers for software libraries or dependencies. |
| Experiment Setup | Yes | We train our network using the standard R-CNN loss [53], Lsup = LRPN + Lbox + Lcls, which consists of the RPN classification loss LRPN, the box regression loss Lbox, and the second-stage classification loss Lcls. We set the strength of the supervised loss to 0.5 in all experiments. The optimal number of proposals per image is 50. |
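
The loss composition described in the experiment-setup row can be sketched as follows. This is a minimal illustration, not the authors' code: the function and argument names are hypothetical, and the 0.5 factor reflects the paper's stated supervised-loss strength.

```python
def supervised_loss(l_rpn: float, l_box: float, l_cls: float,
                    weight: float = 0.5) -> float:
    """Combine the standard R-CNN loss terms, Lsup = LRPN + Lbox + Lcls,
    scaled by the supervised-loss strength (0.5 in the paper)."""
    return weight * (l_rpn + l_box + l_cls)


# Example: with per-term losses of 1.0, 2.0, and 3.0, the weighted sum is 3.0.
total = supervised_loss(1.0, 2.0, 3.0)
print(total)  # → 3.0
```

In a real training loop each term would be a tensor produced by the detector's heads (e.g. in a Detectron2-based implementation), and the weighted sum would be backpropagated jointly with any unsupervised objectives.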