Unsupervised Object Detection with Theoretical Guarantees

Authors: Marian Longa, João F. Henriques

NeurIPS 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We perform detailed analysis of how the error depends on each of these variables and perform synthetic experiments validating our theoretical predictions up to a precision of individual pixels. We also perform experiments on CLEVR-based data and show that, unlike current SOTA object detection methods (SAM, CutLER), our method's prediction errors always lie within our theoretical bounds.
Researcher Affiliation | Academia | Marian Longa, Visual Geometry Group, University of Oxford, mlonga@robots.ox.ac.uk; João F. Henriques, Visual Geometry Group, University of Oxford, joao@robots.ox.ac.uk
Pseudocode | No | The paper includes a network architecture diagram but no structured pseudocode or algorithm blocks.
Open Source Code | No | We will release the code upon publication.
Open Datasets | Yes | Our CLEVR experiments use data generated with the CLEVR [12] image generation script.
Dataset Splits | No | The paper states 'We divide the dataset into 4 quadrants and assign images from 3 quadrants to the training set and the remaining quadrant to the test set.' for synthetic experiments, and 'Our training and test sets consist of 150 and 50 images respectively' for CLEVR experiments, but does not explicitly mention a validation split.
Hardware Specification | No | We train each experiment on a single GPU for around 6 hours with <6GB memory on an internal cluster.
Software Dependencies | No | The paper describes network architectures and training parameters but does not specify software dependencies like libraries or frameworks with version numbers (e.g., PyTorch 1.x or TensorFlow 2.x).
Experiment Setup | Yes | We train each network for 500 epochs using the Adam optimiser with learning rate 10^-3 and batch size 128. We train each network until convergence using the Adam optimiser with learning rate 10^-2 and batch size 128... and with learning rate 10^-3 and batch size 150 for the experiment measuring position error as a function of object size.
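
The quoted training settings and the quadrant-based split can be written down as a small Python sketch, which may help anyone attempting a reproduction before the code is released. It is not the authors' code. The names TrainConfig, CONFIGS, quadrant and split_by_quadrant are hypothetical. The sketch assumes that a "quadrant" refers to the image quadrant containing an object's position, which the quoted text does not confirm. It also assumes the batch-size-150 setting belongs to the "until convergence" regime, and it does not guess which experiments the elided part of the quote refers to.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class TrainConfig:
    """Hypothetical container for the hyperparameters quoted in the table."""
    name: str                # illustrative label, not taken from the paper
    optimizer: str
    learning_rate: float
    batch_size: int
    epochs: Optional[int]    # None stands for "train until convergence"


# Numeric values are copied from the Experiment Setup row.
# Assumption: the batch-size-150 run is trained until convergence.
CONFIGS = [
    TrainConfig("fixed_epochs", "adam", 1e-3, 128, 500),
    TrainConfig("until_convergence", "adam", 1e-2, 128, None),
    TrainConfig("position_error_vs_object_size", "adam", 1e-3, 150, None),
]


def quadrant(x, y, width, height):
    """Return 0..3 for the image quadrant containing the point (x, y).

    Assumption: quadrants are defined by object position in the image.
    0 = top-left, 1 = top-right, 2 = bottom-left, 3 = bottom-right.
    """
    return (1 if x >= width / 2 else 0) + (2 if y >= height / 2 else 0)


def split_by_quadrant(samples, width, height, test_quadrant):
    """Put samples from three quadrants in train and the fourth in test."""
    train, test = [], []
    for sample in samples:
        q = quadrant(sample["x"], sample["y"], width, height)
        (test if q == test_quadrant else train).append(sample)
    return train, test
```

Holding out an entire quadrant means the test positions never appear during training, so the split probes generalisation to unseen object locations rather than interpolation between seen ones. No validation subset is carved out here, matching the table's note that none is mentioned.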