A Ranking-based, Balanced Loss Function Unifying Classification and Localisation in Object Detection

Authors: Kemal Oksuz, Baris Can Cam, Emre Akbas, Sinan Kalkan

NeurIPS 2020 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | 5 experiments. Dataset: We train all our models on COCO trainval35K set [15] (115K images), test on minival set (5k images) and compare with the state-of-the-art (SOTA) on test-dev set (20K images). Table 2: Ablation analysis on COCO minival. Table 6: Comparison with the SOTA detectors on COCO test-dev.
Researcher Affiliation | Academia | Kemal Oksuz, Baris Can Cam, Emre Akbas, Sinan Kalkan; Dept. of Computer Engineering, Middle East Technical University, Ankara, Turkey; {kemal.oksuz, can.cam, eakbas, skalkan}@metu.edu.tr
Pseudocode | Yes | Algorithm 1: Obtaining the gradients of a ranking-based function with error-driven update. (A hedged sketch of this style of update appears after the table below.)
Open Source Code | Yes | Code available at: https://github.com/kemaloksuz/aLRPLoss
Open Datasets | Yes | Dataset: We train all our models on COCO trainval35K set [15] (115K images)... Citation [15]: Lin TY, Maire M, Belongie S, Hays J, Perona P, Ramanan D, Dollár P, Zitnick CL (2014) Microsoft COCO: Common Objects in Context. In: The European Conference on Computer Vision (ECCV)
Dataset Splits | Yes | Dataset: We train all our models on COCO trainval35K set [15] (115K images), test on minival set (5k images) and compare with the state-of-the-art (SOTA) on test-dev set (20K images).
Hardware Specification | Yes | For training, we use 4 v100 GPUs.
Software Dependencies | No | The paper mentions using the 'mmdetection framework [6]' but does not specify its version number or other software dependencies with explicit version details (e.g., Python, PyTorch, CUDA versions).
Experiment Setup | Yes | Implementation Details: For training, we use 4 v100 GPUs. The batch size is 32 for training with 512 x 512 images (aLRP Loss500), whereas it is 16 for 800 x 800 images (aLRP Loss800). Following AP Loss, our models are trained for 100 epochs using stochastic gradient descent with a momentum factor of 0.9. We use a learning rate of 0.008 for aLRP Loss500 and 0.004 for aLRP Loss800, each decreased by a factor of 0.1 at epochs 60 and 80.
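
The schedule quoted in the "Experiment Setup" row can be restated compactly. The dictionaries below are only an illustrative summary of those reported numbers; the key names are hypothetical and do not come from the authors' released mmdetection configuration files.

```python
# Illustrative restatement of the reported training schedule
# (hypothetical keys, not the released configuration files).
alrp_loss_500 = dict(
    gpus=4, input_size=(512, 512), batch_size=32, epochs=100,
    optimizer=dict(type="SGD", lr=0.008, momentum=0.9),
    lr_decay=dict(at_epochs=[60, 80], factor=0.1),
)
alrp_loss_800 = dict(
    gpus=4, input_size=(800, 800), batch_size=16, epochs=100,
    optimizer=dict(type="SGD", lr=0.004, momentum=0.9),
    lr_decay=dict(at_epochs=[60, 80], factor=0.1),
)
```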
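
The "Pseudocode" row refers to the paper's Algorithm 1, which obtains gradients for a ranking-based loss via an error-driven update. The function below is a minimal, hypothetical PyTorch sketch of that general idea in its AP-Loss-style, classification-only form; the name `error_driven_ranking_grads`, the piecewise-linear step surrogate, and the smoothing width `delta` are assumptions, and the sketch omits the localisation component that aLRP Loss folds into the error term.

```python
import torch

@torch.no_grad()
def error_driven_ranking_grads(scores, labels, delta=1.0):
    """Illustrative error-driven update for a ranking-based classification loss.

    scores : (N,) classification logits for all anchors
    labels : (N,) 1 for positive anchors, 0 for negatives
    Returns (ranking_error, grad); grad has the same shape as scores.
    """
    pos, neg = labels == 1, labels == 0
    s_pos, s_neg = scores[pos], scores[neg]

    # Piecewise-linear surrogate of the unit step H(x), smoothed over [-delta, delta].
    def H(x):
        return torch.clamp(x / (2.0 * delta) + 0.5, min=0.0, max=1.0)

    # x_ij = s_j - s_i for every positive i (rows) and negative j (columns).
    x = s_neg[None, :] - s_pos[:, None]          # shape (P, M)
    fp = H(x).sum(dim=1)                         # soft count of negatives above each positive

    # rank(i) = 1 + positives ranked above i + negatives ranked above i.
    Hp = H(s_pos[None, :] - s_pos[:, None])
    Hp.fill_diagonal_(0.0)
    rank = 1.0 + Hp.sum(dim=1) + fp

    # Primary terms L_ij: share of positive i's error attributed to negative j.
    # The target under a perfect ranking is 0, so the error-driven gradient on
    # x_ij is (L_ij - 0) = L_ij, scattered back onto the scores below.
    L = H(x) / rank[:, None]                     # shape (P, M)
    num_pos = pos.sum().clamp(min=1).float()

    grad = torch.zeros_like(scores)
    grad[neg] = L.sum(dim=0) / num_pos           # push mis-ranked negatives down
    grad[pos] = -L.sum(dim=1) / num_pos          # push positives up

    ranking_error = (fp / rank).mean()           # average ranking error, for logging
    return ranking_error, grad
```

Because the ranking error itself is not differentiable, such a hand-constructed gradient would typically be injected with `scores.backward(grad)` rather than obtained from autograd.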