PolarNet: Learning to Optimize Polar Keypoints for Keypoint Based Object Detection

Authors: Xiongwei Wu, Doyen Sahoo, Steven C.H. Hoi

ICLR 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Finally in our experiments, we show that PolarNet, an anchor-free detector, outperforms the existing anchor-free detectors, and it is able to achieve highly competitive results on the COCO test-dev benchmark (47.8% and 50.3% AP under single-model single-scale and multi-scale testing). The code and the models are available at https://github.com/XiongweiWu/PolarNetV1
Researcher Affiliation | Collaboration | Xiongwei Wu (1), Doyen Sahoo (2), Steven C.H. Hoi (1, 2); (1) Singapore Management University, (2) Salesforce Research Asia
Pseudocode | No | The paper does not contain any explicitly labeled pseudocode or algorithm blocks; it describes the architecture and training steps in text and diagrams.
Open Source Code | Yes | The code and the models are available at https://github.com/XiongweiWu/PolarNetV1
Open Datasets | Yes | We conducted experiments on the MSCOCO 2017 dataset, which has 80 categories in three splits: train (115k images), val (5k images), and test-dev (20k images).
Dataset Splits | Yes | We conducted experiments on the MSCOCO 2017 dataset, which has 80 categories in three splits: train (115k images), val (5k images), and test-dev (20k images).
Hardware Specification | Yes | Table 3 shows the results on COCO test-dev under the single-model settings, and the inference time is measured on a Titan Xp.
Software Dependencies | No | The paper mentions software components such as ResNet, ResNeXt, FPN, SGD, and PyTorch (in the abstract), but does not provide specific version numbers for these or other ancillary software dependencies.
Experiment Setup | Yes | We train the model from weights pre-trained on the ImageNet classification task, and other parameters are initialized by the same method as RetinaNet (Lin et al., 2017b). The model is trained with SGD for 180k iterations with 16 images per mini-batch. The initial learning rate is set to 1e-2 and is divided by 10 at 120k and 160k iterations. We re-scale the input images to 800x1333 pixels before training. We use the same data augmentation strategy as presented in (Tian et al., 2019) when training the model, and for each image the top-100 predictions are produced.
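The learning-rate schedule in the setup (base LR 1e-2 over 180k SGD iterations, reduced by a factor of 10 at 120k and again at 160k) can be sketched as a small helper. This is an illustrative reconstruction, not the authors' code; the function name and defaults are assumptions.

```python
def learning_rate(iteration, base_lr=1e-2, milestones=(120_000, 160_000), gamma=0.1):
    """Step LR schedule as described in the experiment setup:
    start at base_lr and multiply by gamma at each milestone iteration.
    (Hypothetical helper mirroring the paper's stated schedule.)"""
    lr = base_lr
    for m in milestones:
        if iteration >= m:
            lr *= gamma
    return lr

# Spot checks (up to float rounding):
# iterations 0..119,999 use 1e-2; 120,000..159,999 use 1e-3;
# 160,000..180,000 use 1e-4.
```

In PyTorch the same behavior is typically obtained with `torch.optim.lr_scheduler.MultiStepLR(optimizer, milestones=[120_000, 160_000], gamma=0.1)` stepped per iteration rather than per epoch.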