Rethinking Rotated Object Detection with Gaussian Wasserstein Distance Loss

Authors: Xue Yang, Junchi Yan, Qi Ming, Wentao Wang, Xiaopeng Zhang, Qi Tian

ICML 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Experiments on five datasets using different detectors show the effectiveness of our approach, and codes are available at https://github.com/yangxue0827/RotationDetection."
Researcher Affiliation | Collaboration | (1) Department of Computer Science and Engineering, Shanghai Jiao Tong University; (2) MoE Key Lab of Artificial Intelligence, AI Institute, Shanghai Jiao Tong University; (3) Huawei Inc.; (4) School of Automation, Beijing Institute of Technology.
Pseudocode | No | The paper contains mathematical formulas and descriptions of the method, but no structured pseudocode or algorithm blocks are provided.
Open Source Code | Yes | "codes are available at https://github.com/yangxue0827/RotationDetection." and "Source code will be made public available."
Open Datasets | Yes | "DOTA (Xia et al., 2018) is comprised of 2,806 large aerial images from different sensors and platforms." and "UCAS-AOD (Zhu et al., 2015) contains 1,510 aerial images..." and "HRSC2016 (Liu et al., 2017) contains images from two scenarios..." and "ICDAR2015 (Karatzas et al., 2015) is commonly used for oriented scene text detection..." and "ICDAR 2017 MLT (Nayef et al., 2017) is a multi-lingual text dataset..."
Dataset Splits | Yes | "Half of the original images are randomly selected as the training set, 1/6 as the validation set, and 1/3 as the testing set. We divide the images into 600 × 600 subimages with an overlap of 150 pixels and scale it to 800 × 800. With all these processes, we obtain about 20,000 training and 7,000 validation patches." and "The training, validation and test set include 436, 181 and 444 images, respectively."
Hardware Specification | Yes | "We use Tensorflow (Abadi et al., 2016) for implementation on a server with Tesla V100 and 32G memory."
Software Dependencies | No | The paper mentions TensorFlow but does not specify its version number or any other software dependencies with version numbers.
Experiment Setup | Yes | "Weight decay and momentum are set 0.0001 and 0.9, respectively. We employ MomentumOptimizer over 8 GPUs with a total of 8 images per mini-batch (1 image per GPU). All the used datasets are trained by 20 epochs in total, and learning rate is reduced tenfold at 12 epochs and 16 epochs, respectively. The initial learning rates for RetinaNet is 5e-4."
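
The arithmetic quoted in the Dataset Splits and Experiment Setup rows can be illustrated with a minimal Python sketch. It is not the authors' code. The function names `tile_offsets` and `learning_rate` are hypothetical, and two behaviours are assumptions the quoted text does not settle: the last crop window is shifted back to end at the image border, and epochs are counted from 0 with each tenfold reduction applying from the milestone epoch onward.

```python
from typing import List, Sequence


def tile_offsets(length: int, tile: int = 600, overlap: int = 150) -> List[int]:
    """Start offsets of sliding crop windows along one image axis.

    Assumption: the final window is shifted back so that it ends at the
    image border; the source only states the tile size and the overlap.
    """
    if length <= tile:
        return [0]
    stride = tile - overlap  # 600 - 150 = 450 pixels between window starts
    offsets = list(range(0, length - tile + 1, stride))
    if offsets[-1] + tile < length:
        offsets.append(length - tile)  # cover the remaining border strip
    return offsets


def learning_rate(epoch: int, base_lr: float = 5e-4,
                  milestones: Sequence[int] = (12, 16),
                  factor: float = 0.1) -> float:
    """Step schedule: multiply by `factor` once per milestone reached.

    Assumption: epochs are counted from 0 and a drop applies from the
    milestone epoch onward.
    """
    drops = sum(1 for m in milestones if epoch >= m)
    return base_lr * factor ** drops


if __name__ == "__main__":
    # Window starts for a hypothetical 1600-pixel-wide image.
    print(tile_offsets(1600))
    # Learning rate at the start, at both milestones, and in the last epoch.
    for epoch in (0, 12, 16, 19):
        print(epoch, learning_rate(epoch))
```

With the quoted defaults, consecutive windows start 450 pixels apart, and the 20-epoch schedule passes through three learning-rate levels, each one tenth of the previous.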