Rethinking Rotated Object Detection with Gaussian Wasserstein Distance Loss
Authors: Xue Yang, Junchi Yan, Qi Ming, Wentao Wang, Xiaopeng Zhang, Qi Tian
ICML 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments on five datasets using different detectors show the effectiveness of our approach, and codes are available at https://github.com/yangxue0827/RotationDetection. |
| Researcher Affiliation | Collaboration | (1) Department of Computer Science and Engineering, Shanghai Jiao Tong University; (2) MoE Key Lab of Artificial Intelligence, AI Institute, Shanghai Jiao Tong University; (3) Huawei Inc.; (4) School of Automation, Beijing Institute of Technology. |
| Pseudocode | No | The paper contains mathematical formulas and descriptions of the method, but no structured pseudocode or algorithm blocks are provided. |
| Open Source Code | Yes | codes are available at https://github.com/yangxue0827/RotationDetection. and Source code will be made public available. |
| Open Datasets | Yes | DOTA (Xia et al., 2018) is comprised of 2,806 large aerial images from different sensors and platforms. and UCAS-AOD (Zhu et al., 2015) contains 1,510 aerial images... and HRSC2016 (Liu et al., 2017) contains images from two scenarios... and ICDAR2015 (Karatzas et al., 2015) is commonly used for oriented scene text detection... and ICDAR 2017 MLT (Nayef et al., 2017) is a multi-lingual text dataset... |
| Dataset Splits | Yes | Half of the original images are randomly selected as the training set, 1/6 as the validation set, and 1/3 as the testing set. We divide the images into 600 × 600 subimages with an overlap of 150 pixels and scale it to 800 × 800. With all these processes, we obtain about 20,000 training and 7,000 validation patches. and The training, validation and test set include 436, 181 and 444 images, respectively. |
| Hardware Specification | Yes | We use Tensorflow (Abadi et al., 2016) for implementation on a server with Tesla V100 and 32G memory. |
| Software Dependencies | No | The paper mentions TensorFlow but does not specify its version number or any other software dependencies with version numbers. |
| Experiment Setup | Yes | Weight decay and momentum are set 0.0001 and 0.9, respectively. We employ MomentumOptimizer over 8 GPUs with a total of 8 images per mini-batch (1 image per GPU). All the used datasets are trained by 20 epochs in total, and learning rate is reduced tenfold at 12 epochs and 16 epochs, respectively. The initial learning rates for RetinaNet is 5e-4. |
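
The learning-rate schedule quoted in the Experiment Setup row can be sketched as a plain step function. This is an illustration only, not the authors' code: the function name `learning_rate` and its parameters are hypothetical, and it assumes epochs are counted from 0 and that each tenfold reduction takes effect at the start of epochs 12 and 16, which the quoted text does not specify.

```python
def learning_rate(epoch, base_lr=5e-4, milestones=(12, 16), factor=0.1):
    """Step schedule: scale base_lr by `factor` once per milestone reached.

    Assumptions (not stated in the paper): epochs are 0-indexed and a
    reduction applies from the first epoch at or after each milestone.
    """
    drops = sum(1 for m in milestones if epoch >= m)
    return base_lr * factor ** drops


# Learning rate for each of the 20 training epochs (RetinaNet setting).
schedule = [learning_rate(e) for e in range(20)]
```

Under these assumptions the rate is 5e-4 for epochs 0 to 11, roughly 5e-5 for epochs 12 to 15, and roughly 5e-6 for epochs 16 to 19.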