The KFIoU Loss for Rotated Object Detection

Authors: Xue Yang, Yue Zhou, Gefan Zhang, Jirui Yang, Wentao Wang, Junchi Yan, Xiaopeng Zhang, Qi Tian

ICLR 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Extensive results on various datasets with different base detectors show the effectiveness of our approach."
Researcher Affiliation | Collaboration | 1) MoE Key Lab of Artificial Intelligence, Shanghai Jiao Tong University; 2) COWAROBOT Co. Ltd.; 3) University of Chinese Academy of Sciences; 4) Huawei Cloud
Pseudocode | No | The paper describes its method textually and with mathematical equations, but does not include any labeled pseudocode or algorithm blocks. (A hedged sketch of the core KFIoU computation, reconstructed from the paper's equations, follows the table.)
Open Source Code | Yes | Jittor code: https://github.com/Jittor/JDet; PyTorch code: https://github.com/open-mmlab/mmrotate; TensorFlow code: https://github.com/yangxue0827/RotationDetection
Open Datasets | Yes | Aerial image datasets: DOTA (Xia et al., 2018), "one of the largest datasets for oriented object detection in aerial images", and HRSC2016 (Liu et al., 2017). Scene text datasets: ICDAR2015 (Karatzas et al., 2015) and MSRA-TD500 (Yao et al., 2012). Face dataset: FDDB (Jain & Learned-Miller, 2010). Autonomous driving dataset: KITTI (Geiger et al., 2012).
Dataset Splits | Yes | HRSC2016 (Liu et al., 2017): "The training, validation and test set include 436, 181 and 444 images." FDDB (Jain & Learned-Miller, 2010): "We manually use 70% as the training set and the rest as the validation set." KITTI (Geiger et al., 2012): "The training samples are generally divided into the train split (3,712 samples) and the val split (3,769 samples)." (A hypothetical split script for the FDDB case follows the table.)
Hardware Specification | Yes | "Experiments are performed on a server with GeForce RTX 3090 Ti and 24G memory."
Software Dependencies | No | The paper mentions deep learning frameworks (TensorFlow, PyTorch, and Jittor) but does not provide version numbers for these or any other software dependencies required for reproduction.
Experiment Setup | Yes | "Weight decay and momentum are set to 0.0001 and 0.9, respectively. We employ Momentum Optimizer over 4 GPUs with a total of 4 images per mini-batch (1 image per GPU). All the used datasets are trained for 20 epochs, and the learning rate is reduced tenfold at epochs 12 and 16. The initial learning rate is 1e-3." ... "For ResNet (He et al., 2016), the SGD optimizer is adopted with an initial learning rate of 0.0025. The momentum and weight decay are 0.9 and 0.0001, respectively. For Swin Transformer (Liu et al., 2021), the AdamW (Kingma & Ba, 2014; Loshchilov & Hutter, 2018) optimizer is adopted with an initial learning rate of 0.0001. The weight decay is 0.05. In addition, we adopt learning rate warmup for 500 iterations, and the learning rate is divided by 10 at each decay step." (An optimizer-configuration sketch follows the table.)
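
Since the paper provides equations rather than pseudocode, the following is a minimal PyTorch sketch of the Gaussian-product computation at the heart of KFIoU, reconstructed from the paper's description. The function names, the "1 - 3*KFIoU" loss transform, and the omission of the separate center-point term are our assumptions, not the authors' released code.

    import torch

    def box_to_gaussian(boxes):
        # Convert rotated boxes (x, y, w, h, theta) into 2-D Gaussians (mu, Sigma),
        # with Sigma = R diag(w^2/4, h^2/4) R^T, following the paper's Gaussian modeling.
        x, y, w, h, theta = boxes.unbind(dim=-1)
        cos, sin = torch.cos(theta), torch.sin(theta)
        R = torch.stack([cos, -sin, sin, cos], dim=-1).reshape(*theta.shape, 2, 2)
        S = torch.diag_embed(torch.stack([w * w / 4.0, h * h / 4.0], dim=-1))
        mu = torch.stack([x, y], dim=-1)
        return mu, R @ S @ R.transpose(-1, -2)

    def kfiou_loss(pred, target, eps=1e-6):
        # Overlap Gaussian via the Kalman-filter product:
        # Sigma_alpha = Sigma_p - K Sigma_p, with gain K = Sigma_p (Sigma_p + Sigma_t)^-1.
        _, sigma_p = box_to_gaussian(pred)
        _, sigma_t = box_to_gaussian(target)
        K = sigma_p @ torch.inverse(sigma_p + sigma_t)
        sigma_a = sigma_p - K @ sigma_p
        # "Volume" of each Gaussian is proportional to sqrt(det(Sigma)); the
        # shared constant factor cancels in the ratio below.
        v_p = torch.sqrt(torch.det(sigma_p).clamp(min=eps))
        v_t = torch.sqrt(torch.det(sigma_t).clamp(min=eps))
        v_a = torch.sqrt(torch.det(sigma_a).clamp(min=eps))
        kfiou = v_a / (v_p + v_t - v_a + eps)
        # KFIoU saturates at 1/3 (identical boxes give exactly 1/3), so rescale
        # by 3; the paper additionally penalizes center distance with a separate
        # term, which this sketch omits.
        return (1.0 - 3.0 * kfiou).mean()

A quick sanity check: for identical boxes, sigma_a = Sigma/2, so v_a = v/2 and kfiou = (v/2) / (2v - v/2) = 1/3, driving the loss to zero.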
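
For the FDDB split, the paper states only that 70% of the data is used for training; a hypothetical reconstruction is below, where the random shuffle and the seed are our assumptions since neither is specified.

    import random

    def split_fddb(image_ids, train_frac=0.7, seed=0):
        # Hypothetical 70/30 split: the paper says 70% of FDDB is used for
        # training, but does not state the seed or whether sampling is random.
        ids = list(image_ids)
        random.Random(seed).shuffle(ids)
        cut = int(len(ids) * train_frac)
        return ids[:cut], ids[cut:]

    # Usage: train_ids, val_ids = split_fddb(all_fddb_image_ids)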
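
The quoted experiment setup maps naturally onto standard PyTorch optimizer and scheduler objects. The sketch below mirrors the stated hyperparameters; the stand-in model, the warmup start factor, and the choice of scheduler classes are our assumptions, not the configuration from the released repositories.

    import torch

    # Stand-in module; the actual detector comes from the released repos.
    model = torch.nn.Linear(16, 16)

    # ResNet backbone recipe: SGD, lr=0.0025, momentum=0.9, weight decay=1e-4.
    optimizer = torch.optim.SGD(model.parameters(), lr=0.0025,
                                momentum=0.9, weight_decay=1e-4)
    # Swin Transformer recipe would instead be:
    # optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4, weight_decay=0.05)

    # Linear warmup over the first 500 iterations (stepped per iteration)...
    warmup = torch.optim.lr_scheduler.LinearLR(
        optimizer, start_factor=1.0 / 3, total_iters=500)
    # ...then a tenfold reduction at epochs 12 and 16 (stepped per epoch),
    # over 20 training epochs in total.
    decay = torch.optim.lr_scheduler.MultiStepLR(
        optimizer, milestones=[12, 16], gamma=0.1)

Note that the two schedulers operate on different clocks: warmup.step() would be called once per iteration during the first 500 iterations, and decay.step() once per epoch.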