Loss Function Discovery for Object Detection via Convergence-Simulation Driven Search
Authors: Peidong Liu, Gengwei Zhang, Bochao Wang, Hang Xu, Xiaodan Liang, Yong Jiang, Zhenguo Li
ICLR 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We conduct extensive evaluations of loss function search on popular detectors and validate the good generalization capability of searched losses across diverse architectures and various datasets. Our experiments show that the best-discovered loss function combinations outperform default combinations (Cross-entropy/Focal loss for classification and L1 loss for regression) by 1.1% and 0.8% in terms of mAP for two-stage and one-stage detectors on COCO respectively. |
| Researcher Affiliation | Collaboration | 1 Tsinghua Shenzhen International Graduate School, Tsinghua University; 2 Sun Yat-Sen University; 3 Huawei Noah's Ark Lab |
| Pseudocode | Yes | Algorithm 1 CSE-Autoloss algorithm. |
| Open Source Code | Yes | Our searched losses are available at https://github.com/PerdonLiu/CSE-Autoloss. |
| Open Datasets | Yes | Datasets We conduct loss search on the large-scale object detection dataset COCO (Lin et al., 2014) and further evaluate the best-searched loss combinations on datasets with different distributions and domains, i.e., PASCAL VOC (VOC) (Everingham et al., 2015) and Berkeley Deep Drive (BDD) (Yu et al., 2020). |
| Dataset Splits | Yes | COCO is a common dataset with 80 object categories for object detection, containing 118K images for training and 5K minival for validation. In the search experiment, we randomly sample 10k images from the training set for validation purposes. VOC contains 20 object categories. We use the union of VOC 2007 trainval and VOC 2012 trainval for training and VOC 2007 test for validation and report mAP using IoU at 0.5. BDD is an autonomous driving dataset with 10 object classes, in which 70k images are for training and 10k images are for validation. |
| Hardware Specification | No | The paper mentions using '4 GPUs with 4 images/GPU and 8 GPUs with 2 images/GPU' but does not specify the model or type of GPUs (e.g., NVIDIA A100, Tesla V100), or any CPU details. |
| Software Dependencies | No | The paper states that their code is 'based on MMDetection (Chen et al., 2019a) and DEAP (Fortin et al., 2012)', but it does not provide specific version numbers for these software components. |
| Experiment Setup | Yes | We apply Faster R-CNN and FCOS as the representative detectors for two-stage and one-stage, respectively, in the loss search experiments on COCO for object detection. We apply ResNet-50 (He et al., 2016) and Feature Pyramid Network (Lin et al., 2017a) as the feature extractor. For FCOS, we employ common tricks such as normalization on bounding box, centerness on regression, and center sampling. Besides that, we replace the centerness branch with the IoU scores as the target instead of the original design for FCOS and ATSS to better utilize the IoU information, which has a slight AP improvement. Note that loss weights are set to the MMDetection defaults (Chen et al., 2019a), but the regression weight for ATSS is set to 1 for the searched loss combinations. To be consistent with the authors' implementation, we use 4 GPUs with 4 images/GPU and 8 GPUs with 2 images/GPU for FCOS and Faster R-CNN. Concerning the proxy task for loss function evaluation, we perform training on the whole COCO benchmark for only one epoch to trade off performance and efficiency. |
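
The default loss combinations that the searched losses are compared against (Focal loss for classification, L1 loss for box regression) can be sketched as follows. This is an illustrative pure-Python sketch of the standard formulas, not the paper's implementation; the hyperparameter values `alpha=0.25` and `gamma=2.0` are the common defaults, assumed here.

```python
import math

def focal_loss(p, y, alpha=0.25, gamma=2.0):
    """Focal loss for one binary classification score.
    p: predicted probability of the positive class, y: label in {0, 1}.
    FL(p_t) = -alpha_t * (1 - p_t)^gamma * log(p_t)."""
    p_t = p if y == 1 else 1.0 - p
    alpha_t = alpha if y == 1 else 1.0 - alpha
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)

def l1_loss(pred_box, target_box):
    """L1 regression loss summed over the 4 box coordinates."""
    return sum(abs(p - t) for p, t in zip(pred_box, target_box))
```

Easy, well-classified examples (p_t close to 1) are down-weighted by the `(1 - p_t)^gamma` factor, which is what lets Focal loss handle the extreme foreground/background imbalance in one-stage detectors.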
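
The pseudocode row above refers to Algorithm 1 (the CSE-Autoloss algorithm). As a rough orientation only, the overall shape of a convergence-simulation-filtered evolutionary search can be sketched as below; all function names are hypothetical placeholders, and the actual selection, mutation, and convergence-simulation operators are those defined in the paper, not shown here.

```python
import random

def evolve(population, fitness, crossover, mutate, passes_convergence_check,
           generations=10, seed=0):
    """Illustrative evolutionary-search loop (not the paper's Algorithm 1).
    Candidate loss functions that fail a cheap convergence-simulation check
    are discarded before the expensive proxy-task evaluation (in the paper,
    one epoch of training on COCO)."""
    rng = random.Random(seed)
    size = len(population)
    for _ in range(generations):
        parents = rng.sample(population, 2)
        child = mutate(crossover(parents[0], parents[1]))
        if not passes_convergence_check(child):
            continue  # skip the costly proxy evaluation for this candidate
        population.append(child)
        # survivor selection: keep the top-`size` candidates by proxy fitness
        population.sort(key=fitness, reverse=True)
        population = population[:size]
    return population
```

The key efficiency idea this mirrors is the ordering: the cheap convergence check runs before any training, so most poorly-behaved candidate losses never reach the one-epoch proxy task.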