DamoFD: Digging into Backbone Design on Face Detection
Authors: Yang Liu, Jiankang Deng, Fei Wang, Lei Shang, Xuansong Xie, Baigui Sun
ICLR 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We conduct comprehensive experiments on the challenging Wider Face benchmark dataset and achieve dominant performance across a wide range of compute regimes. In particular, compared to the tiniest face detector SCRFD-0.5GF, our method is +2.5% better in Average Precision (AP) score when using the same amount of FLOPs. |
| Researcher Affiliation | Collaboration | Yang Liu¹, Jiankang Deng², Fei Wang¹, Lei Shang¹, Xuansong Xie¹, Baigui Sun¹* (¹Alibaba Group, ²Imperial College London) |
| Pseudocode | Yes | Algorithm 1 Evolutionary Architecture Search (see the sketch after the table) |
| Open Source Code | Yes | The code is available at https://github.com/ly19965/EasyFace/tree/master/face_project/face_detection/DamoFD. |
| Open Datasets | Yes | In this paper, all experiments are conducted on the authoritative and challenging Wider Face Yang et al. (2016) dataset. |
| Dataset Splits | Yes | In each event, images are randomly separated into training (50%), validation (10%), and test (40%) sets. |
| Hardware Specification | Yes | We adopt the SGD optimizer (momentum 0.9, weight decay 5e-4) with a batch size of 8 × 4 and train on four Tesla V100s. |
| Software Dependencies | No | The paper mentions various components and techniques like “SGD optimizer”, “Generalised Focal Loss and DIoU Loss”, and “Group Normalization”, but does not specify software names with version numbers (e.g., PyTorch 1.x, TensorFlow 2.x). |
| Experiment Setup | Yes | The population size and iteration count in Algorithm 1 are set to 256 and 96000, respectively. The convolution kernel size is searched from the set {3, 5, 7}. For the anchor setting, we tile anchors of {16, 32}, {64, 128}, {256, 512} on the feature maps with strides 8, 16, and 32, respectively. The optimization objectives of the classification and localization branches are Generalised Focal Loss and DIoU Loss, respectively. For the optimization details, we adopt the SGD optimizer (momentum 0.9, weight decay 5e-4) with a batch size of 8 × 4 and train on four Tesla V100s. The initial learning rate is set to 1e-5, linearly warming up to 1e-2 within the first 3 epochs. |
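
The "Pseudocode" row above points to Algorithm 1 (Evolutionary Architecture Search). As a reading aid, the following is a minimal Python sketch of such an evolutionary search loop, using only the hyper-parameters quoted in this table (population size 256, 96000 iterations, kernel sizes searched from {3, 5, 7}); the architecture encoding, the tournament/replacement scheme, and the `proxy_score` ranking function are illustrative placeholders, not the paper's exact procedure or scoring metric.

```python
# Minimal evolutionary architecture search sketch in the spirit of Algorithm 1.
# Only the numeric hyper-parameters come from the table above; everything else
# (encoding, selection, scoring) is an assumed, simplified stand-in.
import random

KERNEL_CHOICES = [3, 5, 7]   # kernel sizes searched, as quoted in the table
NUM_BLOCKS = 10              # assumed depth of the searched backbone encoding
POPULATION_SIZE = 256        # population size from the table
NUM_ITERATIONS = 96000       # iteration count from the table


def random_architecture():
    """Sample a random backbone encoding (one kernel size per block)."""
    return [random.choice(KERNEL_CHOICES) for _ in range(NUM_BLOCKS)]


def mutate(arch):
    """Perturb one randomly chosen block's kernel size."""
    child = list(arch)
    idx = random.randrange(len(child))
    child[idx] = random.choice(KERNEL_CHOICES)
    return child


def proxy_score(arch):
    """Placeholder for a training-free ranking score of an architecture."""
    return random.random()  # replace with the paper's actual scoring function


def evolutionary_search():
    population = [random_architecture() for _ in range(POPULATION_SIZE)]
    scores = [proxy_score(a) for a in population]
    for _ in range(NUM_ITERATIONS):
        # Pick a parent via a small tournament, biased toward higher scores.
        parent_idx = max(random.sample(range(POPULATION_SIZE), k=16),
                         key=lambda i: scores[i])
        child = mutate(population[parent_idx])
        child_score = proxy_score(child)
        # Replace the current worst individual if the child scores higher.
        worst_idx = min(range(POPULATION_SIZE), key=lambda i: scores[i])
        if child_score > scores[worst_idx]:
            population[worst_idx], scores[worst_idx] = child, child_score
    best_idx = max(range(POPULATION_SIZE), key=lambda i: scores[i])
    return population[best_idx]


if __name__ == "__main__":
    print(evolutionary_search())
```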
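
The "Experiment Setup" row can also be condensed into a configuration sketch. The dictionary below only restates the quoted values (SGD with momentum 0.9 and weight decay 5e-4, batch size 8 × 4 on four Tesla V100s, linear warm-up from 1e-5 to 1e-2 over the first 3 epochs, anchors {16, 32}/{64, 128}/{256, 512} on strides 8/16/32, Generalised Focal Loss and DIoU Loss); the key names follow common mmdetection-style conventions and are assumptions, not the authors' actual configuration file.

```python
# Hedged summary of the training setup quoted in the "Experiment Setup" row.
# Numeric values are from the paper text above; key names are illustrative.
train_config = dict(
    optimizer=dict(type="SGD", lr=1e-2, momentum=0.9, weight_decay=5e-4),
    lr_config=dict(
        warmup="linear",
        warmup_start_lr=1e-5,   # initial learning rate
        warmup_epochs=3,        # linear warm-up to 1e-2 within 3 epochs
    ),
    data=dict(samples_per_gpu=8),  # batch size 8 x 4 on four Tesla V100 GPUs
    anchor_generator=dict(
        # anchors of {16, 32}, {64, 128}, {256, 512} on strides 8, 16, 32
        strides=[8, 16, 32],
        base_sizes=[[16, 32], [64, 128], [256, 512]],
    ),
    loss_cls=dict(type="GeneralisedFocalLoss"),  # classification branch
    loss_bbox=dict(type="DIoULoss"),             # localization branch
)
```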