Dual Decoupling Training for Semi-supervised Object Detection with Noise-Bypass Head

Authors: Shida Zheng, Chenshu Chen, Xiaowei Cai, Tingqun Ye, Wenming Tan

AAAI 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our method DDT is benchmarked on two popular detection datasets. One is PASCAL VOC, which includes VOC07 and VOC12 datasets. ... On MS-COCO benchmark, our method also achieves about 1.0 mAP improvements averaging across protocols compared with the prior state-of-the-art.
Researcher Affiliation | Industry | Hikvision Research Institute; {zhengshida, chenchenshu, caixiaowei6, yetingqun, tanwenming}@hikvision.com
Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks.
Open Source Code | No | No explicit statement about the availability of open-source code, and no link to a code repository, was found.
Open Datasets | Yes | Our method DDT is benchmarked on two popular detection datasets. One is PASCAL VOC, which includes VOC07 and VOC12 datasets. ... MS-COCO (Lin et al. 2014)
Dataset Splits | Yes | In VOC07, we treat the trainval set as labeled data (5,011 images) and evaluate performance on the test set. Data from VOC12 trainval (11,540 images) and the subset of MS-COCO with the same classes as VOC (about 95k images) are used as extra unlabeled data. For MS-COCO, we randomly sample 1%/2%/5%/10% data from MS-COCO train2017 as the labeled data, with the rest as unlabeled data. ... Table 2: The mAP at IoU=0.5:0.95 on MS-COCO val2017.
Hardware Specification | Yes | We adopt 8 NVIDIA Tesla V100 GPUs for all experiments.
Software Dependencies | No | The paper mentions software components such as Faster R-CNN, ResNet-50, SGD, and (implicitly) PyTorch, but does not provide version numbers for any software dependencies.
Experiment Setup | Yes | The classification loss is cross-entropy loss for the RPN and focal loss for the ROI heads. DDT introduces four hyper-parameters to decouple the clean and noisy data. We set τ_cl = 0.4, τ_ch = 0.6, τ_bl = 0.6 and τ_bh = 0.8 unless otherwise specified. The optimizer we use is SGD with a momentum of 0.9. The size of a mini-batch is 32, with 16 labeled and 16 unlabeled images. ... the learning rate keeps constant during semi-supervised training, with 0.04 for VOC and 0.02 for MS-COCO. The EMA ratio is set as α = 1e-4.
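The experiment-setup values quoted above can be collected into a minimal config sketch. This is an illustrative Python snippet, not code from the paper or any released implementation: all key and function names (tau_cl, ema_alpha, teacher_ema_update, ...) are hypothetical labels of my own, and the EMA update rule is the standard teacher-student form assumed from the stated ratio, not quoted from the paper.

```python
# Hypothetical config collecting the DDT hyper-parameters reported in the paper.
# Key names are illustrative labels, not identifiers from the authors' code.
DDT_CONFIG = {
    # Four thresholds decoupling clean vs. noisy pseudo-labels
    "tau_cl": 0.4,   # low classification-confidence threshold
    "tau_ch": 0.6,   # high classification-confidence threshold
    "tau_bl": 0.6,   # low box (localization) threshold
    "tau_bh": 0.8,   # high box (localization) threshold
    # Optimization settings
    "optimizer": "SGD",
    "momentum": 0.9,
    "batch_size": 32,            # 16 labeled + 16 unlabeled images
    "labeled_per_batch": 16,
    "unlabeled_per_batch": 16,
    "lr": {"VOC": 0.04, "MS-COCO": 0.02},  # constant during semi-supervised training
    "ema_alpha": 1e-4,           # EMA ratio for the teacher update
}

def teacher_ema_update(teacher_w, student_w, alpha=DDT_CONFIG["ema_alpha"]):
    """Standard teacher-student EMA update (assumed form; the paper only
    states the EMA ratio alpha = 1e-4): t <- (1 - alpha) * t + alpha * s."""
    return [(1.0 - alpha) * t + alpha * s for t, s in zip(teacher_w, student_w)]
```

A sanity check of the assumed update rule: with alpha = 1e-4, a teacher weight of 1.0 and a student weight of 0.0 blend to 0.9999, i.e. the teacher moves only slowly toward the student.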