Semi-supervised Object Detection with Adaptive Class-Rebalancing Self-Training

Authors: Fangyuan Zhang, Tianxiang Pan, Bin Wang (pp. 3252-3261)

AAAI 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our method achieves competitive performance on MS-COCO and VOC benchmarks. When using only 1% labeled data of MS-COCO, our method achieves 17.02 mAP improvement over the supervised method and 5.32 mAP gains compared with state-of-the-arts. We evaluate our method on three SSOD benchmarks from MS-COCO (Lin et al. 2014) and PASCAL VOC (Everingham et al. 2010).
Researcher Affiliation | Academia | Fangyuan Zhang (1,2), Tianxiang Pan (1,2), Bin Wang (1,2)*. 1: School of Software, Tsinghua University; 2: Beijing National Research Center for Information Science and Technology. zhangfy19@mails.tsinghua.edu.cn, ptx9363@gmail.com, wangbins@tsinghua.edu.cn
Pseudocode | No | The paper describes methods and equations but does not include any explicitly labeled pseudocode or algorithm blocks.
Open Source Code | No | The paper states it builds its framework upon Detectron2 (Wu et al. 2019) and cites it, but there is no explicit statement about the authors releasing their specific implementation code or a link to their repository.
Open Datasets | Yes | We evaluate our method on three SSOD benchmarks from MS-COCO (Lin et al. 2014) and PASCAL VOC (Everingham et al. 2010).
Dataset Splits | Yes | We evaluate our method on three SSOD benchmarks from MS-COCO (Lin et al. 2014) and PASCAL VOC (Everingham et al. 2010). (1) COCO-standard: We sample 0.5/1/2/5/10% of the COCO2017-train as the labeled dataset and take the remaining data as the unlabeled dataset. (2) COCO-additional: We use the COCO2017-train as the labeled dataset and the additional COCO2017-unlabeled as the unlabeled dataset. (3) VOC: We use the VOC07-trainval as the labeled dataset and the VOC12-trainval as the unlabeled dataset. We evaluate the model on the COCO2017-val for (1)(2) and VOC07-test for (3).
Hardware Specification | No | The paper does not provide specific details regarding the hardware (e.g., GPU or CPU models) used for running the experiments.
Software Dependencies | No | The paper mentions building the framework upon Detectron2 and using a ResNet50 backbone, but it does not provide specific version numbers for these or other software dependencies.
Experiment Setup | Yes | For fair comparisons, we follow previous methods (Sohn et al. 2020; Liu et al. 2021) to use Faster-RCNN with FPN and ResNet50 and build our framework upon Detectron2 (Wu et al. 2019). Following (Liu et al. 2021), the batch sizes of labeled and unlabeled images are both 32. We use the SGD optimizer with learning rate = 0.01 and momentum = 0.9. We set λema = 0.9996, τcls = 0.7, λunsup = 4. For specific parameters in our work, we set β = 0.6 and τml = 0.2. The pre-training takes 3000/5000/5000/5000/10000 steps and the total training takes 180000 steps for 0.5/1/2/5/10% COCO-standard. For VOC, the pre-training takes 5000 steps and the total training takes 72000 steps. We apply color jittering, Gaussian blur, and CutOut for strong augmentations, and random resize, flip, and crop for weak augmentations. The widely used mAP (AP50:95) serves as the metric for comparisons. For SSMLL, the batch sizes of labeled and unlabeled images are both 64. The pre-training takes 2k/2k/6k steps and the total training takes 18k/36k/96k steps for VOC/COCO-standard/COCO-additional, where we use the Adam optimizer with lr = 1e-5. Data augmentations are the same as for SSOD, but images are resized to 576×576.
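The COCO-standard protocol quoted under Dataset Splits (sample 0.5/1/2/5/10% of COCO2017-train as labeled, keep the rest unlabeled) can be sketched as a simple image-ID partition. This is a minimal illustration only; the function name `split_coco_standard` and the seeded shuffle are assumptions, not the authors' (unreleased) implementation.

```python
import random

def split_coco_standard(image_ids, labeled_fraction, seed=0):
    """Partition image IDs into labeled/unlabeled pools.

    labeled_fraction: e.g. 0.005, 0.01, 0.02, 0.05, 0.10 for the
    0.5/1/2/5/10% COCO-standard protocols described in the paper.
    """
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    ids = list(image_ids)
    rng.shuffle(ids)
    n_labeled = max(1, int(round(len(ids) * labeled_fraction)))
    return ids[:n_labeled], ids[n_labeled:]

# Example: a 1% labeled split over 1000 dummy image IDs.
labeled, unlabeled = split_coco_standard(range(1000), 0.01)
```

Every image lands in exactly one pool, matching the protocol's "take the remaining data as the unlabeled dataset" wording.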
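For reference, the SSOD hyperparameters quoted in the Experiment Setup row can be collected into one place. The dictionary below only transcribes the values from the quoted text; the key names are illustrative and do not come from the authors' Detectron2-based config.

```python
# SSOD hyperparameters transcribed from the paper's experiment setup.
SSOD_HPARAMS = {
    "labeled_batch_size": 32,
    "unlabeled_batch_size": 32,
    "optimizer": "SGD",
    "learning_rate": 0.01,
    "momentum": 0.9,
    "lambda_ema": 0.9996,     # EMA decay for the teacher model
    "tau_cls": 0.7,           # classification pseudo-label threshold
    "lambda_unsup": 4,        # unsupervised loss weight
    "beta": 0.6,              # method-specific parameter
    "tau_ml": 0.2,            # method-specific threshold
    # Pre-training steps for 0.5/1/2/5/10% COCO-standard, respectively.
    "pretrain_steps_coco_standard": [3000, 5000, 5000, 5000, 10000],
    "total_steps_coco_standard": 180_000,
    "pretrain_steps_voc": 5000,
    "total_steps_voc": 72_000,
}
```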