Point-Teaching: Weakly Semi-supervised Object Detection with Point Annotations

Authors: Yongtao Ge, Qiang Zhou, Xinlong Wang, Chunhua Shen, Zhibin Wang, Hao Li

AAAI 2023 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experiments demonstrate the effectiveness of our method on a few datasets and various data regimes. In particular, Point-Teaching outperforms the previous best method Group R-CNN by 3.1 AP with 5% fully labeled data and 2.3 AP with 30% fully labeled data on MS COCO dataset. [...] Extensive experiments are conducted on MS-COCO and VOC datasets to verify the effectiveness of our method.
Researcher Affiliation | Collaboration | Yongtao Ge¹, Qiang Zhou², Xinlong Wang³, Chunhua Shen⁴, Zhibin Wang², Hao Li² (¹The University of Adelaide, ²Alibaba Group, ³Beijing Academy of Artificial Intelligence, ⁴Zhejiang University)
Pseudocode | Yes | The pseudo-code of point-wise MIL loss based on PyTorch is provided in the supplementary. (A hedged sketch of such a loss follows the table.)
Open Source Code | Yes | The code is available at https://github.com/YongtaoGe/Point-Teaching.
Open Datasets | Yes | We mainly benchmark our proposed method on the large-scale dataset MS-COCO (Lin et al. 2014). [...] We also conduct experiments on PASCAL VOC (Everingham et al. 2010).
Dataset Splits | Yes | Specifically, we randomly select 0.5%, 1%, 2%, 5%, 10% and 30% of the 118k labeled images as the fully-labeled set, and the remainder is used as the point-labeled set. Model performance is evaluated on the COCO2017 val set. (A split-construction sketch follows the table.)
Hardware Specification | No | The paper mentions training on '8 GPUs' but does not specify the GPU model, CPU, or any other specific hardware components.
Software Dependencies | No | We implement our proposed Point-Teaching framework based on the Detectron2 toolbox (Wu et al. 2019). No version number for Detectron2 is specified.
Experiment Setup | Yes | Our method mainly contains three hyperparameters: τ, λ1 and λ2, which indicate the score threshold of the pseudo boxes, the loss weight of the image-wise MIL loss and the loss weight of the point-wise MIL loss, respectively. We set τ = 0.05, λ1 = 1.0 and λ2 = 0.05 unless otherwise specified. [...] On Pascal VOC, the models are trained for 40k iterations on 8 GPUs with batch size 32, which contains 16 box-labeled images and 16 point-labeled images, respectively. [...] When conducting ablation experiments, we choose the 1% MS-COCO protocol and take a quick learning schedule of 90k iterations and a smaller batch size of 32, containing 16 box-labeled images and 16 point-labeled images, respectively. (A config sketch follows the table.)
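
The pseudocode entry points to PyTorch pseudo-code in the paper's supplementary, which is not quoted on this page. For orientation only, here is a minimal sketch of a generic WSDDN-style MIL loss over per-point proposal bags; the function name, tensor shapes, and bag construction are assumptions, not the authors' code.

```python
import torch
import torch.nn.functional as F

def point_wise_mil_loss(cls_logits, ins_logits, point_labels):
    """Sketch of a WSDDN-style MIL loss over a bag of proposals per point.

    cls_logits:   (P, K, C) classification logits for the K candidate
                  proposals in each annotated point's bag.
    ins_logits:   (P, K, C) instance-selection logits over the same bags.
    point_labels: (P,) long tensor with each point's class index.
    """
    # Classification branch: softmax over classes.
    cls_score = cls_logits.softmax(dim=-1)           # (P, K, C)
    # Selection branch: softmax over the bag (proposal) dimension.
    ins_score = ins_logits.softmax(dim=1)            # (P, K, C)
    # Aggregate each bag into one score per class.
    bag_score = (cls_score * ins_score).sum(dim=1)   # (P, C), values in [0, 1]
    # Supervise each bag with its point's class label.
    targets = F.one_hot(point_labels, bag_score.shape[-1]).float()
    return F.binary_cross_entropy(bag_score.clamp(1e-6, 1 - 1e-6), targets)

# Example: 4 annotated points, bags of 32 proposals, 80 COCO classes.
loss = point_wise_mil_loss(
    torch.randn(4, 32, 80), torch.randn(4, 32, 80), torch.randint(0, 80, (4,))
)
```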
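
The dataset-splits entry describes a random partition of the 118k COCO2017 train images. A minimal sketch of how such splits could be drawn, assuming the standard annotation file and a fixed seed (the paper's own split script is not quoted here):

```python
import json
import random

def make_split(image_ids, fraction, seed=0):
    """Randomly mark `fraction` of images as fully labeled; the rest keep points only."""
    rng = random.Random(seed)
    ids = sorted(image_ids)  # sort first so the shuffle is reproducible
    rng.shuffle(ids)
    n_full = int(len(ids) * fraction)
    return ids[:n_full], ids[n_full:]

# Standard COCO2017 train annotation file (path is an assumption).
with open("annotations/instances_train2017.json") as f:
    image_ids = [img["id"] for img in json.load(f)["images"]]

for frac in (0.005, 0.01, 0.02, 0.05, 0.10, 0.30):
    full_ids, point_ids = make_split(image_ids, frac)
    print(f"{frac:.1%}: {len(full_ids)} fully labeled / {len(point_ids)} point-labeled")
```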
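
The experiment-setup entry condenses to a handful of values. A hypothetical config dict collecting them; only the values come from the quoted text, while the key names are invented for illustration:

```python
# Key names are hypothetical; the values restate the quoted experiment setup.
POINT_TEACHING_SETUP = dict(
    pseudo_box_score_thresh=0.05,    # τ: score threshold for pseudo boxes
    image_mil_loss_weight=1.0,       # λ1: image-wise MIL loss weight
    point_mil_loss_weight=0.05,      # λ2: point-wise MIL loss weight
    num_gpus=8,
    batch_size=32,                   # 16 box-labeled + 16 point-labeled images
    max_iters_voc=40_000,            # Pascal VOC schedule
    max_iters_coco_ablation=90_000,  # 1% MS-COCO ablation schedule
)
```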