Point-Teaching: Weakly Semi-supervised Object Detection with Point Annotations
Authors: Yongtao Ge, Qiang Zhou, Xinlong Wang, Chunhua Shen, Zhibin Wang, Hao Li
AAAI 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments demonstrate the effectiveness of our method on a few datasets and various data regimes. In particular, Point-Teaching outperforms the previous best method Group R-CNN by 3.1 AP with 5% fully labeled data and 2.3 AP with 30% fully labeled data on MS COCO dataset. [...] Extensive experiments are conducted on MS-COCO and VOC datasets to verify the effectiveness of our method. |
| Researcher Affiliation | Collaboration | Yongtao Ge¹, Qiang Zhou², Xinlong Wang³, Chunhua Shen⁴, Zhibin Wang², Hao Li² (¹The University of Adelaide, ²Alibaba Group, ³Beijing Academy of Artificial Intelligence, ⁴Zhejiang University) |
| Pseudocode | Yes | The pseudo-code of point-wise MIL loss based on PyTorch is provided in the supplementary. (A hedged sketch of such a loss appears below this table.) |
| Open Source Code | Yes | The code is available at https://github.com/YongtaoGe/Point-Teaching. |
| Open Datasets | Yes | We mainly benchmark our proposed method on the large-scale dataset MS-COCO (Lin et al. 2014). [...] We also conduct experiments on PASCAL VOC (Everingham et al. 2010). |
| Dataset Splits | Yes | Specifically, we randomly selected 0.5%, 1%, 2%, 5%, 10% and 30% from the 118k labeled images as the fully-labeled set, and the remainder is used as the point-labeled set. Model performance is evaluated on the COCO2017 val set. (A sketch of this split procedure appears below this table.) |
| Hardware Specification | No | The paper mentions training on '8 GPUs' but does not specify the model or type of GPU, CPU, or any other specific hardware components. |
| Software Dependencies | No | We implement our proposed Point-Teaching framework based on the Detectron2 toolbox (Wu et al. 2019). No version number for Detectron2 is specified. |
| Experiment Setup | Yes | Our method mainly contains three hyperparameters: τ, λ1 and λ2, which indicate the score threshold of the pseudo boxes, the loss weight of the image-wise MIL loss and the loss weight of the point-wise MIL loss, respectively. We set τ = 0.05, λ1 = 1.0 and λ2 = 0.05 unless otherwise specified. [...] On Pascal VOC, the models are trained for 40k iterations on 8 GPUs and with batch size 32, which contains 16 box-labeled images and 16 point-labeled images respectively. [...] when conducting ablation experiments, we choose the 1% MS-COCO protocol and take a quick learning schedule of 90k iterations and a smaller batch size of 32, containing 16 box-labeled images and 16 point-labeled images, respectively. (These settings are collected in a configuration sketch below this table.) |
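
The point-wise MIL pseudo-code itself lives in the paper's supplementary and is not reproduced here. As a rough illustration only, the snippet below sketches what a point-wise MIL loss of this kind commonly looks like, assuming a WSDDN-style two-branch formulation over a bag of proposals that contain the annotated point; the function name, tensor shapes, and aggregation scheme are assumptions, not the authors' code.

```python
import torch
import torch.nn.functional as F

def point_wise_mil_loss(cls_logits: torch.Tensor,
                        ins_logits: torch.Tensor,
                        point_label: torch.Tensor) -> torch.Tensor:
    """Hypothetical WSDDN-style MIL loss for one bag of proposals.

    cls_logits:  (N, C) classification logits for the N proposals whose
                 boxes contain the annotated point.
    ins_logits:  (N, C) instance-selection logits for the same proposals.
    point_label: (C,) one-hot float vector encoding the point's class.
    """
    cls_scores = F.softmax(cls_logits, dim=1)  # proposals compete across classes
    ins_scores = F.softmax(ins_logits, dim=0)  # classes compete across proposals
    # Bag-level score per class: proposal-weighted sum of class evidence.
    bag_scores = (cls_scores * ins_scores).sum(dim=0).clamp(1e-6, 1 - 1e-6)
    return F.binary_cross_entropy(bag_scores, point_label)

# Toy usage: a bag of 5 proposals, 80 COCO classes, point labeled as class 3.
loss = point_wise_mil_loss(
    torch.randn(5, 80),
    torch.randn(5, 80),
    F.one_hot(torch.tensor(3), num_classes=80).float(),
)
```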
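
The split protocol quoted in the Dataset Splits row is simple enough to reconstruct. The sketch below shows one plausible way to draw the fully-labeled subset from COCO train2017; the annotation path, seed, and fraction are illustrative and not taken from the paper or the released code.

```python
import json
import random

SEED = 0          # illustrative; the paper does not report a seed
FRACTION = 0.01   # e.g. the 1% protocol used for the ablations

# Standard COCO train2017 annotation file (path is an assumption).
with open("annotations/instances_train2017.json") as f:
    coco = json.load(f)

image_ids = [img["id"] for img in coco["images"]]
random.Random(SEED).shuffle(image_ids)

n_full = int(len(image_ids) * FRACTION)
fully_labeled = set(image_ids[:n_full])   # keep full box annotations
point_labeled = set(image_ids[n_full:])   # keep point annotations only
print(f"{len(fully_labeled)} fully-labeled / {len(point_labeled)} point-labeled")
```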
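
Finally, the hyperparameters and schedules quoted in the Experiment Setup row can be gathered into one place. The values below come straight from the quoted text; the dictionary keys are illustrative names, not fields of the authors' Detectron2 configs.

```python
# Loss and thresholding hyperparameters reported in the paper.
HPARAMS = {
    "tau": 0.05,       # score threshold for pseudo boxes
    "lambda_1": 1.0,   # weight of the image-wise MIL loss
    "lambda_2": 0.05,  # weight of the point-wise MIL loss
}

# Training schedules reported in the paper.
VOC_SCHEDULE = {
    "max_iter": 40_000,
    "num_gpus": 8,
    "batch_size": 32,  # 16 box-labeled + 16 point-labeled images
}
COCO_ABLATION_SCHEDULE = {
    "protocol": "1% MS-COCO",
    "max_iter": 90_000,
    "batch_size": 32,  # 16 box-labeled + 16 point-labeled images
}
```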