HumanLiker: A Human-like Object Detector to Model the Manual Labeling Process
Authors: Haoran Wei, Ping Guo, Yangguang Zhu, Chenglong Liu, Peng Wang
NeurIPS 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate the proposed HumanLiker on the large-scale object detection benchmark MS-COCO [21]. It contains 80 categories and more than 1.5 million object instances. We train on the train2017 set which contains 118k images and 860k instances and compare the performance of HumanLiker with state-of-the-art methods on the test-dev set (20k images) via the online evaluation server. All ablation studies are performed on the val2017 set that contains 5k images and 36k objects. |
| Researcher Affiliation | Collaboration | 1University of Chinese Academy of Sciences; 2Intel Labs China. {weihaoran18, zhuyangguang19, liuchenglong20}@mails.ucas.ac.cn; {ping.guo, patricia.p.wang}@intel.com |
| Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | Code will be available at https://github.com/Ucas-HaoranWei/HumanLiker. |
| Open Datasets | Yes | We evaluate the proposed HumanLiker on the large-scale object detection benchmark MS-COCO [21]. |
| Dataset Splits | Yes | We train on the train2017 set which contains 118k images and 860k instances and compare the performance of HumanLiker with state-of-the-art methods on the test-dev set (20k images) via the online evaluation server. All ablation studies are performed on the val2017 set that contains 5k images and 36k objects. (A split-check sketch follows the table.) |
| Hardware Specification | Yes | most of our models are trained on 4 RTX 3090 GPUs with a batch-size of 16 under the SGD optimizer for 24 epochs using the ImageNet [13] backbone initialization, if not otherwise specified. ... During the inference stage, we use a Titan Xp or 3090 GPU to test the inference speed. |
| Software Dependencies | No | The paper does not provide specific version numbers for software dependencies such as Python, PyTorch, or CUDA. |
| Experiment Setup | Yes | To be specific, most of our models are trained on 4 RTX 3090 GPUs with a batch-size of 16 under the SGD optimizer for 24 epochs using the ImageNet [13] backbone initialization, if not otherwise specified. We apply a learning rate of 0.02 for the first 16 epochs and then decay it by 10 at the 16th and 22nd epochs. Specifically, the model with Swin Transformer (large) is trained on 8 GPUs with a batch-size of 8 for 36 epochs with the AdamW optimizer. We adopt the multi-scale training strategy in which the shorter side of each input image is randomly selected from the range [480, 960]. (A training-schedule sketch follows the table.) |
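
The dataset split counts quoted above (118k images / 860k instances for train2017, 5k images / 36k objects for val2017) can be sanity-checked locally. Below is a minimal sketch, not taken from the paper, assuming pycocotools is installed and the standard COCO 2017 annotation files have been downloaded; the `annotations/` path is an assumption and should point at your local copy.

```python
# Count images and instance annotations per COCO 2017 split.
# Assumes the standard instances_{split}.json files live under ./annotations/.
from pycocotools.coco import COCO

for split in ("train2017", "val2017"):
    coco = COCO(f"annotations/instances_{split}.json")
    print(f"{split}: {len(coco.getImgIds())} images, {len(coco.getAnnIds())} instance annotations")
```

The test-dev split has no public annotations, so, as the quote notes, it is evaluated only through the online server.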
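
The training recipe in the Experiment Setup row maps onto a standard optimizer/scheduler configuration. The sketch below illustrates that schedule and is not the authors' released code: only the stated values (SGD, base learning rate 0.02, batch size 16, 24 epochs, 10x decay at epochs 16 and 22, shorter side sampled from [480, 960]) come from the paper, while the momentum, weight decay, placeholder model, and helper name `multi_scale_resize` are assumptions for illustration.

```python
import random
import torch
import torchvision.transforms.functional as TF


def multi_scale_resize(image):
    # Multi-scale training: pick a shorter-side length in [480, 960] and resize
    # the PIL image while preserving its aspect ratio.
    short = random.randint(480, 960)
    w, h = image.size
    scale = short / min(w, h)
    return TF.resize(image, [round(h * scale), round(w * scale)])


model = torch.nn.Conv2d(3, 80, kernel_size=1)  # placeholder; not the HumanLiker architecture
# Momentum 0.9 and weight decay 1e-4 are common detector defaults, not values from the paper.
optimizer = torch.optim.SGD(model.parameters(), lr=0.02, momentum=0.9, weight_decay=1e-4)
# Base learning rate 0.02, decayed by a factor of 10 at the 16th and 22nd epochs.
scheduler = torch.optim.lr_scheduler.MultiStepLR(optimizer, milestones=[16, 22], gamma=0.1)

for epoch in range(24):
    # ... one epoch over train2017 with a total batch size of 16 would go here ...
    scheduler.step()
```

Per the quoted setup, the Swin Transformer (large) variant would instead use AdamW for 36 epochs on 8 GPUs with a batch size of 8; the schedule structure otherwise stays the same.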