Instance-Conditional Knowledge Distillation for Object Detection
Authors: Zijian Kang, Peizhen Zhang, Xiangyu Zhang, Jian Sun, Nanning Zheng
NeurIPS 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments demonstrate the efficacy of our method: we observe impressive improvements under various settings. We perform comprehensive experiments on challenging benchmarks. Results demonstrate impressive improvements over various detectors with up to 4 AP gain in MS-COCO, including recent detectors for instance segmentation [41, 46, 16]. |
| Researcher Affiliation | Collaboration | Zijian Kang, Xi'an Jiaotong University, kzj123@stu.xjtu.edu.cn; Peizhen Zhang, MEGVII Technology, zhangpeizhen@megvii.com; Xiangyu Zhang, MEGVII Technology, zhangxiangyu@megvii.com; Jian Sun, MEGVII Technology, sunjian@megvii.com; Nanning Zheng, Xi'an Jiaotong University, nnzheng@mail.xjtu.edu.cn |
| Pseudocode | No | The paper includes mathematical formulations and architectural diagrams (e.g., Figure 1, Figure 2), but does not contain structured pseudocode blocks or sections explicitly labeled 'Algorithm'. |
| Open Source Code | Yes | Code has been released on https://github.com/megvii-research/ICD. |
| Open Datasets | Yes | Most experiments are conducted on a large-scale object detection benchmark, MS-COCO [31], with 80 classes. MS-COCO is publicly available; the annotations are licensed under a Creative Commons Attribution 4.0 License, and the use of the images follows the Flickr Terms of Use. Refer to [31] for more details. |
| Dataset Splits | Yes | We train models on MS-COCO 2017 trainval115k subset and validate on minival subset. |
| Hardware Specification | Yes | All experiments are run on eight 2080Ti GPUs with 2 images per GPU. Specifically, we benchmark the 1× schedule on RetinaNet [30] with eight 2080Ti GPUs, following the same configuration in Section 4.2. |
| Software Dependencies | No | We conduct experiments on PyTorch [34] with the widely used Detectron2 library [47] and the AdelaiDet library [40]. While these software components are mentioned, specific version numbers for PyTorch, Detectron2, or AdelaiDet are not provided in the text. |
| Experiment Setup | Yes | We adopt the 1× schedule, which denotes 90k iterations of training, following the standard protocols in Detectron2 unless otherwise specified. For distillation, the hyper-parameter λ is set to 8 for one-stage detectors and 3 for two-stage detectors, respectively. To optimize the transformer decoder, we adopt the AdamW optimizer [33] for the decoder and MLPs, following common settings for transformers [43, 5]. The corresponding hyper-parameters follow [5], where the initial learning rate and weight decay are both set to 1e-4. We use a hidden dimension of 256 for the decoder and all MLPs; the decoder has 8 attention heads in parallel. (Hedged configuration sketches of this setup appear below the table.) |
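
For readers attempting reproduction, the following is a minimal PyTorch sketch of the decoder and optimizer hyper-parameters quoted in the Experiment Setup row: a 256-dimensional hidden size, 8 attention heads, and AdamW with learning rate and weight decay of 1e-4. The number of decoder layers, the MLP structure, and all names here are illustrative assumptions, not taken from the released ICD code.

```python
import torch
from torch import nn

# Hyper-parameters stated in the paper's setup.
HIDDEN_DIM = 256   # hidden dimension for the decoder and all MLPs
NUM_HEADS = 8      # parallel attention heads in the decoder

# Transformer decoder with the stated width and head count.
# The layer count is an assumption; the paper does not quote it here.
decoder_layer = nn.TransformerDecoderLayer(d_model=HIDDEN_DIM, nhead=NUM_HEADS)
decoder = nn.TransformerDecoder(decoder_layer, num_layers=1)

# An MLP head sharing the 256-d hidden dimension (structure assumed).
mlp = nn.Sequential(
    nn.Linear(HIDDEN_DIM, HIDDEN_DIM),
    nn.ReLU(),
    nn.Linear(HIDDEN_DIM, HIDDEN_DIM),
)

# Per the quoted setup, AdamW is applied to the decoder and MLPs,
# with initial learning rate and weight decay both set to 1e-4.
optimizer = torch.optim.AdamW(
    list(decoder.parameters()) + list(mlp.parameters()),
    lr=1e-4,
    weight_decay=1e-4,
)
```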
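Similarly, here is a minimal sketch of how the distillation weight λ from the quoted setup could enter a training objective, assuming a simple additive combination of detection and distillation terms. The function and variable names (`total_loss`, `detection_loss`, `distillation_loss`) are hypothetical placeholders, not functions from the paper or its repository.

```python
import torch

# λ values quoted in the paper: 8 for one-stage detectors,
# 3 for two-stage detectors.
LAMBDA = {"one_stage": 8.0, "two_stage": 3.0}

def total_loss(detection_loss: torch.Tensor,
               distillation_loss: torch.Tensor,
               detector_type: str = "one_stage") -> torch.Tensor:
    """Combine the task loss with the λ-weighted distillation term.

    The additive form is an assumption for illustration; only the λ
    values themselves come from the paper's stated setup.
    """
    return detection_loss + LAMBDA[detector_type] * distillation_loss
```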