K-Net: Towards Unified Image Segmentation

Authors: Wenwei Zhang, Jiangmiao Pang, Kai Chen, Chen Change Loy

NeurIPS 2021

Reproducibility Variable Result LLM Response
Research Type Experimental K-Net surpasses all previous published state-of-the-art single-model results of panoptic segmentation on MS COCO test-dev split and semantic segmentation on ADE20K val split with 55.2% PQ and 54.3% mIoU, respectively. Its instance segmentation performance is also on par with Cascade Mask R-CNN on MS COCO with 60%-90% faster inference speeds. Code and models will be released at https://github.com/ZwwWayne/K-Net/. To show the effectiveness of the proposed unified framework on different segmentation tasks, we conduct extensive experiments on COCO dataset [38] for panoptic and instance segmentation, and ADE20K dataset [70] for semantic segmentation.
Researcher Affiliation Collaboration 1S-Lab, Nanyang Technological University 2CUHK-SenseTime Joint Lab, the Chinese University of Hong Kong 3SenseTime Research 4Shanghai AI Laboratory {wenwei001, ccloy}@ntu.edu.sg pangjiangmiao@gmail.com chenkai@sensetime.com
Pseudocode No The paper provides architectural diagrams (Figure 2, Figure 3) and describes the steps of the method in text, but does not include any explicitly labeled 'Pseudocode' or 'Algorithm' blocks.
Open Source Code Yes Code and models will be released at https://github.com/ZwwWayne/K-Net/.
Open Datasets Yes we conduct extensive experiments on COCO dataset [38] for panoptic and instance segmentation, and ADE20K dataset [70] for semantic segmentation.
Dataset Splits Yes All models are trained on the train2017 split and evaluated on the val2017 split. ... All models are trained on the train split and evaluated on the validation split.
Hardware Specification No The paper mentions training on '16 GPUs' and '44 GPU days' but does not specify the type or model of the GPUs or other hardware components.
Software Dependencies No For panoptic and instance segmentation, we implement K-Net with MMDetection [6]. ... For semantic segmentation, we implement K-Net with MMSegmentation [13]. The paper mentions software frameworks but does not provide specific version numbers for these or other libraries/dependencies.
Experiment Setup Yes In the ablation study, the model is trained with a batch size of 16 for 12 epochs. The learning rate is 0.0001, and it is decreased by 0.1 after 8 and 11 epochs, respectively. We use AdamW [41] with a weight decay of 0.05. For data augmentation in training, we adopt horizontal flip augmentation with a single scale. The long edge and short edge of images are resized to 1333 and 800, respectively, without changing the aspect ratio. When comparing with other frameworks, we use multi-scale training with a longer schedule (36 epochs) for fair comparisons [6]. The short edge of images is randomly sampled from [640, 800] [21]. For semantic segmentation, we implement K-Net with MMSegmentation [13] and train it with 80,000 iterations. As AdamW [41] empirically works better than SGD, we use AdamW with a weight decay of 0.0005 by default on both the baselines and K-Net for a fair comparison. The initial learning rate is 0.0001, and it is decayed by 0.1 after 60000 and 72000 iterations, respectively.
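The quoted setup describes a standard step-decay learning-rate schedule: a base rate of 1e-4 multiplied by 0.1 at each milestone (epochs 8 and 11 for the ablation schedule; iterations 60,000 and 72,000 for semantic segmentation). A minimal plain-Python sketch of that schedule follows; it is an illustration, not the authors' MMDetection/MMSegmentation config, and the "decay once a step reaches the milestone" convention is an assumption.

```python
def step_lr(base_lr, milestones, gamma, step):
    """Step-decay schedule: multiply base_lr by gamma once for every
    milestone the current step (epoch or iteration) has reached."""
    num_decays = sum(1 for m in milestones if step >= m)
    return base_lr * (gamma ** num_decays)

# Ablation schedule from the paper: lr 1e-4, decayed by 0.1 after epochs 8 and 11.
ablation_lrs = [step_lr(1e-4, [8, 11], 0.1, epoch) for epoch in range(12)]

# Semantic segmentation schedule: lr 1e-4, decayed at iterations 60,000 and 72,000.
semseg_lr_final = step_lr(1e-4, [60000, 72000], 0.1, 79999)
```

The same helper covers both schedules because only the milestone units differ (epochs vs. iterations); in the actual codebase this would be handled by the framework's built-in step scheduler rather than hand-rolled.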