Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
K-Net: Towards Unified Image Segmentation
Authors: Wenwei Zhang, Jiangmiao Pang, Kai Chen, Chen Change Loy
NeurIPS 2021 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | K-Net surpasses all previous published state-of-the-art single-model results of panoptic segmentation on MS COCO test-dev split and semantic segmentation on ADE20K val split with 55.2% PQ and 54.3% mIoU, respectively. Its instance segmentation performance is also on par with Cascade Mask R-CNN on MS COCO with 60%-90% faster inference speeds. Code and models will be released at https://github.com/ZwwWayne/K-Net/. To show the effectiveness of the proposed unified framework on different segmentation tasks, we conduct extensive experiments on COCO dataset [38] for panoptic and instance segmentation, and ADE20K dataset [70] for semantic segmentation. |
| Researcher Affiliation | Collaboration | ¹S-Lab, Nanyang Technological University; ²CUHK-SenseTime Joint Lab, The Chinese University of Hong Kong; ³SenseTime Research; ⁴Shanghai AI Laboratory |
| Pseudocode | No | The paper provides architectural diagrams (Figure 2, Figure 3) and describes the steps of the method in text, but does not include any explicitly labeled 'Pseudocode' or 'Algorithm' blocks. |
| Open Source Code | Yes | Code and models will be released at https://github.com/ZwwWayne/K-Net/. |
| Open Datasets | Yes | we conduct extensive experiments on COCO dataset [38] for panoptic and instance segmentation, and ADE20K dataset [70] for semantic segmentation. |
| Dataset Splits | Yes | All models are trained on the train2017 split and evaluated on the val2017 split. ... All models are trained on the train split and evaluated on the validation split. |
| Hardware Specification | No | The paper mentions training on '16 GPUs' and '44 GPU days' but does not specify the type or model of the GPUs or other hardware components. |
| Software Dependencies | No | For panoptic and instance segmentation, we implement K-Net with MMDetection [6]. ... For semantic segmentation, we implement K-Net with MMSegmentation [13]. The paper mentions software frameworks but does not provide specific version numbers for these or other libraries/dependencies. |
| Experiment Setup | Yes | In the ablation study, the model is trained with a batch size of 16 for 12 epochs. The learning rate is 0.0001, and it is decreased by 0.1 after 8 and 11 epochs, respectively. We use AdamW [41] with a weight decay of 0.05. For data augmentation in training, we adopt horizontal flip augmentation with a single scale. The long edge and short edge of images are resized to 1333 and 800, respectively, without changing the aspect ratio. When comparing with other frameworks, we use multi-scale training with a longer schedule (36 epochs) for fair comparisons [6]. The short edge of images is randomly sampled from [640, 800] [21]. For semantic segmentation, we implement K-Net with MMSegmentation [13] and train it with 80,000 iterations. As AdamW [41] empirically works better than SGD, we use AdamW with a weight decay of 0.0005 by default on both the baselines and K-Net for a fair comparison. The initial learning rate is 0.0001, and it is decayed by 0.1 after 60000 and 72000 iterations, respectively. |
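
The step learning-rate schedules quoted in the Experiment Setup row can be illustrated with a minimal sketch. This is not the authors' code or an MMDetection/MMSegmentation config: the function name `step_lr` and its parameters are hypothetical, and it assumes that "decreased by 0.1 after 8 and 11 epochs" means the rate is multiplied by 0.1 once 8, and again once 11, epochs have completed.

```python
from bisect import bisect_right


def step_lr(progress, base_lr=1e-4, milestones=(8, 11), gamma=0.1):
    """Learning rate after `progress` completed epochs (or iterations).

    Hypothetical helper for illustration; the defaults mirror the
    ablation settings quoted in the table (base rate 0.0001, drops
    after 8 and 11 epochs, decay factor 0.1).
    """
    # Count how many milestones have already been reached.
    drops = bisect_right(sorted(milestones), progress)
    return base_lr * gamma ** drops


# COCO ablation schedule: 12 epochs, indexed here by completed epochs 0..11.
coco_schedule = [step_lr(epoch) for epoch in range(12)]

# ADE20K schedule: 80,000 iterations, drops after 60,000 and 72,000.
ade_lr_at_70k = step_lr(70_000, milestones=(60_000, 72_000))
```

Under that assumption, the rate stays at the base value for the first 8 epochs, is one tenth of it for the next 3, and one hundredth for the final epoch; the same function covers the iteration-based ADE20K schedule by passing different milestones.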