GMMSeg: Gaussian Mixture based Generative Semantic Segmentation Models

Authors: Chen Liang, Wenguan Wang, Jiaxu Miao, Yi Yang

NeurIPS 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | With a variety of segmentation architectures and backbones, GMMSeg outperforms the discriminative counterparts on three closed-set datasets. More impressively, without any modification, GMMSeg even performs well on open-world datasets. We respectively examine the efficacy and robustness of GMMSeg on semantic segmentation (§4.1) and anomaly segmentation (§4.2). In §4.3, we provide diagnostic analysis on our core model design.
Researcher Affiliation | Collaboration | Chen Liang¹,³, Wenguan Wang², Jiaxu Miao¹, Yi Yang¹; ¹CCAI, Zhejiang University; ²ReLER, AAII, University of Technology Sydney; ³Baidu Research
Pseudocode | No | The paper includes equations and figures but does not contain any clearly labeled pseudocode or algorithm blocks. Procedural steps are described within the text or through equations.
Open Source Code | Yes | https://github.com/leonnnop/GMMSeg. We promise code and instructions shall be made publicly available right after acceptance.
Open Datasets | Yes | We conduct experiments on three widely used semantic segmentation datasets: ADE20K [53] has 20K/2K/3K images in train/val/test set, with 150 stuff/object categories in total. Cityscapes [54] has 2,975/500/1,524 fine-labeled images for train/val/test set with 19 classes. COCO-Stuff [55] has 10K images (9K/1K for train/test), pixel-wise labeled with 171 classes.
Dataset Splits | Yes | ADE20K [53] has 20K/2K/3K images in train/val/test set, with 150 stuff/object categories in total. Cityscapes [54] has 2,975/500/1,524 fine-labeled images for train/val/test set with 19 classes. COCO-Stuff [55] has 10K images (9K/1K for train/test), pixel-wise labeled with 171 classes.
Hardware Specification | Yes | using 8/16 NVIDIA Tesla A100 GPUs. We measure the fps on a single NVIDIA GeForce RTX 3090 GPU with a batch size of one.
Software Dependencies | No | The paper mentions 'GMMSeg is implemented on MMSegmentation [124]' but does not provide specific version numbers for MMSegmentation or any other software components (e.g., Python, PyTorch, CUDA) required for reproducibility.
Experiment Setup | Yes | For ADE20K/COCO-Stuff/Cityscapes, images are cropped to 512×512/512×512/768×768 and models are trained for 160K/80K/80K iterations with 16/16/8 batch size, using 8/16 NVIDIA Tesla A100 GPUs.
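
The slash-separated values in the Experiment Setup entry are easier to read when unpacked per dataset. The following is a minimal sketch that only restates the quoted numbers and derives an approximate epoch count from them. It is not the authors' MMSegmentation configuration: the names `TrainSetup`, `SETUPS`, and `approx_epochs` are hypothetical, and the training-set sizes are the rounded figures from the Dataset Splits entry (20K, 9K, 2,975).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainSetup:
    """Per-dataset training schedule as quoted in the Experiment Setup entry."""
    crop_size: int      # side of the square training crop, in pixels
    iterations: int     # number of optimizer steps
    batch_size: int     # images per step
    train_images: int   # rounded train-split size from the Dataset Splits entry

    def samples_seen(self) -> int:
        # Total number of crops drawn over the whole schedule.
        return self.iterations * self.batch_size

    def approx_epochs(self) -> float:
        # Rough number of passes over the training split (assumes one crop per image draw).
        return self.samples_seen() / self.train_images


# Values listed in the order ADE20K / COCO-Stuff / Cityscapes, as in the source.
SETUPS = {
    "ADE20K": TrainSetup(crop_size=512, iterations=160_000, batch_size=16, train_images=20_000),
    "COCO-Stuff": TrainSetup(crop_size=512, iterations=80_000, batch_size=16, train_images=9_000),
    "Cityscapes": TrainSetup(crop_size=768, iterations=80_000, batch_size=8, train_images=2_975),
}

if __name__ == "__main__":
    for name, setup in SETUPS.items():
        print(
            f"{name}: {setup.crop_size}x{setup.crop_size} crops, "
            f"{setup.samples_seen():,} samples, ~{setup.approx_epochs():.0f} epochs"
        )
```

Under these rounded split sizes, the schedules correspond to roughly 128, 142, and 215 passes over the ADE20K, COCO-Stuff, and Cityscapes training sets respectively. The source does not say which runs used 8 GPUs and which used 16, so the GPU count is left out of the sketch.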