CLUSTSEG: Clustering for Universal Segmentation

Authors: James Chenhao Liang, Tianfei Zhou, Dongfang Liu, Wenguan Wang

ICML 2023 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We present CLUSTSEG, a general, transformer-based framework that tackles different image segmentation tasks (i.e., superpixel, semantic, instance, and panoptic) through a unified neural clustering scheme. Regarding queries as cluster centers, CLUSTSEG is innovative in two aspects: ① cluster centers are initialized in heterogeneous ways so as to pointedly address task-specific demands (e.g., instance- or category-level distinctiveness), yet without modifying the architecture; and ② pixel-cluster assignment, formalized in a cross-attention fashion, is alternated with cluster center update, yet without learning additional parameters. These innovations closely link CLUSTSEG to EM clustering and make it a transparent and powerful framework that yields superior results across the above segmentation tasks. (A code sketch of this EM-style alternation appears after the table.)
Researcher Affiliation | Academia | ¹Rochester Institute of Technology, ²ETH Zurich, ³Zhejiang University.
Pseudocode | Yes | C. Pseudo Code
Open Source Code | Yes | https://github.com/JamesLiang819/ClustSeg
Open Datasets | Yes | We use COCO Panoptic (Kirillov et al., 2019b): train2017 is adopted for training and val2017 for testing.
Dataset Splits | Yes | COCO Panoptic is divided into 115K/5K/20K images for the train/val/test splits.
Hardware Specification | No | The paper does not provide specific hardware details used for running its experiments.
Software Dependencies | No | CLUSTSEG is implemented in PyTorch. All the backbones are initialized using corresponding weights pre-trained on ImageNet-1K/-22K (Deng et al., 2009), while the remaining layers are randomly initialized. We train all our models using the AdamW optimizer and a cosine annealing learning rate decay policy. For panoptic, instance, and semantic segmentation, we adopt the default training recipes of MMDetection (Chen et al., 2019b). (A sketch of this optimizer/scheduler setup appears after the table.)
Experiment Setup | Yes | We set the initial learning rate to 1e-5, training epochs to 50, and batch size to 16. We use random scale jittering with a factor in [0.1, 2.0] and a crop size of 1024×1024. (A sketch of this augmentation recipe appears after the table.)
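
To make the abstract's clustering scheme concrete, here is a minimal sketch of alternating cross-attention-style pixel-cluster assignment (E-step) with parameter-free cluster-center updates (M-step). The dot-product similarity, softmax temperature, and iteration count are assumptions for illustration; the paper's exact formulation (see its Appendix C pseudocode) may differ.

```python
import torch
import torch.nn.functional as F

def em_style_clustering(pixels, centers, num_iters=3, tau=1.0):
    """Alternate pixel-cluster assignment with cluster-center updates.

    pixels:  (N, D) pixel embeddings from the backbone
    centers: (K, D) initial cluster centers (the queries)
    Returns the final soft assignments (N, K) and centers (K, D).
    """
    for _ in range(num_iters):
        # E-step: soft-assign every pixel to every center via a
        # cross-attention-style softmax over similarity scores.
        logits = pixels @ centers.t() / tau   # (N, K)
        assign = F.softmax(logits, dim=1)     # pixel-to-cluster weights
        # M-step: re-estimate each center as the assignment-weighted
        # mean of the pixels; no learnable parameters are involved.
        weights = assign / (assign.sum(dim=0, keepdim=True) + 1e-6)
        centers = weights.t() @ pixels        # (K, D)
    return assign, centers

# Usage: cluster 4096 pixel embeddings into 100 "query" centers.
pixels = torch.randn(4096, 256)
centers = torch.randn(100, 256)
assign, centers = em_style_clustering(pixels, centers)
```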
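The reported training setup (PyTorch, AdamW, cosine annealing) maps onto standard torch.optim components. A minimal sketch follows; the Linear stand-in model and the per-epoch scheduler stepping are assumptions, since the paper defers to MMDetection's default recipes for the full training loop.

```python
import torch
from torch.optim import AdamW
from torch.optim.lr_scheduler import CosineAnnealingLR

model = torch.nn.Linear(256, 256)  # hypothetical stand-in for the CLUSTSEG model

# Initial learning rate 1e-5 and 50 training epochs, as reported.
optimizer = AdamW(model.parameters(), lr=1e-5)
scheduler = CosineAnnealingLR(optimizer, T_max=50)

for epoch in range(50):
    # ... run one epoch of forward/backward/optimizer.step() here ...
    scheduler.step()  # cosine-decay the learning rate once per epoch
```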
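The scale-jitter-then-crop augmentation can be hand-rolled with torchvision functional ops. This is a sketch under assumptions (zero-padding when jittering shrinks the image, per-image jitter factors); the paper actually uses MMDetection's default pipelines, whose exact ops may differ.

```python
import random
import torch
import torchvision.transforms.functional as TF

def scale_jitter_crop(img, scale_range=(0.1, 2.0), crop=1024):
    """Random scale jittering in [0.1, 2.0] followed by a 1024x1024 crop."""
    s = random.uniform(*scale_range)
    h, w = img.shape[-2:]
    img = TF.resize(img, [max(1, round(h * s)), max(1, round(w * s))])
    # Zero-pad right/bottom if jittering shrank the image below the crop size.
    h, w = img.shape[-2:]
    img = TF.pad(img, [0, 0, max(0, crop - w), max(0, crop - h)])
    # Take a random crop of the target size.
    h, w = img.shape[-2:]
    top = random.randint(0, h - crop)
    left = random.randint(0, w - crop)
    return TF.crop(img, top, left, crop, crop)

# Usage: augment a 3x800x1333 image tensor to 3x1024x1024.
out = scale_jitter_crop(torch.rand(3, 800, 1333))
```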