CLUSTSEG: Clustering for Universal Segmentation
Authors: James Chenhao Liang, Tianfei Zhou, Dongfang Liu, Wenguan Wang
ICML 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We present CLUSTSEG, a general, transformer-based framework that tackles different image segmentation tasks (i.e., superpixel, semantic, instance, and panoptic) through a unified, neural clustering scheme. Regarding queries as cluster centers, CLUSTSEG is innovative in two aspects: ① cluster centers are initialized in heterogeneous ways so as to pointedly address task-specific demands (e.g., instance- or category-level distinctiveness), yet without modifying the architecture; and ② pixel-cluster assignment, formalized in a cross-attention fashion, is alternated with cluster center update, yet without learning additional parameters. These innovations closely link CLUSTSEG to EM clustering and make it a transparent and powerful framework that yields superior results across the above segmentation tasks. *(A minimal sketch of this alternating scheme appears below the table.)* |
| Researcher Affiliation | Academia | ¹Rochester Institute of Technology, ²ETH Zurich, ³Zhejiang University. |
| Pseudocode | Yes | C. Pseudo code |
| Open Source Code | Yes | https://github.com/JamesLiang819/ClustSeg |
| Open Datasets | Yes | We use COCO Panoptic (Kirillov et al., 2019b); train2017 is adopted for training and val2017 for test. |
| Dataset Splits | Yes | COCO Panoptic is divided into 115K/5K/20K images for the train/val/test splits. |
| Hardware Specification | No | The paper does not specify the hardware used to run its experiments. |
| Software Dependencies | No | CLUSTSEG is implemented in PyTorch. All the backbones are initialized using corresponding weights pre-trained on ImageNet-1K/-22K (Deng et al., 2009), while the remaining layers are randomly initialized. We train all our models using the AdamW optimizer and a cosine annealing learning rate decay policy. For panoptic, instance, and semantic segmentation, we adopt the default training recipes of MMDetection (Chen et al., 2019b). |
| Experiment Setup | Yes | We set the initial learning rate to 1e-5, training epochs to 50, and batch size to 16. We use random scale jittering with a factor in [0.1, 2.0] and a crop size of 1024×1024. *(These hyperparameters are wired into the training-loop sketch below.)* |
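The abstract describes alternating a cross-attention-style pixel-cluster assignment (E-step) with a parameter-free cluster-center update (M-step). The sketch below illustrates that EM-style loop in PyTorch; it is a minimal illustration, not the authors' implementation, and the function name `em_cross_attention`, the tensor shapes, and the choice of three iterations are all assumptions.

```python
import torch

def em_cross_attention(centers, pixels, n_iters=3):
    """EM-style clustering sketch: alternate a cross-attention E-step
    (soft pixel-to-center assignment) with an M-step center update,
    introducing no learned parameters.

    centers: (B, K, D) queries regarded as cluster centers
    pixels:  (B, N, D) flattened pixel embeddings
    """
    for _ in range(n_iters):
        # E-step: dot-product similarity, softmax over the K centers,
        # so every pixel receives a distribution over clusters.
        sim = torch.einsum('bkd,bnd->bkn', centers, pixels)      # (B, K, N)
        probs = sim.softmax(dim=1)
        # M-step: each center becomes its assignment-weighted pixel mean.
        weights = probs / probs.sum(dim=-1, keepdim=True).clamp(min=1e-6)
        centers = torch.einsum('bkn,bnd->bkd', weights, pixels)  # (B, K, D)
    return centers, probs

# Toy usage: 100 cluster centers over a 64x64 feature map.
B, K, H, W, D = 2, 100, 64, 64, 256
centers = torch.randn(B, K, D)    # the paper initializes these per task
pixels = torch.randn(B, H * W, D)
centers, probs = em_cross_attention(centers, pixels)
cluster_map = probs.argmax(dim=1).view(B, H, W)  # hard segmentation map
```

Because both steps reuse the same dot-product similarities, the recurrence adds no parameters on top of the queries and pixel embeddings, which matches the abstract's claim that the architecture stays fixed across tasks.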
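The quoted recipe (AdamW, cosine annealing, initial learning rate 1e-5, 50 epochs, batch size 16) maps directly onto standard PyTorch calls. The sketch below shows that wiring under stated assumptions: the stand-in convolutional model, the synthetic batch, and the weight-decay value are placeholders, and the paper's actual MMDetection recipe may differ in details such as warmup.

```python
import torch
from torch.optim import AdamW
from torch.optim.lr_scheduler import CosineAnnealingLR

# Stand-in network; the real CLUSTSEG model would go here.
model = torch.nn.Conv2d(3, 1, kernel_size=3, padding=1)

# Values quoted from the paper: lr 1e-5, 50 epochs, batch size 16.
# The weight decay is an assumption, not taken from the paper.
optimizer = AdamW(model.parameters(), lr=1e-5, weight_decay=0.05)
scheduler = CosineAnnealingLR(optimizer, T_max=50)  # anneal over 50 epochs

for epoch in range(50):
    # One synthetic batch stands in for the real COCO Panoptic loader,
    # which would apply scale jittering in [0.1, 2.0] and 1024x1024 crops.
    images = torch.randn(16, 3, 64, 64)
    loss = model(images).mean()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    scheduler.step()  # cosine annealing stepped once per epoch
```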