Unsupervised Semantic Segmentation by Distilling Feature Correspondences
Authors: Mark Hamilton, Zhoutong Zhang, Bharath Hariharan, Noah Snavely, William T. Freeman
ICLR 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | STEGO yields a significant improvement over the prior state of the art on both the CocoStuff (+14 mIoU) and Cityscapes (+9 mIoU) semantic segmentation challenges. The authors justify STEGO's design with an ablation study on the CocoStuff dataset. |
| Researcher Affiliation | Collaboration | Mark Hamilton (MIT, Microsoft, markth@mit.edu); Zhoutong Zhang (MIT); Bharath Hariharan (Cornell University); Noah Snavely (Cornell University, Google); William T. Freeman (MIT, Google) |
| Pseudocode | No | The paper describes the methods and architectures but does not include any explicitly labeled pseudocode or algorithm blocks. |
| Open Source Code | Yes | We also provide training and evaluation code at https://aka.ms/stego-code |
| Open Datasets | Yes | We evaluate STEGO on the 27 mid-level classes of the CocoStuff class hierarchy and on the 27 classes of Cityscapes. Like prior art, we first resize images to 320 pixels along the minor axis, followed by a (320, 320) center crop of each validation image. We use mean intersection over union (mIoU) and Accuracy as evaluation metrics. Our CocoStuff evaluation setting originated in Ji et al. (2019) and is common in the literature. Our Cityscapes evaluation setting is adopted from Cho et al. (2021). Finally, we also compare on the Potsdam-3 setting from Ji et al. (2019) in Section A.2 of the Appendix. |
| Dataset Splits | Yes | We use the training and validation sets of CocoStuff described first in Ji et al. (2019) and used throughout the literature, including in Cho et al. (2021). We note that the validation set used in Ji et al. (2019) is a subset of the full CocoStuff validation set, and we use this validation subset to be consistent with prior benchmarks. ... Training images are scaled to have minor axis equal to 224 and are then center cropped to (224, 224); validation images are first scaled to 320 and then center cropped to (320, 320) (a sketch of this preprocessing appears below the table). |
| Hardware Specification | Yes | it only takes less than 2 hours on a single NVIDIA V100 GPU card ... on an Ubuntu 16.04 Azure NV24 Virtual Machine with Python 3.6. |
| Software Dependencies | Yes | All experiments use PyTorch (Paszke et al., 2019) v1.7 pre-trained models, on an Ubuntu 16.04 Azure NV24 Virtual Machine with Python 3.6. Experiments use PyTorch Lightning for distributed and multi-GPU training when necessary (Falcon et al., 2019). |
| Experiment Setup | Yes | We use the Adam optimizer (Kingma & Ba, 2014) with a learning rate of 0.0005 and a batch size of 32. ... Our cluster probe is trained alongside the STEGO architecture using a minibatch k-means loss where closeness is measured by cosine distance. Cluster and linear probes are trained with separate Adam optimizers using a learning rate of 0.005 ... Training images are scaled to have minor axis equal to 224 and are then center cropped to (224, 224); validation images are first scaled to 320 and then center cropped to (320, 320). ... We use PyDenseCRF (Krähenbühl & Koltun, 2011) for 10 iterations with parameters a = 4, b = 3, θα = 67, θβ = 3, θγ = 1, as written in Section A.9. ... Table 6: Hyperparameters used in STEGO. (Sketches of the preprocessing, the cluster probe, and the CRF post-processing appear below the table.) |
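
The resize-and-crop protocol quoted above (minor axis to 224 with a (224, 224) crop for training; minor axis to 320 with a (320, 320) crop for validation) maps directly onto standard torchvision transforms. Below is a minimal sketch; the interpolation mode and the omission of normalization are assumptions, not details taken from the paper:

```python
import torchvision.transforms as T

# Training: scale the minor axis to 224, then take a (224, 224) center crop.
# Passing an int to T.Resize scales the smaller edge to that size.
train_transform = T.Compose([
    T.Resize(224),
    T.CenterCrop(224),
    T.ToTensor(),
])

# Validation: scale the minor axis to 320, then take a (320, 320) center crop.
val_transform = T.Compose([
    T.Resize(320),
    T.CenterCrop(320),
    T.ToTensor(),
])
```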
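The cluster probe is described as a minibatch k-means loss with cosine distance, trained with its own Adam optimizer at learning rate 0.005. The sketch below shows one plausible reading of that description: learnable cluster centers pulled toward the unit-normalized features assigned to them. The class name, shapes, and hard-assignment details are assumptions; the released STEGO code may differ:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ClusterProbe(nn.Module):
    """Minibatch k-means style probe over per-pixel features (hypothetical sketch).

    Learnable centers are trained to maximize cosine similarity with the
    features assigned to them; argmax over similarities gives the label.
    """
    def __init__(self, dim: int, n_classes: int):
        super().__init__()
        self.clusters = nn.Parameter(torch.randn(n_classes, dim))

    def forward(self, feats: torch.Tensor):
        # feats: (N, dim) flattened per-pixel features from the backbone
        feats = F.normalize(feats, dim=1)
        centers = F.normalize(self.clusters, dim=1)
        sims = feats @ centers.t()                      # cosine similarities (N, K)
        assign = sims.argmax(dim=1)                     # hard cluster assignment
        one_hot = F.one_hot(assign, centers.shape[0]).float()
        # k-means loss: maximize similarity between each feature and its center
        loss = -(one_hot * sims).sum(dim=1).mean()
        return loss, assign

# Per the paper, the probe gets its own optimizer at lr = 0.005:
probe = ClusterProbe(dim=384, n_classes=27)             # dims are assumptions
probe_opt = torch.optim.Adam(probe.parameters(), lr=0.005)
```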
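For the CRF post-processing, the paper reports 10 iterations of PyDenseCRF with parameters a = 4, b = 3, θα = 67, θβ = 3, θγ = 1. A minimal sketch follows, using the real pydensecrf API; the mapping of (a, b, θα, θβ, θγ) onto the `compat`, `sxy`, and `srgb` arguments is an assumption based on the kernel definitions in Krähenbühl & Koltun (2011):

```python
import numpy as np
import pydensecrf.densecrf as dcrf
from pydensecrf.utils import unary_from_softmax

def crf_refine(img: np.ndarray, probs: np.ndarray, n_iter: int = 10) -> np.ndarray:
    """Refine class probabilities with a fully connected CRF.

    img:   (H, W, 3) uint8 RGB image
    probs: (K, H, W) float32 class probabilities (softmax output)
    Returns the refined (H, W) label map.
    """
    K, H, W = probs.shape
    d = dcrf.DenseCRF2D(W, H, K)
    d.setUnaryEnergy(unary_from_softmax(probs))
    # Smoothness kernel: theta_gamma = 1 as sxy, b = 3 as compat (assumed mapping).
    d.addPairwiseGaussian(sxy=1, compat=3)
    # Appearance kernel: theta_alpha = 67 as sxy, theta_beta = 3 as srgb,
    # a = 4 as compat (assumed mapping).
    d.addPairwiseBilateral(sxy=67, srgb=3,
                           rgbim=np.ascontiguousarray(img), compat=4)
    Q = d.inference(n_iter)
    return np.argmax(np.array(Q).reshape(K, H, W), axis=0)
```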