Hierarchy-Agnostic Unsupervised Segmentation: Parsing Semantic Image Structure
Authors: Simone Rossetti, Fiora Pirri
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We present a new metric for estimating the quality of the semantic segmentation of discovered elements on different levels of the hierarchy. The metric validates the intrinsic nature of the compositional relations among parts, objects, and scenes in a hierarchy-agnostic domain. Our results prove the power of this methodology, uncovering semantic regions without prior definitions and scaling effectively across various datasets. This robust framework for unsupervised image segmentation proves more accurate semantic hierarchical relationships between scene elements than traditional algorithms. The experiments underscore its potential for broad applicability in image analysis tasks, showcasing its ability to deliver a detailed and unbiased segmentation that surpasses existing unsupervised methods. |
| Researcher Affiliation | Collaboration | Simone Rossetti (1,2), Fiora Pirri (1,2); (1) DIAG, Sapienza University of Rome; (2) Deep Plants; {rossetti,pirri}@diag.uniroma1.it, {simone,fiora}@deepplants.com |
| Pseudocode | Yes | In Appendix B, we discuss the algorithm's properties and the generated T, and present the complete pseudocode of our method. |
| Open Source Code | Yes | We provided code for reproducing experiments in Table 1 in the supplementary material. We will release the full code upon acceptance. |
| Open Datasets | Yes | We benchmark our algorithm on unsupervised multi-granular segmentation using seven major object- and scene-centric datasets and seven hierarchically structured datasets with varying granularity levels for hierarchy-agnostic segmentation. We only utilize publicly available datasets, SSL model checkpoints without retraining, and validation set ground-truth annotations. |
| Dataset Splits | Yes | We only utilize publicly available datasets, SSL model checkpoints without retraining, and validation set ground-truth annotations. |
| Hardware Specification | Yes | We ran experiments on an ASUS ESC8000 server with two AMD EPYC 7413 24-core processors and 256GB RAM. We used the PyTorch 2.3 deep learning framework and 2 NVIDIA A6000 GPUs with 48GB of VRAM to accelerate the feature extraction stage. |
| Software Dependencies | Yes | We used the PyTorch 2.3 deep learning framework and 2 NVIDIA A6000 GPUs with 48GB of VRAM to accelerate the feature extraction stage. |
| Experiment Setup | Yes | Unless otherwise specified, we use the DINOv2-ViT-B14-REG [22] backbone with parameters kmin = 1, pmax = 20, and λmax = 0.8. We apply the spectral method from Ng et al. [61] with m = 300 for superpixel clustering. The recursive partitioning depth is limited to 10 levels. Depending on each backbone's downsampling factor, input images are resized to extract 60×60 codes, except for urban street scenes, where we obtain 60×120 codes. |
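The superpixel clustering step in the Experiment Setup row can be sketched with off-the-shelf tooling. The snippet below is a minimal illustration, not the authors' released code: it assumes scikit-learn's `SpectralClustering` (whose k-means label assignment follows Ng et al. [61]) and uses random vectors in place of the DINOv2 patch codes; the grid size and cluster count are scaled down from the paper's 60×60 codes and m = 300 so the toy example runs quickly.

```python
import numpy as np
from sklearn.cluster import SpectralClustering

# Stand-in for backbone patch features: the paper extracts a 60x60 grid of
# codes per image; here we fake a smaller set of 300 feature vectors.
rng = np.random.default_rng(0)
features = rng.normal(size=(300, 64))

# Ng et al.-style spectral clustering into m superpixel clusters.
# The paper uses m = 300; we use m = 10 to match the toy data size.
m = 10
clusterer = SpectralClustering(
    n_clusters=m,
    affinity="nearest_neighbors",  # sparse k-NN affinity graph over the codes
    n_neighbors=10,
    assign_labels="kmeans",        # k-means on the spectral embedding (Ng et al.)
    random_state=0,
)
labels = clusterer.fit_predict(features)
print(labels.shape)  # one cluster id per feature code
```

In the actual pipeline, the resulting cluster labels would be reshaped back onto the patch grid to form superpixels before the recursive partitioning stage.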