Learning Hierarchical Image Segmentation For Recognition and By Recognition

Authors: Tsung-Wei Ke, Sangwoo Mo, Stella X. Yu

ICLR 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our model surpasses ViT in unsupervised part-whole discovery, semantic segmentation, image classification, and efficiency. Notably, our model (trained on unlabeled 1M ImageNet images) outperforms SAM (trained on 11M images and 1 billion masks) by absolute 8% in mIoU on PartImageNet object segmentation.
Researcher Affiliation | Academia | Tsung-Wei Ke (1), Sangwoo Mo (2), Stella X. Yu (1,2); (1) University of California, Berkeley; (2) University of Michigan, Ann Arbor; {twke,stellayu}@berkeley.edu, {swmo,stellayu}@umich.edu
Pseudocode | Yes | We present the pseudo code of the Graph Pooling module and our CAST framework. Algorithm 1: GraphPool. Algorithm 2: Overall framework.
Open Source Code | Yes | *Equal contribution. Code available at https://github.com/twke18/CAST.
Open Datasets | Yes | Our model (trained on unlabeled 1M ImageNet images) outperforms SAM (trained on 11M images and 1 billion masks) by absolute 8% in mIoU on PartImageNet object segmentation. ... ImageNet (Deng et al., 2009) is a generic image classification dataset, annotated with 1,000 object categories (IN-1K). ... MSCOCO (Lin et al., 2014) is a generic scene dataset
Dataset Splits | Yes | ImageNet (Deng et al., 2009)... The training and validation set includes 1.28M and 50K images, respectively. ... Pascal VOC 2012 (Everingham et al., 2010)... We use the augmented training set (Hariharan et al., 2011) with 10,582 images and the validation set with 1,449 images. ... ADE20K Zhou et al. (2019)... The dataset includes 20,210 and 2,000 images for training and validation. ... Pascal Context is also a scene dataset with 4,996 and 5,104 images for training and validation
Hardware Specification | Yes | Our system comprises a 32GB Nvidia Titan V GPU card and two Intel(R) Xeon(R) CPU E5-2630 v4 processors, totaling 20 CPU cores.
Software Dependencies | No | The paper mentions 'PyTorch machine learning framework' but does not provide specific version numbers for PyTorch or any other software dependencies.
Experiment Setup | Yes | We list the hyper-parameters for training using MoCo and DeiT framework in Table 11 and Table 12. We mostly follow the default hyper-parameters used in each framework. We set the same batch_size and total_epochs as our CAST and ViT baselines.
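
For readers unfamiliar with the mIoU metric cited in the Research Type and Open Datasets rows, the following is a minimal sketch of how mean intersection-over-union can be computed from two label maps. It is not taken from the paper or its code release: the function name `mean_iou`, the toy label maps, and the choice to skip classes absent from both maps are assumptions made for illustration. Benchmark protocols often accumulate intersections and unions over a whole dataset rather than per image.

```python
from typing import Sequence


def mean_iou(pred: Sequence[int], target: Sequence[int], num_classes: int) -> float:
    """Mean intersection-over-union over classes present in pred or target.

    Hypothetical helper for illustration; inputs are flattened label maps.
    """
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same number of pixels")
    ious = []
    for c in range(num_classes):
        # Pixels labeled c in both maps.
        inter = sum(1 for p, t in zip(pred, target) if p == c and t == c)
        # Pixels labeled c in at least one map.
        union = sum(1 for p, t in zip(pred, target) if p == c or t == c)
        if union > 0:  # assumption: skip classes absent from both maps
            ious.append(inter / union)
    return sum(ious) / len(ious) if ious else 0.0


# Toy flattened 2x3 label maps (0 = background, 1 = object); made-up data.
target = [0, 0, 1, 1, 1, 0]
pred = [0, 1, 1, 1, 0, 0]
score = mean_iou(pred, target, num_classes=2)
print(f"mIoU = {score:.2f}")
```

Under this definition, an "absolute 8%" gain refers to a difference of 8 points between two such scores (hypothetically, 0.50 versus 0.58), not a relative improvement of 8%.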