Expand-and-Quantize: Unsupervised Semantic Segmentation Using High-Dimensional Space and Product Quantization

Authors: Jiyoung Kim, Kyuhong Shim, Insu Lee, Byonghyo Shim

AAAI 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our extensive experiments demonstrate that EQUSS achieves state-of-the-art results on three standard benchmarks.
Researcher Affiliation | Academia | Jiyoung Kim, Kyuhong Shim, Insu Lee, Byonghyo Shim; Department of Electrical and Computer Engineering, Seoul National University, Korea; {jykim, khshim, islee, bshim}@islab.snu.ac.kr
Pseudocode | No | The paper describes the model architecture and training objective using descriptive text and mathematical equations, but does not include any formal pseudocode blocks or algorithms.
Open Source Code | No | The paper does not contain any explicit statement or link indicating that the source code for the described methodology (EQUSS) is publicly available.
Open Datasets | Yes | We evaluate EQUSS on three standard semantic segmentation datasets and compare with recent SOTA methods. From our empirical experiments on the COCO-Stuff 27 (Caesar, Uijlings, and Ferrari 2018), Cityscapes (Cordts et al. 2016), and Potsdam-3 USS benchmarks, we show that the proposed EQUSS outperforms the recent SOTA (Hamilton et al. 2022) by a substantial margin.
Dataset Splits | No | The paper mentions training images and evaluation processes (linear probing, unsupervised clustering) but does not explicitly provide training, validation, and test dataset splits with percentages or sample counts.
Hardware Specification | No | The paper does not provide specific details about the hardware used to run the experiments, such as GPU models, CPU types, or cloud computing specifications.
Software Dependencies | No | The paper does not list specific software dependencies with version numbers (e.g., Python, PyTorch, CUDA versions) that would be required to reproduce the experiments.
Experiment Setup | Yes | The overall objective function is the sum of the training loss for the expansion head $\mathcal{L}_{\text{head}}$ (Hamilton et al. 2022), the codebook loss $\mathcal{L}_{\text{codebook}}$, and the commitment loss $\mathcal{L}_{\text{commit}}$: $\mathcal{L} = \mathcal{L}_{\text{head}} + \lambda_1 \mathcal{L}_{\text{codebook}} + \lambda_2 \mathcal{L}_{\text{commit}}$, where $\lambda_1, \lambda_2 > 0$ are weighting coefficients. To study the effect of the number of codebooks (M) and the size of each codebook (K), we conduct experiments while varying M and K with a fixed feature dimension.
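To make the quoted objective concrete, the following is a minimal PyTorch sketch, not the authors' code (which the report notes is unavailable), of how a product-quantization step with M codebooks of size K could yield the codebook and commitment losses that are then combined with the head loss. The class name ProductQuantizer, the placeholder head loss, and the specific values of M, K, and the weighting coefficients are illustrative assumptions.

```python
# Sketch only: VQ-style product quantization with M codebooks of size K,
# producing codebook/commitment losses combined with a segmentation-head loss.
import torch
import torch.nn as nn
import torch.nn.functional as F


class ProductQuantizer(nn.Module):
    def __init__(self, feature_dim: int, num_codebooks: int = 8, codebook_size: int = 256):
        super().__init__()
        assert feature_dim % num_codebooks == 0
        self.M = num_codebooks                       # number of codebooks (M)
        self.sub_dim = feature_dim // num_codebooks  # each codebook covers D/M dimensions
        # K codewords per codebook (K); randomly initialized here
        self.codebooks = nn.Parameter(torch.randn(num_codebooks, codebook_size, self.sub_dim))

    def forward(self, z: torch.Tensor):
        # z: (N, D) high-dimensional features from the expansion head
        N, D = z.shape
        z_split = z.view(N, self.M, self.sub_dim)                    # (N, M, D/M)
        # nearest codeword per sub-vector, for every codebook
        dists = torch.cdist(z_split.transpose(0, 1), self.codebooks)  # (M, N, K)
        idx = dists.argmin(dim=-1)                                    # (M, N)
        q = torch.stack([self.codebooks[m, idx[m]] for m in range(self.M)], dim=1)
        q = q.reshape(N, D)

        # VQ-style losses: the stop-gradient decides whether the codebook
        # or the encoder features are pulled toward the other
        codebook_loss = F.mse_loss(q, z.detach())  # moves codewords toward features
        commit_loss = F.mse_loss(z, q.detach())    # keeps features close to codewords
        # straight-through estimator so gradients flow back to the encoder
        q = z + (q - z).detach()
        return q, codebook_loss, commit_loss


# Usage sketch: L = L_head + lambda1 * L_codebook + lambda2 * L_commit
quantizer = ProductQuantizer(feature_dim=512, num_codebooks=8, codebook_size=256)
z = torch.randn(16, 512)        # expanded features for 16 pixels/patches (illustrative)
q, l_codebook, l_commit = quantizer(z)
l_head = torch.tensor(0.0)      # placeholder for the head loss of Hamilton et al. (2022)
lambda1, lambda2 = 1.0, 0.25    # illustrative weighting coefficients, not from the paper
total_loss = l_head + lambda1 * l_codebook + lambda2 * l_commit
```

Varying num_codebooks (M) and codebook_size (K) while keeping feature_dim fixed mirrors the ablation described in the quoted experiment setup.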