CSL: Class-Agnostic Structure-Constrained Learning for Segmentation Including the Unseen

Authors: Hao Zhang, Fang Li, Lu Qi, Ming-Hsuan Yang, Narendra Ahuja

AAAI 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Empirical evaluations demonstrate CSL's prowess in boosting the performance of existing algorithms spanning OOD segmentation, ZS3, and DA segmentation, consistently transcending the state of the art across all three tasks." "Through extensive experimental validation, we ascertain that CSL markedly enhances 10 prevailing techniques across all three segmentation tasks, including OOD segmentation, ZS3, and DA segmentation, consistently outstripping state-of-the-art benchmarks."
Researcher Affiliation | Collaboration | Hao Zhang (1), Fang Li (1), Lu Qi (2), Ming-Hsuan Yang (2, 3), Narendra Ahuja (1). (1) University of Illinois at Urbana-Champaign; (2) University of California, Merced; (3) Google Research
Pseudocode | No | No section labeled "Pseudocode" or "Algorithm" is present.
Open Source Code | No | "Additional details and results for the benchmarks and implementation can be found in the supplementary material, and we plan to make the source code publicly available."
Open Datasets | Yes | "Cityscapes (Cordts et al. 2016) including 19 seen classes are used as the training sets, while OOD images containing other classes beyond the seen classes are utilized for testing purposes." "Table 5 presents a comparison of our proposed CSL method with previous state-of-the-art zero-shot semantic segmentation methods... on COCO-stuff and PASCAL VOC benchmarks." "Notably, our method achieves excellent results on the Synscapes2Cityscapes benchmark, as reported in Table 6."
Dataset Splits | No | While the paper uses terms like "train" and "test" and mentions "SMIYC (AT)-val" in Table 4, it does not explicitly provide split percentages or sample counts for the training, validation, and test sets, nor does it refer to predefined splits precisely enough to reproduce the splitting methodology. Text includes: "In all our experiments, we utilize ResNet50 as the backbone and FPN as the pixel decoder. All experiments for OOD segmentation are performed without any OOD data. In the DA and ZS3 experiments, we use the same training data as the comparative methods." and "Table 4: Quantitative results on SMIYC (Anomaly Track) and Road Anomaly... SMIYC (AT)-val SMIYC (AT)-test Road Anomaly"
Hardware Specification | No | No specific GPU, CPU, or memory details are provided.
Software Dependencies | No | No software names with version numbers are mentioned (e.g., PyTorch version, CUDA version).
Experiment Setup | Yes | "In all our experiments, we utilize ResNet50 as the backbone and FPN as the pixel decoder. Validity loss, region loss, and distill loss, which is the mean Huber loss between the predicted per-pixel distribution and the output of the base teacher network (only for scheme1), are used for optimization." "For context, in our integration experiments with ZegCLIP (Zhou et al. 2023) on the COCO-stuff benchmark, scheme1 demanded around 50K iterations to achieve satisfactory results, whereas scheme2 reached similar benchmarks in just 25K iterations." (See the loss sketch below.)
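For readers attempting reproduction, the distill loss quoted in the Experiment Setup row can be read as a mean Huber loss between the student's per-pixel class distribution and the frozen base teacher's output. The sketch below is a minimal, non-authoritative interpretation: the choice of PyTorch, the function name `distill_loss`, and the application of the loss to softmax probabilities (rather than raw logits) are assumptions, since the paper states none of these details.

```python
import torch
import torch.nn.functional as F

def distill_loss(student_logits: torch.Tensor,
                 teacher_logits: torch.Tensor) -> torch.Tensor:
    """Mean Huber loss between the student's per-pixel class distribution
    and the base teacher network's output (scheme1 in the paper).

    Both inputs are assumed to be (B, C, H, W) segmentation logits.
    Comparing softmax probabilities is an assumption; the paper does not
    say whether logits or probabilities are matched.
    """
    student_prob = student_logits.softmax(dim=1)
    # The base teacher is frozen, so its output should carry no gradient.
    with torch.no_grad():
        teacher_prob = teacher_logits.softmax(dim=1)
    # smooth_l1_loss is PyTorch's Huber loss (quadratic near zero, linear
    # beyond beta=1.0); reduction="mean" averages over batch, classes,
    # and pixels, matching the "mean Huber loss" phrasing.
    return F.smooth_l1_loss(student_prob, teacher_prob, reduction="mean")

# Hypothetical usage with 19 classes, as in the Cityscapes setup the
# paper describes for OOD segmentation:
student = torch.randn(2, 19, 64, 128, requires_grad=True)
teacher = torch.randn(2, 19, 64, 128)
loss = distill_loss(student, teacher)
```

This term would then be summed with the validity and region losses during optimization; the paper does not report the weighting between the three, so any coefficients would likewise be guesses.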