AttrSeg: Open-Vocabulary Semantic Segmentation via Attribute Decomposition-Aggregation

Authors: Chaofan Ma, Yuhuan Yang, Chen Ju, Fei Zhang, Ya Zhang, Yanfeng Wang

NeurIPS 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | To evaluate the effectiveness, we annotate three types of datasets with attribute descriptions, and conduct extensive experiments and ablation studies.
Researcher Affiliation | Academia | Chaofan Ma¹, Yuhuan Yang¹, Chen Ju¹, Fei Zhang¹, Ya Zhang¹,², Yanfeng Wang¹,² — ¹ Coop. Medianet Innovation Center, Shanghai Jiao Tong University; ² Shanghai AI Laboratory. {chaofanma, yangyuhuan, ju_chen, ferenas, ya_zhang, wangyanfeng622}@sjtu.edu.cn
Pseudocode | No | The paper describes the method using figures and textual descriptions but does not include any structured pseudocode or algorithm blocks.
Open Source Code | No | The paper does not contain an explicit statement about releasing source code or a direct link to a code repository for the described methodology.
Open Datasets | Yes | To evaluate the significance of attribute understanding for OVSS, we annotate attribute descriptions on three types of datasets, namely, PASCAL series [13, 16, 35], COCO series [28, 8], and Fantastic Beasts. PASCAL-5^i contains 20 categories that are divided into 4 folds of 5 classes each, i.e., {5^i}_{i=0}^{3}. COCO-20^i is more challenging with 80 categories that are also divided into 4 folds, i.e., {20^i}_{i=0}^{3}, with each fold having 20 categories.
Dataset Splits | Yes | PASCAL-5^i contains 20 categories that are divided into 4 folds of 5 classes each, i.e., {5^i}_{i=0}^{3}. COCO-20^i is more challenging with 80 categories that are also divided into 4 folds, i.e., {20^i}_{i=0}^{3}, with each fold having 20 categories. Of the four folds in the two datasets, one is used for evaluation, while the other three are used for training. We evaluate on the 1.5k validation images with 20 categories (PAS-20).
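The cross-validation fold protocol quoted above (4 folds, one held out for evaluation, three for training) can be sketched as follows. This is a minimal illustration only: the function name and the assumption that classes are numbered 0–19 in consecutive fold order are hypothetical, not taken from the paper.

```python
def split_folds(eval_fold, num_classes=20, num_folds=4):
    """Return (train_classes, eval_classes) for one fold of a
    PASCAL-5^i-style split: classes are partitioned into num_folds
    contiguous groups; one group is held out, the rest are trained on."""
    per_fold = num_classes // num_folds  # 5 classes per fold for PASCAL-5^i
    folds = [list(range(f * per_fold, (f + 1) * per_fold))
             for f in range(num_folds)]
    eval_classes = folds[eval_fold]
    train_classes = [c for f, fold in enumerate(folds)
                     if f != eval_fold
                     for c in fold]
    return train_classes, eval_classes

# Fold 5^0: classes 0-4 are held out, classes 5-19 are used for training.
train_cls, eval_cls = split_folds(eval_fold=0)
```

The same helper covers COCO-20^i with `num_classes=80`, giving 20 held-out classes per fold.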
Hardware Specification | No | The paper does not specify any particular hardware (e.g., GPU models, CPU types, or memory) used for running the experiments.
Software Dependencies | No | The paper mentions software components like CLIP and the AdamW optimizer but does not provide specific version numbers for these or other software dependencies.
Experiment Setup | Yes | We adopt CLIP ViT-L and ResNet-101 as our backbones, and choose aggregation stages L = 4. The numbers of learnable clusters in each stage are (15, 10, 5, 1). During training, the number of sampled attributes is N = 15. The AdamW optimizer is used with a cosine LR scheduler, first warming up for 10 epochs from an initial learning rate of 4e-6 to 1e-3; the weight decay is set to 0.05.
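The learning-rate schedule quoted above (10 warmup epochs from 4e-6 to the peak 1e-3, then cosine decay) can be sketched as a plain function of the epoch index. The total epoch count and the final (minimum) learning rate are assumptions marked in the code; the paper excerpt does not state them.

```python
import math

def lr_at_epoch(epoch, total_epochs=100, warmup_epochs=10,
                init_lr=4e-6, peak_lr=1e-3, min_lr=0.0):
    """Linear warmup followed by cosine decay.

    warmup_epochs, init_lr, and peak_lr follow the quoted setup;
    total_epochs and min_lr are ASSUMED values for illustration.
    """
    if epoch < warmup_epochs:
        # Linear warmup from init_lr to peak_lr over warmup_epochs.
        return init_lr + (peak_lr - init_lr) * epoch / warmup_epochs
    # Cosine decay from peak_lr down to min_lr over the remaining epochs.
    t = (epoch - warmup_epochs) / (total_epochs - warmup_epochs)
    return min_lr + 0.5 * (peak_lr - min_lr) * (1 + math.cos(math.pi * t))
```

In a PyTorch training loop this corresponds to pairing `torch.optim.AdamW` (with `weight_decay=0.05` as quoted) with a warmup-plus-cosine scheduler such as timm's `CosineLRScheduler`.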