Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Towards Open-Vocabulary Remote Sensing Image Semantic Segmentation

Authors: Chengyang Ye, Yunzhi Zhuge, Pingping Zhang

AAAI 2025 | Venue PDF | LLM Run Details

Reproducibility Variable Result LLM Response
Research Type Experimental Experiments show that our method outperforms other methods by a large margin, and our proposed LandDiscover50K improves the performance of OVRSISS methods. The dataset and method will be publicly available.
Researcher Affiliation Academia Chengyang Ye, Yunzhi Zhuge, Pingping Zhang* School of Future Technology, School of Artificial Intelligence, Dalian University of Technology EMAIL, EMAIL
Pseudocode No The paper describes the method and architecture (e.g., Fig. 3) in detail, but it does not contain a clearly labeled 'Pseudocode' or 'Algorithm' block with structured steps.
Open Source Code Yes Code & Datasets https://github.com/yecy749/GSNet
Open Datasets Yes To address the lack of generalizable datasets on OVRSISS, we present LandDiscover50K. ... The dataset and method will be publicly available. Code & Datasets https://github.com/yecy749/GSNet
Dataset Splits No The paper trains on LandDiscover50K and tests on external datasets (FLAIR, FAST, ISPRS Potsdam, FloodNet). While it discusses creating subsets of LandDiscover50K for ablation studies ('randomly sample 10,000, 20,000, 30,000, and 40,000 image-mask pairs from LandDiscover50K to create subsets'), it does not specify explicit train/validation/test splits for the LandDiscover50K dataset itself for general model evaluation or reproduction.
Hardware Specification Yes The batch size is set to 4, and we use two NVIDIA RTX 3090 GPUs for training.
Software Dependencies No Our implementation is based on PyTorch (Paszke et al. 2019) and Detectron2 (Wu et al. 2019). The paper mentions the frameworks but does not provide specific version numbers for PyTorch or Detectron2.
Experiment Setup Yes We train the GSNet with a per-pixel binary cross-entropy loss. Our implementation is based on PyTorch (Paszke et al. 2019) and Detectron2 (Wu et al. 2019). We adopt AdamW (Loshchilov and Hutter 2017) as the optimizer, setting the learning rate to 2×10⁻⁶ for the CLIP, while keeping the DINO fixed during the training process. The remaining parts of our model are randomly initialized and trained with a learning rate of 2×10⁻⁴. The batch size is set to 4, and we use two NVIDIA RTX 3090 GPUs for training. We set the total number of training iterations to 30,000.
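The optimizer configuration above (AdamW with lr 2×10⁻⁶ for CLIP, 2×10⁻⁴ for the randomly initialized parts, DINO frozen) can be sketched as a parameter-grouping helper. This is a minimal illustration, not the authors' code: the name prefixes `clip` and `dino` are hypothetical, and the resulting groups would be passed to an AdamW constructor such as `torch.optim.AdamW(groups)`.

```python
# Hedged sketch of the paper's optimizer setup: partition named parameters
# into per-learning-rate groups, skipping the frozen DINO backbone.
# The "clip"/"dino" prefixes are assumptions for illustration only.
def build_param_groups(named_params):
    clip_params, new_params = [], []
    for name, param in named_params:
        if name.startswith("dino"):
            continue  # DINO stays fixed during training (no optimizer updates)
        elif name.startswith("clip"):
            clip_params.append(param)  # fine-tuned at a small learning rate
        else:
            new_params.append(param)  # randomly initialized modules
    return [
        {"params": clip_params, "lr": 2e-6},  # CLIP: 2x10^-6 per the paper
        {"params": new_params, "lr": 2e-4},   # remaining parts: 2x10^-4
    ]


# Example with placeholder parameter names and dummy tensors:
groups = build_param_groups(
    [("clip.visual.proj", "p0"), ("dino.block0.w", "p1"), ("decoder.head.w", "p2")]
)
```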