Bidirectional Reciprocative Information Communication for Few-Shot Semantic Segmentation

Authors: Yuanwei Liu, Junwei Han, Xiwen Yao, Salman Khan, Hisham Cholakkal, Rao Muhammad Anwer, Nian Liu, Fahad Shahbaz Khan

ICML 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments conducted on both PASCAL-5i and COCO-20i datasets validate the effectiveness of our approach.
Researcher Affiliation | Academia | Yuanwei Liu 1, Junwei Han 1,2, Xiwen Yao 1, Salman Khan 3,4, Hisham Cholakkal 3, Rao Muhammad Anwer 3, Nian Liu 3, Fahad Shahbaz Khan 3,5. 1 Northwestern Polytechnical University; 2 Institute of Artificial Intelligence, Hefei Comprehensive National Science Center; 3 Mohamed bin Zayed University of Artificial Intelligence; 4 Australian National University; 5 CVL, Linköping University. Correspondence to: Nian Liu <liunian228@gmail.com>, Junwei Han <junweihan2010@gmail.com>.
Pseudocode | No | The paper provides mathematical equations and describes procedures in prose, but it does not include any explicitly labeled pseudocode blocks or algorithms.
Open Source Code | Yes | The code is available at https://github.com/LIUYUANWEI98/IFRNet.
Open Datasets | Yes | Datasets. To ensure a fair assessment, our model is evaluated on two benchmark datasets for FSS: the PASCAL-5i dataset (Shaban et al., 2017) and the COCO-20i dataset (Nguyen & Todorovic, 2019). The PASCAL-5i is built upon the PASCAL VOC 2012 dataset (Everingham et al., 2010) with additional annotations from SDS (Hariharan et al., 2011), containing 20 categories across four folds. The COCO-20i dataset, a larger dataset derived from MSCOCO (Lin et al., 2014), consists of 80 categories divided into four folds.
Dataset Splits | Yes | The dataset is divided into a training set Dtrain and a test set Dtest, with the base categories Ctrain and the novel categories Ctest, respectively, where Ctrain ∩ Ctest = ∅. The model learns from Dtrain and is evaluated on Dtest. During training, episodes are created from Dtrain, where K + 1 image-mask pairs of the same base category form one episode. Among them, K pairs are treated as the support set S, while the remaining pair is used as the query set Q. The model uses both S and the query image Iq to predict the mask of the query. Model parameters are optimized under the supervision of the query mask Mq. The testing phase is similar but uses data from Dtest, and the query mask Mq is used to assess the model performance on novel categories. ... Our model is trained on three folds and evaluated on the remaining fold, enabling us to perform cross-validation.
Hardware Specification | Yes | All experiments are conducted using PyTorch on an NVIDIA RTX 2080 Ti GPU for PASCAL-5i, and four GPUs for COCO-20i.
Software Dependencies | No | The paper states that "All experiments are conducted using PyTorch" but does not specify a version number for PyTorch or any other software libraries or dependencies, which is required for reproducibility.
Experiment Setup | Yes | For training, we use the stochastic gradient descent (SGD) optimizer, with a batch size of 4, a learning rate of 0.025, a weight decay of 0.0001, and a momentum value of 0.9. β, λ, and γ are all set as 1.0 for simplicity. The model is trained for 200 epochs on PASCAL-5i and 50 epochs on COCO-20i, with a polynomial annealing policy for learning rate reduction, using a power factor of 0.9.
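The episodic protocol quoted under "Dataset Splits" (K + 1 image-mask pairs of one base category per episode, with disjoint base/novel category folds) can be sketched as follows. This is a minimal illustration, not the paper's released code: the function names `sample_episode` and `fold_split` and the `dataset` dictionary layout are assumptions for the example.

```python
import random

def sample_episode(dataset, category, k=1, rng=None):
    """Sample one K-shot episode: K support pairs plus 1 query pair,
    all of the same base category (K + 1 pairs total).

    `dataset` maps category -> list of (image, mask) pairs; this layout
    is illustrative, not taken from the paper.
    """
    rng = rng or random.Random()
    pairs = rng.sample(dataset[category], k + 1)  # K + 1 distinct pairs
    support, query = pairs[:k], pairs[k]          # support set S, query pair Q
    return support, query

def fold_split(categories, fold, num_folds=4):
    """Hold out one fold of categories as novel (Ctest) and use the
    remaining folds as base (Ctrain), so Ctrain and Ctest are disjoint."""
    per_fold = len(categories) // num_folds
    test = set(categories[fold * per_fold:(fold + 1) * per_fold])
    train = set(categories) - test
    return train, test
```

For PASCAL-5i this yields 5 novel and 15 base categories per fold (20 categories, 4 folds); for COCO-20i, 20 novel and 60 base (80 categories, 4 folds).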
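The polynomial annealing policy mentioned in "Experiment Setup" (base learning rate 0.025, power factor 0.9) is a standard schedule; a plausible form, assuming per-epoch stepping (the paper does not state whether the schedule steps per epoch or per iteration), is:

```python
def poly_lr(base_lr, epoch, max_epochs, power=0.9):
    """Polynomial annealing: lr = base_lr * (1 - epoch / max_epochs) ** power.

    With base_lr=0.025 and power=0.9 as quoted; the exact stepping
    granularity is an assumption.
    """
    return base_lr * (1 - epoch / max_epochs) ** power
```

In a PyTorch training loop, the decay factor `(1 - epoch / max_epochs) ** power` could equivalently be supplied to `torch.optim.lr_scheduler.LambdaLR`, which multiplies the optimizer's base learning rate by that factor each step.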