SimPropNet: Improved Similarity Propagation for Few-shot Image Segmentation

Authors: Siddhartha Gairola, Mayur Hemani, Ayush Chopra, Balaji Krishnamurthy

IJCAI 2020 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our method achieves state-of-the-art results for one-shot and five-shot segmentation on the PASCAL-5i dataset. The paper includes detailed analysis and ablation studies for the proposed improvements and quantitative comparisons with contemporary methods.
Researcher Affiliation | Collaboration | Siddhartha Gairola (2), Mayur Hemani (1), Ayush Chopra (1) and Balaji Krishnamurthy (1); (1) Media and Data Science Research Lab, Adobe Experience Cloud; (2) IIIT Hyderabad; siddhartha.gairola@research.iiit.ac.in, {mayur, ayuchopr, kbalaji}@adobe.com
Pseudocode | No | The paper describes the fusion process using equations (e.g., 'The fusion process can be concisely stated using the following three equations...') but does not present them in a structured pseudocode or algorithm block.
Open Source Code | No | The paper does not provide a link to its own open-source code. It mentions using 'author-provided implementations of the recent state-of-the-art methods [16] and [26]' for comparison purposes.
Open Datasets | Yes | With these improvements we achieve state-of-the-art performance on the PASCAL-5i dataset for both one-shot and five-shot segmentation tasks. We measure the ratio over 1000 pairs of images from the PASCAL VOC dataset [10]. The encoder comprises three layers from a ResNet-50 network pre-trained on ImageNet [14]. (A hypothetical encoder sketch appears after this table.)
Dataset Splits | No | The paper mentions using 'different training splits' of the PASCAL-5i dataset and that 'Testing is done on the images with objects of the 5 withheld classes'. It also refers to a 'validation score', implying a validation set. However, it does not provide explicit percentages, absolute sample counts, or a detailed methodology for how these splits are created, stating only that 'These splits will be released along with the paper upon acceptance.'
Hardware Specification | Yes | The training is done on virtual machines on Amazon AWS with four Tesla V100 GPUs and 16-core Xeon E5 processors.
Software Dependencies | No | No specific software dependencies with version numbers (e.g., 'Python 3.8, PyTorch 1.9') are mentioned. The paper refers to general components like 'ResNet-50' and an 'atrous spatial pyramid pooling based decoder' but not to specific library versions.
Experiment Setup | Yes | The standard SGD optimization algorithm is used, the learning rate is kept at 2.5 × 10^-3, and the batch size is 8 for all training. Training for each split is run for 180 epochs and the checkpoint with the highest validation score (mIoU) is retained. To prevent the network from overfitting on the training classes, we also use an input regularization technique called Input Channel Averaging (ICA), where the query RGB image is mapped to a grayscale input (after normalizing) with a switch probability (initialized at 0.25 for our experiments) that decays exponentially as training progresses. (A hedged sketch of this setup appears after the table.)
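
The encoder quoted under "Open Datasets" (three layers of an ImageNet-pretrained ResNet-50) can be illustrated with a short PyTorch sketch. This is a hypothetical reconstruction, not the authors' code: the class name ResNet50Encoder, the use of torchvision, and the exact stage boundaries (stem plus the first three residual stages) are assumptions.

```python
import torch
import torch.nn as nn
import torchvision


class ResNet50Encoder(nn.Module):
    """Hypothetical encoder: stem + first three residual stages of an
    ImageNet-pretrained ResNet-50 (a stand-in for the paper's encoder)."""

    def __init__(self):
        super().__init__()
        # pretrained=True downloads ImageNet weights; newer torchvision
        # versions prefer the `weights=` argument instead.
        backbone = torchvision.models.resnet50(pretrained=True)
        self.stem = nn.Sequential(backbone.conv1, backbone.bn1,
                                  backbone.relu, backbone.maxpool)
        self.layer1 = backbone.layer1
        self.layer2 = backbone.layer2
        self.layer3 = backbone.layer3

    def forward(self, x):
        x = self.stem(x)
        x = self.layer1(x)
        x = self.layer2(x)
        return self.layer3(x)  # 1024-channel feature map


encoder = ResNet50Encoder().eval()
with torch.no_grad():
    features = encoder(torch.randn(1, 3, 321, 321))
print(features.shape)  # torch.Size([1, 1024, 21, 21])
```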
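The "Experiment Setup" row can likewise be made concrete. The sketch below restates the reported hyper-parameters (SGD, learning rate 2.5 × 10^-3, batch size 8, 180 epochs per split, best-mIoU checkpointing) and shows one plausible reading of Input Channel Averaging. The decay rate, the helper name input_channel_averaging, and the per-sample application point are assumptions; the paper only states that the switch probability starts at 0.25 and decays exponentially.

```python
import random
import torch


def input_channel_averaging(query_rgb: torch.Tensor, switch_prob: float) -> torch.Tensor:
    """Plausible reading of ICA: with probability `switch_prob`, replace the
    (already normalized) 3xHxW query image with its channel-averaged grayscale
    version, replicated back to three channels."""
    if random.random() < switch_prob:
        gray = query_rgb.mean(dim=0, keepdim=True)          # 1xHxW grayscale
        query_rgb = gray.expand_as(query_rgb).contiguous()  # back to 3xHxW
    return query_rgb


# Reported training settings: SGD, lr = 2.5e-3, batch size 8, 180 epochs per
# split, keeping the checkpoint with the highest validation mIoU.
# `model` is a placeholder for the full SimPropNet network (not given here):
# optimizer = torch.optim.SGD(model.parameters(), lr=2.5e-3)

initial_prob, decay = 0.25, 0.98  # exponential decay rate is an assumption
for epoch in range(180):
    switch_prob = initial_prob * (decay ** epoch)
    # ... for each query image in the batch:
    #         query = input_channel_averaging(query, switch_prob)
    #         forward pass, loss, backward pass, optimizer.step() ...
```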