ICNet: Intra-saliency Correlation Network for Co-Saliency Detection

Authors: Wen-Da Jin, Jun Xu, Ming-Ming Cheng, Yi Zhang, Wei Guo

NeurIPS 2020

| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments on three benchmarks show that our ICNet outperforms previous state-of-the-art methods on Co-SOD. Ablation studies validate the effectiveness of our contributions. The PyTorch code is available at https://github.com/blanclist/ICNet. |
| Researcher Affiliation | Academia | Wen-Da Jin¹, Jun Xu², Ming-Ming Cheng², Yi Zhang¹, Wei Guo¹. ¹College of Intelligence and Computing, Tianjin University, Tianjin, China; ²TKLNDST, CS, Nankai University, Tianjin, China. {jwd331,yizhang}@tju.edu.cn, {csjunxu,cmm}@nankai.edu.cn |
| Pseudocode | No | The paper describes the proposed method using figures and equations, but it does not include any explicitly labeled pseudocode or algorithm blocks. |
| Open Source Code | Yes | The PyTorch code is available at https://github.com/blanclist/ICNet. |
| Open Datasets | Yes | The training set is a subset of the COCO dataset [17], containing 9213 images, as suggested by [13, 32, 43]. |
| Dataset Splits | No | The paper mentions a "training set" and "test phases" but does not explicitly describe a separate validation split with specific percentages or sample counts for hyperparameter tuning. Table 4 refers to `ntrain` and `ntest`, but not `nval`. |
| Hardware Specification | Yes | The training and test are performed on an Nvidia Titan Xp GPU. |
| Software Dependencies | No | Our ICNet is implemented in PyTorch [22]. While PyTorch is named, a specific version number is not provided in the text. |
| Experiment Setup | Yes | The additional parameters in our proposed modules and the last three layers are initialized with the random normal distribution of which µ = 0, σ = 0.1. We use Adam [12] as the optimizer to train our ICNet with 60 epochs. The learning rate is 10^-5, and the weight decay is 10^-4. All images are resized into 224 × 224 in both training and test phases. The training images are randomly flipped horizontally for augmentation. In each training iteration, we randomly select a batch of 10 images from an image group due to limited GPU memory. |
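The Experiment Setup entry maps directly onto a few lines of training code. Below is a minimal sketch of that configuration in PyTorch; the `ICNetStub` model and the dummy grouped data are hypothetical stand-ins (the real architecture and data pipeline are in the authors' repository at https://github.com/blanclist/ICNet), while the initialization, optimizer, learning rate, weight decay, epoch count, input size, augmentation, and group-wise batch size follow the paper's description quoted above.

```python
"""Minimal sketch of the training setup described in the Experiment Setup row.

NOT the authors' code (see https://github.com/blanclist/ICNet); the model
stub and dummy data are placeholders, but the hyperparameters follow the paper.
"""
import random
import torch
import torch.nn as nn
import torch.nn.functional as F

class ICNetStub(nn.Module):
    """Placeholder for ICNet: a backbone-like conv plus newly added layers."""
    def __init__(self):
        super().__init__()
        self.backbone = nn.Conv2d(3, 16, 3, padding=1)  # stands in for the pretrained backbone
        self.head = nn.Conv2d(16, 1, 3, padding=1)      # stands in for the "additional" layers

model = ICNetStub()

# Newly added parameters are initialized from N(mu=0, sigma=0.1), per the paper.
for p in model.head.parameters():
    nn.init.normal_(p, mean=0.0, std=0.1)

# Adam with lr = 1e-5 and weight decay = 1e-4, trained for 60 epochs.
optimizer = torch.optim.Adam(model.parameters(), lr=1e-5, weight_decay=1e-4)

# Hypothetical stand-in for the grouped COCO subset: each group holds image/mask
# pairs of one object class, generated directly at the paper's 224 x 224 size.
groups = [[(torch.rand(3, 224, 224), torch.rand(1, 224, 224)) for _ in range(20)]
          for _ in range(3)]

for epoch in range(60):
    for group in groups:
        # Each iteration draws a batch of 10 images from a single image group,
        # which the paper attributes to limited GPU memory.
        samples = random.sample(group, k=10)
        imgs = torch.stack([img for img, _ in samples])
        gts = torch.stack([gt for _, gt in samples])
        # Random horizontal flip (applied per batch here for brevity).
        if random.random() < 0.5:
            imgs, gts = imgs.flip(-1), gts.flip(-1)
        pred = model.head(F.relu(model.backbone(imgs)))
        loss = F.binary_cross_entropy_with_logits(pred, gts)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
```

Note the group-wise sampling: unlike a standard saliency-detection loader, every batch is drawn from a single image group so the network can exploit intra-group correlation, which is the point of Co-SOD; the batch size of 10 is a GPU-memory compromise rather than a modeling choice.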