Energy-Based Generative Cooperative Saliency Prediction

Authors: Jing Zhang, Jianwen Xie, Zilong Zheng, Nick Barnes (pp. 3280-3290)

AAAI 2022

Reproducibility assessment (variable, result, and supporting LLM response):
Research Type: Experimental. LLM response: "Experimental results show that our model can produce a set of diverse and plausible saliency maps of an image, and obtain state-of-the-art performance in both fully supervised and weakly supervised saliency prediction tasks." "We conduct a series of experiments to test the performances of the proposed generative cooperative frameworks for saliency prediction."
Researcher Affiliation: Collaboration. Jing Zhang (1), Jianwen Xie (2), Zilong Zheng (3), Nick Barnes (1); (1) The Australian National University, (2) Cognitive Computing Lab, Baidu Research, (3) University of California, Los Angeles.
Pseudocode: Yes. LLM response: "Algorithm 1: Training the Cooperative Saliency Predictor." "Algorithm 2: Cooperative learning while recovering."
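The cooperative training idea behind Algorithm 1 can be sketched in miniature: a generator proposes an initial prediction, Langevin dynamics driven by an energy function revise it, and the generator is then nudged toward the revised sample. The scalar toy energy, the one-parameter "generator", and the update rule below are illustrative assumptions for exposition, not the paper's actual networks; only the Langevin form and the reported K = 5 steps with step size 0.4 come from the source.

```python
import random

random.seed(0)

# Toy "energy": E(y | x) = (y - x)^2 / 2, lowest when the prediction y
# matches the target x. In the paper the energy is a learned network;
# this quadratic stand-in is an illustrative assumption.
def energy(y, x):
    return 0.5 * (y - x) ** 2

def grad_energy(y, x):
    return y - x

def langevin_revise(y0, x, steps=5, step_size=0.4):
    """K Langevin steps (the paper reports K = 5, EBM step size 0.4):
    y <- y - (s^2 / 2) * dE/dy + s * noise."""
    y = y0
    for _ in range(steps):
        noise = random.gauss(0.0, 1.0)
        y = y - 0.5 * step_size ** 2 * grad_energy(y, x) + step_size * noise
    return y

def cooperative_step(g, x, lr=0.1):
    """One simplified cooperative update: the generator (here just the
    scalar g) proposes, the energy-driven Langevin chain revises, and the
    generator moves toward the revised sample."""
    proposal = g
    revised = langevin_revise(proposal, x)
    g = g + lr * (revised - g)
    return g, revised
```

Repeated cooperative steps drag the generator's proposal toward low-energy (accurate) predictions, which is the intuition the full algorithm scales up to image-sized saliency maps.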
Open Source Code: No. LLM response: The paper states "To demonstrate this idea, we select BASN (Qin et al. 2019) and SCRN (Wu, Su, and Huang 2019b) as base models due to the accessibility of their codes and predictions.", referring to other models' code, but it does not provide any statement or link about the availability of the authors' own code for the described methodology.
Open Datasets: Yes. LLM response: "We use the DUTS dataset (Wang et al. 2017) to train the fully supervised model, and S-DUTS (Zhang et al. 2020b) dataset with scribble annotations to train the weakly supervised model."
Dataset Splits: No. LLM response: The paper mentions "training images" and "training dataset" but does not specify exact training/validation/test splits (e.g., percentages, absolute counts, or predefined splits with citations).
Hardware Specification: Yes. LLM response: "It takes 20 hours to train the model with a batch size of 7 using a single NVIDIA GeForce RTX 2080Ti GPU."
Software Dependencies: No. LLM response: The paper mentions software components such as the Adam optimizer and architectures such as ResNet50 and the MiDaS decoder, but does not provide version numbers for key software dependencies (e.g., Python, PyTorch/TensorFlow, CUDA).
Experiment Setup: Yes. LLM response: "The number of Langevin steps is K = 5 and the Langevin step sizes for EBM and LVM are 0.4 and 0.1. The learning rates of the LVM and EBM are initialized to 5 × 10⁻⁵ and 10⁻³ respectively. We use Adam optimizer with momentum 0.9 and decrease the learning rates by 10% after every 20 epochs. It takes 20 hours to train the model with a batch size of 7 using a single NVIDIA GeForce RTX 2080Ti GPU."
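The quoted setup can be collected into a small configuration sketch with a helper that applies the stated schedule (decrease the learning rates by 10% every 20 epochs). Only the numbers come from the paper; the key names and the `lr_at_epoch` helper are illustrative assumptions.

```python
# Hyperparameters quoted in the paper's experiment setup. The dict keys and
# the schedule helper below are assumed names for illustration only.
CONFIG = {
    "langevin_steps": 5,            # K = 5
    "langevin_step_size_ebm": 0.4,
    "langevin_step_size_lvm": 0.1,
    "lr_lvm": 5e-5,                 # initial LVM learning rate
    "lr_ebm": 1e-3,                 # initial EBM learning rate
    "adam_momentum": 0.9,
    "lr_decay": 0.10,               # "decrease the learning rates by 10%"
    "decay_every_epochs": 20,
    "batch_size": 7,
}

def lr_at_epoch(base_lr, epoch, cfg=CONFIG):
    """Learning rate after decaying by 10% once per completed 20-epoch block."""
    decays = epoch // cfg["decay_every_epochs"]
    return base_lr * (1.0 - cfg["lr_decay"]) ** decays
```

For example, the EBM rate of 1e-3 becomes 9e-4 at epoch 20 and 8.1e-4 at epoch 40 under this reading of the schedule.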