Learned Region Sparsity and Diversity Also Predicts Visual Attention

Authors: Zijun Wei, Hossein Adeli, Minh Hoai Nguyen, Greg Zelinsky, Dimitris Samaras

NeurIPS 2016

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We present here empirical evidence showing that learned region sparsity and diversity can also predict visual attention. We first describe the implementation details of RRSVM and SDR. We then consider attention prediction under three conditions: (1) single-target present, i.e., finding the one instance of a target category appearing in a stimulus image; (2) target absent, i.e., searching for a target category that does not appear in the image; and (3) multiple-targets present, i.e., searching for multiple object categories where at least one is present in the image. Experiments are performed on three datasets: POET [26], PET [11], and MIT900 [8], which are the only available datasets for object search tasks."
Researcher Affiliation | Academia | Zijun Wei (1), Hossein Adeli (2), Gregory Zelinsky (1,2), Minh Hoai (1), Dimitris Samaras (1). 1: Department of Computer Science, 2: Department of Psychology, Stony Brook University. Emails: {zijwei, minhhoai, samaras}@cs.stonybrook.edu; {hossein.adelijelodar, gregory.zelinsky}@stonybrook.edu
Pseudocode | No | The paper provides mathematical formulations but does not include any explicitly labeled 'Algorithm' or 'Pseudocode' blocks.
Open Source Code | No | The paper mentions using a 'publicly available implementation of the AUC evaluation from the MIT saliency benchmark [5]', but this is a third-party tool, not the authors' own source code for the methodology described in the paper. No other statement about releasing their code is found.
Open Datasets | Yes | "Experiments are performed on three datasets: POET [26], PET [11] and MIT900 [8], which are the only available datasets for object search tasks. ... The SDR classifier is trained on the trainval set of PASCAL VOC 2007 dataset [9] unless otherwise stated."
Dataset Splits | Yes | "We randomly selected one third of the images for each category to compile a validation set for tuning the width of the Gaussian blur kernel for all categories. The rest were used as test images. ... We picked a random subset of 150 images to tune the Gaussian blur parameter and reported the results for the remaining 306 images."
Hardware Specification | No | The paper does not provide specific details about the hardware (e.g., GPU model, CPU type) used to run the experiments.
Software Dependencies | No | The paper mentions software components like 'VGG16' and 'AUC evaluation from the MIT saliency benchmark', but it does not specify version numbers for these or any other software dependencies.
Experiment Setup | Yes | "For SDR, the non-maxima suppression threshold is 0.5, and we only keep the top-ranked regions that have non-zero region scores (s_i ≥ 0.01). To generate a priority map, we first associate each pixel with an integer indicating the total number of selected regions covering that pixel, then apply a Gaussian blur kernel to the integer-valued map, with the kernel width tuned on the validation set."
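The priority-map construction quoted in the Experiment Setup row can be sketched as follows. This is a minimal illustration, not the authors' code: the region boxes, scores, the `priority_map` function name, and the blur width `sigma` are placeholders (the paper tunes the kernel width on a validation set), and the separable numpy convolution stands in for whatever Gaussian blur implementation the authors actually used.

```python
import numpy as np

def gaussian_kernel(sigma):
    # 1-D Gaussian kernel, truncated at 3 sigma and normalized to sum to 1
    radius = max(1, int(3 * sigma))
    x = np.arange(-radius, radius + 1, dtype=np.float64)
    k = np.exp(-(x ** 2) / (2.0 * sigma ** 2))
    return k / k.sum()

def priority_map(image_shape, regions, scores, score_thresh=0.01, sigma=5.0):
    """Sketch of the SDR priority map described in the paper.

    regions: (x0, y0, x1, y1) boxes assumed to have already passed
    non-maxima suppression; scores: their region scores. Regions with
    scores below score_thresh (0.01 in the paper) are discarded.
    """
    h, w = image_shape
    counts = np.zeros((h, w), dtype=np.float64)
    for (x0, y0, x1, y1), s in zip(regions, scores):
        if s >= score_thresh:
            # each pixel accumulates the number of selected regions covering it
            counts[y0:y1, x0:x1] += 1.0
    # separable Gaussian blur: convolve rows, then columns
    k = gaussian_kernel(sigma)
    blurred = np.apply_along_axis(lambda r: np.convolve(r, k, mode="same"), 1, counts)
    blurred = np.apply_along_axis(lambda c: np.convolve(c, k, mode="same"), 0, blurred)
    return blurred
```

In a full pipeline, `sigma` would be selected on the validation split described in the Dataset Splits row, and the resulting map would be scored against fixation data with the AUC evaluation the paper cites.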