Real Time Image Saliency for Black Box Classifiers
Authors: Piotr Dabkowski, Yarin Gal
NeurIPS 2017
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We test our approach on CIFAR-10 and ImageNet datasets and show that the produced saliency maps are easily interpretable, sharp, and free of artifacts. We suggest a new metric for saliency and test our method on the ImageNet object localisation task. We achieve results outperforming other weakly supervised methods. |
| Researcher Affiliation | Academia | Piotr Dabkowski pd437@cam.ac.uk University of Cambridge Yarin Gal yarin.gal@eng.cam.ac.uk University of Cambridge and Alan Turing Institute, London |
| Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide an unambiguous statement or link for the release of its source code. |
| Open Datasets | Yes | We test our approach on CIFAR-10 and ImageNet datasets... For the encoder part of the U-Net we use ResNet-50 [3] pre-trained on ImageNet [9]. ... We train each masking model as described in section 4.1 on 250,000 images from the ImageNet training set. ... To verify the performance of our method on a completely different dataset we implemented our saliency detection model for the CIFAR-10 dataset [4]. |
| Dataset Splits | Yes | The localisation box has to have IOU greater than 0.5 with any of the ground truth bounding boxes in order to consider the localisation successful, otherwise, it is counted as an error. The calculated error rates for the three models are presented in table 1. ... The masking model was trained for 20 epochs. Saliency maps for sample images from the validation set are shown in figure 6. |
| Hardware Specification | No | The paper states, 'The complexity of the model is comparable to that of ResNet-50 and it can process more than a hundred 224x224 images per second on a standard GPU (which is sufficient for real-time saliency detection),' but does not specify the GPU model or any other hardware components. |
| Software Dependencies | No | The paper mentions specific models such as 'AlexNet [5], GoogleNet [14] and ResNet-50 [3]', the 'U-Net architecture [8]' and 'FitNet [7]', but does not specify software dependencies with version numbers (e.g., Python, PyTorch, TensorFlow versions). |
| Experiment Setup | Yes | The selected parameters of the objective function are λ1 = 10, λ2 = 10^-3, λ3 = 5, λ4 = 0.3. The first upsampling block has 768 output channels and with each subsequent upsampling block we reduce the number of channels by a factor of two. We train each masking model as described in section 4.1 on 250,000 images from the ImageNet training set. ... During training, we set the probability of the fake label occurrence to 30%. ... In 50% of cases the image A is the blurred version of X (we use a Gaussian blur with σ = 10 to achieve a strong blur) and in the remainder of cases, A is set to a random colour image with the addition of a Gaussian noise. |
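
The localisation criterion quoted in the Dataset Splits row (IOU greater than 0.5 with any ground truth bounding box) can be illustrated with a minimal sketch. The box format `(x_min, y_min, x_max, y_max)` and the function names `iou` and `localisation_success` are assumptions made for illustration; they do not come from the paper, which releases no code.

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes.

    Boxes are (x_min, y_min, x_max, y_max); this layout is an assumption.
    """
    # Corners of the overlapping rectangle, if there is one.
    ix_min = max(box_a[0], box_b[0])
    iy_min = max(box_a[1], box_b[1])
    ix_max = min(box_a[2], box_b[2])
    iy_max = min(box_a[3], box_b[3])

    # Clamp at zero so disjoint boxes give an empty intersection.
    inter = max(0.0, ix_max - ix_min) * max(0.0, iy_max - iy_min)

    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def localisation_success(predicted_box, ground_truth_boxes, threshold=0.5):
    """True if the predicted box has IOU above the threshold with any ground truth box."""
    return any(iou(predicted_box, gt) > threshold for gt in ground_truth_boxes)
```

An image counts towards the localisation error whenever `localisation_success` returns `False` for its predicted box.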
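
The channel schedule quoted in the Experiment Setup row (768 output channels in the first upsampling block, halved in each subsequent block) can be written out as a short sketch. The number of upsampling blocks is not given in the quoted text, so `num_blocks` is a hypothetical parameter, and the function name is likewise illustrative.

```python
def upsampling_channels(num_blocks, first_channels=768):
    """Output channels per upsampling block, halving after the first block.

    first_channels=768 is taken from the quoted setup; num_blocks is an
    assumption, since the quoted text does not state how many blocks exist.
    """
    if num_blocks < 1:
        raise ValueError("num_blocks must be at least 1")
    return [first_channels // (2 ** i) for i in range(num_blocks)]
```

For example, assuming four blocks, `upsampling_channels(4)` gives `[768, 384, 192, 96]`.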