Processing Megapixel Images with Deep Attention-Sampling Models
Authors: Angelos Katharopoulos, Francois Fleuret
ICML 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | This new method is evaluated on three classification tasks, where we show that it allows to reduce computation and memory footprint by an order of magnitude for the same accuracy as classical architectures. We also show the consistency of the sampling that indeed focuses on informative parts of the input images. |
| Researcher Affiliation | Academia | 1Idiap Research Institute, Martigny, Switzerland 2EPFL, Lausanne, Switzerland. |
| Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | The code used for the experiments can be found in https://github.com/idiap/attention-sampling. |
| Open Datasets | Yes | We evaluate attention sampling on an artificial dataset based on the MNIST digit classification task (LeCun et al., 2010). We evaluate attention sampling on the colon cancer dataset introduced by Sirinukunwattana et al. (2016)... We use a subset of the Swedish traffic signs dataset (Larsson & Felsberg, 2011). |
| Dataset Splits | No | For Megapixel MNIST, the paper states: 'We use 5000 images for training and 1000 for testing.' For the colon cancer and speed limits datasets, specific training and testing set sizes are given, but no explicit mention of a separate validation set or its size is provided. |
| Hardware Specification | No | The paper only mentions 'standard single GPU setup' and 'peak GPU memory allocated... as reported by the TensorFlow profiler' without providing specific GPU models, CPU types, or detailed hardware specifications. |
| Software Dependencies | No | The paper mentions 'TensorFlow (Abadi et al., 2016)' but does not provide specific version numbers for TensorFlow or any other software libraries or dependencies used in the experiments. |
| Experiment Setup | Yes | We train our models for 5 independent runs of 500 epochs each... we introduce an entropy regularizer for the attention distribution... normalizing the features in terms of the L2 norm... For ATS, the attention network is a three-layer convolutional network and the feature network is inspired by LeNet-1 (LeCun et al., 1995)... For Deep MIL, we extract 2,500 patches per image at a regular grid... input patches of size 27x27... For Deep MIL, we extract 192 patches on a 12x16 grid with patch size 100x100... we use a cross-entropy loss weighted with the inverse of the prior of each class. We perform 3 independent runs... |
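The Experiment Setup row quotes two ingredients of the paper's attention-sampling pipeline: sampling patches from an attention distribution over the image, and an entropy regularizer that keeps that distribution from collapsing onto a few locations. A minimal NumPy sketch of those two ideas is shown below; the function name, signature, and `entropy_weight` parameter are illustrative assumptions for this report, not the API of the authors' `idiap/attention-sampling` code.

```python
import numpy as np

def sample_patches_with_entropy_reg(attention_logits, n_samples,
                                    entropy_weight=0.01, seed=None):
    """Illustrative sketch: turn an attention map into a distribution over
    patch locations, sample patch indices from it, and compute an entropy
    regularizer term. Names and defaults here are assumptions, not the
    paper's implementation."""
    rng = np.random.default_rng(seed)
    # Softmax over the flattened attention logits gives a distribution
    # over candidate patch positions (e.g. a 12x16 grid -> 192 positions).
    z = attention_logits.ravel() - attention_logits.max()
    probs = np.exp(z) / np.exp(z).sum()
    # Draw patch indices without replacement according to the attention.
    idx = rng.choice(probs.size, size=n_samples, replace=False, p=probs)
    # Entropy of the attention distribution; subtracting a multiple of it
    # from the loss discourages the attention from concentrating too early.
    entropy = -np.sum(probs * np.log(probs + 1e-12))
    return idx, probs, entropy_weight * entropy
```

With uniform logits over a 12x16 grid (as in the speed-limits Deep MIL baseline's patch grid), every position is equally likely and the entropy term is at its maximum, log(192).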