Learning what and where to attend

Authors: Drew Linsley, Dan Shiebler, Sven Eberhardt, Thomas Serre

ICLR 2019

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We first describe a large-scale online experiment (ClickMe) used to supplement ImageNet with nearly half a million human-derived top-down attention maps. Using human psychophysics, we confirm that the identified top-down features from ClickMe are more diagnostic than bottom-up saliency features for rapid image categorization. As a proof of concept, we extend a state-of-the-art attention network and demonstrate that adding ClickMe supervision significantly improves its accuracy and yields visual features that are more interpretable and more similar to those used by human observers.
Researcher Affiliation | Academia | Drew Linsley, Dan Shiebler, Sven Eberhardt and Thomas Serre, Department of Cognitive, Linguistic & Psychological Sciences, Carney Institute for Brain Science, Brown University, Providence, RI 02912; {drew_linsley,thomas_serre}@brown.edu
Pseudocode | No | The paper describes the architecture and processes in text and diagrams (e.g., Fig. 3) but does not include a formally labeled 'Pseudocode' or 'Algorithm' block.
Open Source Code | Yes | See https://github.com/serre-lab/gala_tpu for a reference implementation.
Open Datasets | Yes | We first describe a large-scale online experiment (ClickMe) used to supplement ImageNet with nearly half a million human-derived top-down attention maps. By supplementing ImageNet with the public release of ClickMe attention maps, we hope to spur interest in the development of network architectures that are not only more robust and accurate, but also more interpretable and consistent with human vision. The dataset can be downloaded from http://serre-lab.clps.brown.edu/resource/clickme.
Dataset Splits | Yes | We set aside approximately 5% of the dataset for validation (17,841 images and importance maps), another 5% for testing (17,581 images and importance maps), and the rest for training (329,036 images and importance maps). (See the split sketch after the table.)
Hardware Specification | Yes | This work was also made possible by Cloud TPU hardware resources that Google made available via the TensorFlow Research Cloud (TFRC) program. Models trained on full versions of ILSVRC12 (Table 2 and Table 4) were trained with Google Cloud TPU v2 devices. Models trained on the ClickMe subset of ILSVRC12 were trained with TITAN X Pascal GPUs (Table 1 in the main text and Table 3).
Software Dependencies | No | The paper states 'All models were implemented in Tensorflow' but does not specify a version number for TensorFlow or any other key software dependencies.
Experiment Setup | Yes | Models were trained for 100 epochs and weights were selected that yielded the best validation accuracy. All models were implemented in TensorFlow and were trained from scratch with weights drawn from a scaled normal distribution. We used SGD with Nesterov momentum (Sutskever et al., 2013) and a piecewise-constant learning rate schedule that decayed by 1/10 after 30, 60, 80, and 90 epochs of training. The dimensionality reduction ratio r of the shrinking operation was set to 4. This analysis demonstrated that both object categorization and ClickMe map prediction improve when λ = 6. (See the optimization sketch after the table.)
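
For concreteness, the roughly 5%/5%/90% split quoted in the Dataset Splits row can be sketched as follows. The counts (17,841 validation, 17,581 test, 329,036 training, i.e., 364,458 total) are the paper's; the split function, seeded shuffle, and use of NumPy are illustrative assumptions rather than the authors' code.

```python
# Minimal sketch of the ~5% validation / ~5% test / ~90% train split.
# The function name and seeded shuffle below are assumptions.
import numpy as np

def split_clickme(image_ids, val_frac=0.05, test_frac=0.05, seed=0):
    """Shuffle image ids and carve off validation and test subsets."""
    rng = np.random.default_rng(seed)
    ids = np.array(image_ids)
    rng.shuffle(ids)                            # in-place shuffle
    n_val = int(round(len(ids) * val_frac))
    n_test = int(round(len(ids) * test_frac))
    val = ids[:n_val]
    test = ids[n_val:n_val + n_test]
    train = ids[n_val + n_test:]
    return train, val, test
```

Note that exact 5% fractions of 364,458 images would give 18,223 per split, not the reported 17,841 and 17,581, which is consistent with the paper's hedge of "approximately 5%".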
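The Experiment Setup row can likewise be sketched with TensorFlow/Keras APIs. Only Nesterov SGD, the 1/10 learning rate decays after epochs 30, 60, 80, and 90, and λ = 6 come from the quoted text; the TensorFlow 2 API choice, base learning rate, momentum value, batch size, and the L2 form of the ClickMe map loss are all assumptions.

```python
# A hedged sketch of the optimizer, learning rate schedule, and
# lambda-weighted ClickMe loss described above. Values marked "assumed"
# are not stated in this report.
import tensorflow as tf

steps_per_epoch = 329_036 // 64          # assumed batch size of 64
base_lr = 0.1                            # assumed base learning rate

# Piecewise-constant schedule: decay by 1/10 after epochs 30, 60, 80, 90.
schedule = tf.keras.optimizers.schedules.PiecewiseConstantDecay(
    boundaries=[e * steps_per_epoch for e in (30, 60, 80, 90)],
    values=[base_lr * f for f in (1.0, 0.1, 0.01, 0.001, 0.0001)],
)
optimizer = tf.keras.optimizers.SGD(
    learning_rate=schedule, momentum=0.9, nesterov=True)

def total_loss(labels, logits, human_maps, predicted_maps, lam=6.0):
    """Cross-entropy categorization loss plus lambda-weighted map loss.

    lambda = 6 is the value the paper reports as improving both object
    categorization and ClickMe map prediction; the L2 map loss is assumed.
    """
    ce = tf.reduce_mean(
        tf.nn.sparse_softmax_cross_entropy_with_logits(
            labels=labels, logits=logits))
    map_loss = tf.reduce_mean(tf.square(human_maps - predicted_maps))
    return ce + lam * map_loss
```

Per the quoted setup, training would run this optimizer for 100 epochs and keep the checkpoint with the best validation accuracy.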