Learning Deep Features for Scene Recognition using Places Database

Authors: Bolei Zhou, Agata Lapedriza, Jianxiong Xiao, Antonio Torralba, Aude Oliva

NeurIPS 2014 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Using CNN, we learn deep features for scene recognition tasks, and establish new state-of-the-art results on several scene-centric datasets. ... Table 1: Classification accuracy on the test set of Places 205 and the test set of SUN 205.
Researcher Affiliation | Academia | Massachusetts Institute of Technology, Princeton University, Universitat Oberta de Catalunya
Pseudocode | No | No structured pseudocode or algorithm blocks were found in the paper.
Open Source Code | Yes | The database and pre-trained networks are available at http://places.csail.mit.edu
Open Datasets | Yes | Here we introduce Places, a scene-centric image dataset 60 times larger than the SUN database [24]. With this database and a standard CNN architecture, we establish new baselines of accuracies on various scene datasets (Scene15 [17, 13], MIT Indoor67 [19], SUN database [24], and SUN Attribute Database [18]). ... The database and pre-trained networks are available at http://places.csail.mit.edu
Dataset Splits | Yes | we randomly select 2,448,873 images from 205 categories of Places (referred to as Places 205) as the train set, with minimum 5,000 and maximum 15,000 images per category. The validation set contains 100 images per category and the test set contains 200 images per category (a total of 41,000 images). (A split-construction sketch follows the table.)
Hardware Specification | Yes | Places-CNN is trained using the Caffe package on a GPU NVIDIA Tesla K40.
Software Dependencies | No | The paper mentions the 'Caffe package' and a 'linear SVM' but does not provide specific version numbers for these software components.
Experiment Setup | Yes | It took about 6 days to finish 300,000 iterations of training. The network architecture of Places-CNN is the same as the one used in the Caffe reference network [10]. The classifier is a linear SVM with the same default parameters for the two deep features (C=1) [6]. (Feature-extraction and SVM sketches follow the table.)
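
The dataset-splits row quotes only per-category counts, so a minimal sketch of how such a split could be rebuilt is given below. It assumes a hypothetical images_by_category mapping from category name to image paths; this illustrates the quoted counts and is not the authors' released split.

```python
import random

def split_places205(images_by_category, val_per_cat=100, test_per_cat=200,
                    train_max=15000, seed=0):
    """Illustrative per-category split matching the quoted counts:
    100 validation and 200 test images per category, and up to 15,000
    training images per category (the paper guarantees at least 5,000)."""
    rng = random.Random(seed)
    train, val, test = [], [], []
    for category, paths in images_by_category.items():
        paths = list(paths)
        rng.shuffle(paths)
        val += [(category, p) for p in paths[:val_per_cat]]
        test += [(category, p) for p in paths[val_per_cat:val_per_cat + test_per_cat]]
        train += [(category, p) for p in
                  paths[val_per_cat + test_per_cat:val_per_cat + test_per_cat + train_max]]
    return train, val, test
```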
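
Because the pre-trained Places-CNN shares its architecture with the Caffe reference network, its features can be extracted with the stock pycaffe interface. The sketch below is a minimal example assuming placeholder file names (places205CNN_deploy.prototxt, places205CNN.caffemodel, places205_mean.npy) and an example image; the actual files are those distributed at http://places.csail.mit.edu.

```python
# Minimal sketch: extract fc7 features from the pre-trained Places-CNN with pycaffe.
import numpy as np
import caffe

caffe.set_mode_gpu()  # the paper trained on an NVIDIA Tesla K40; caffe.set_mode_cpu() also works

net = caffe.Net('places205CNN_deploy.prototxt',  # placeholder: released network definition
                'places205CNN.caffemodel',       # placeholder: released trained weights
                caffe.TEST)

# Standard Caffe preprocessing: HWC -> CHW, mean subtraction, [0,1] -> [0,255], RGB -> BGR.
transformer = caffe.io.Transformer({'data': net.blobs['data'].data.shape})
transformer.set_transpose('data', (2, 0, 1))
transformer.set_mean('data', np.load('places205_mean.npy').mean(1).mean(1))  # placeholder mean file
transformer.set_raw_scale('data', 255)
transformer.set_channel_swap('data', (2, 1, 0))

image = caffe.io.load_image('example_scene.jpg')  # placeholder input image
net.blobs['data'].data[0, ...] = transformer.preprocess('data', image)
net.forward()
fc7_feature = net.blobs['fc7'].data[0].copy()  # 4096-d deep feature used by the SVM sketch below
```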
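
The experiment-setup row states that the classifier is a linear SVM with default parameters (C=1) on the deep features. A minimal stand-in sketch is shown below using scikit-learn's LinearSVC (a LIBLINEAR wrapper, in the spirit of the paper's reference [6]); the feature and label arrays are assumed to have been precomputed with the extraction sketch above.

```python
# Minimal sketch, assuming fc7 features and integer scene labels saved as .npy files (placeholder paths).
import numpy as np
from sklearn.svm import LinearSVC

train_features = np.load('train_fc7.npy')
train_labels = np.load('train_labels.npy')
test_features = np.load('test_fc7.npy')
test_labels = np.load('test_labels.npy')

clf = LinearSVC(C=1.0)  # same default regularization constant as quoted in the paper (C=1)
clf.fit(train_features, train_labels)
print('top-1 classification accuracy: %.3f' % clf.score(test_features, test_labels))
```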