Latent Semantic Representation Learning for Scene Classification

Authors: Xin Li, Yuhong Guo

ICML 2014 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experiments conducted on standard scene recognition tasks demonstrate the efficacy of the proposed approach, comparing to the state-of-the-art scene recognition methods.
Researcher Affiliation | Academia | Xin Li XINLI@TEMPLE.EDU Yuhong Guo YUHONG@TEMPLE.EDU Department of Computer and Information Sciences, Temple University, Philadelphia, PA 19122, USA
Pseudocode | Yes | Algorithm 1 Projected gradient descent algorithm
Open Source Code | No | The paper does not provide any statement or link regarding the public availability of its source code.
Open Datasets | Yes | We evaluated the proposed method on 3 standard scene datasets: MIT Label Me Urban and Natural Scene (Label Me) (Oliva & Torralba, 2001), 15 Natural Scene dataset (Lazebnik et al., 2006) (Scene 15) and UIUC Sports (Li & Fei-Fei, 2007).
Dataset Splits | Yes | In all experiments, we randomly selected 80 images per category for training and used the rest for testing for all methods except the convolutional neural networks which need more training data. [...] In each experiment, we used 5-fold cross-validation technique to select the trade-off parameters for all methods.
Hardware Specification | No | The paper does not provide specific details about the hardware (e.g., GPU/CPU models, memory) used for running the experiments.
Software Dependencies | No | The paper mentions types of models and general approaches but does not provide specific software library names with version numbers, such as "Python 3.x" or "TensorFlow X.Y.Z".
Experiment Setup | Yes | For the proposed method, we conducted parameter selection for the trade-off parameters γg and γz from the set [0.005, 0.05, 0.1, 0.5, 1, 5], and performed selection for µ from the set [0.1, 0.5, 1, 5, 10], while setting γf = 0.5 and all {αi} as 1. We treated each image as a bag of 16×16 patches and extracted a HOG feature vector with length 72 (Dalal & Triggs, 2005) from each patch. We further normalized each HOG vector to have unit L2-norm.
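The Pseudocode row refers to the paper's Algorithm 1, described only as a projected gradient descent routine. Below is a minimal, hypothetical sketch of the generic projected gradient descent pattern that name implies; the objective, step size, and feasible set (here a probability simplex) are placeholders for illustration, not the authors' exact formulation.

```python
import numpy as np

def project_onto_simplex(v):
    """Euclidean projection onto the probability simplex.

    A common choice of feasible set in latent-representation models;
    the paper's actual constraint set may differ.
    """
    u = np.sort(v)[::-1]
    css = np.cumsum(u)
    rho = np.nonzero(u * np.arange(1, len(v) + 1) > (css - 1))[0][-1]
    theta = (css[rho] - 1.0) / (rho + 1.0)
    return np.maximum(v - theta, 0.0)

def projected_gradient_descent(grad_fn, z0, step=0.1, n_iters=100):
    """Generic scheme: take a gradient step, then project the iterate
    back onto the feasible set."""
    z = project_onto_simplex(z0)
    for _ in range(n_iters):
        z = project_onto_simplex(z - step * grad_fn(z))
    return z

# Toy usage: minimize ||z - c||^2 over the simplex.
c = np.array([0.7, 0.2, 0.4])
z_star = projected_gradient_descent(lambda z: 2.0 * (z - c), np.zeros_like(c))
```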
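The Dataset Splits and Experiment Setup rows quote concrete numbers: 80 training images per category with the rest held out, 16×16 patches each described by a 72-dimensional HOG vector normalized to unit L2 norm, and grids for γg, γz and µ selected by 5-fold cross-validation. The sketch below shows one way such a protocol could be set up; the split helper, the use of scikit-image's hog, and the HOG parameters are assumptions for illustration and do not reproduce the paper's exact 72-dimensional descriptor.

```python
import itertools
import numpy as np
# skimage.feature.hog is one possible HOG implementation; the paper cites
# Dalal & Triggs (2005) but does not name a library.
from skimage.feature import hog
from skimage.util import view_as_blocks

def split_per_category(labels, n_train=80, rng=None):
    """Randomly pick n_train indices per class for training; the rest are test."""
    rng = rng or np.random.default_rng(0)
    train_idx, test_idx = [], []
    for c in np.unique(labels):
        idx = rng.permutation(np.where(labels == c)[0])
        train_idx.extend(idx[:n_train])
        test_idx.extend(idx[n_train:])
    return np.array(train_idx), np.array(test_idx)

def patch_hog_features(image, patch_size=16):
    """Treat a 2-D grayscale image as a bag of 16x16 patches and compute one
    L2-normalized HOG vector per patch (the paper reports length 72)."""
    h = image.shape[0] // patch_size * patch_size
    w = image.shape[1] // patch_size * patch_size
    blocks = view_as_blocks(image[:h, :w], (patch_size, patch_size))
    feats = []
    for patch in blocks.reshape(-1, patch_size, patch_size):
        f = hog(patch, orientations=8, pixels_per_cell=(8, 8),
                cells_per_block=(1, 1))  # parameters chosen for illustration only
        feats.append(f / (np.linalg.norm(f) + 1e-12))  # unit L2 norm
    return np.vstack(feats)

# Hyperparameter grids quoted in the Experiment Setup row.
gamma_g_grid = gamma_z_grid = [0.005, 0.05, 0.1, 0.5, 1, 5]
mu_grid = [0.1, 0.5, 1, 5, 10]
for gamma_g, gamma_z, mu in itertools.product(gamma_g_grid, gamma_z_grid, mu_grid):
    pass  # train with (gamma_g, gamma_z, mu) and score by 5-fold cross-validation
```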