Automatic Discovery and Optimization of Parts for Image Classification
Authors: Sobhan Naderi Parizi, Andrea Vedaldi, Andrew Zisserman, and Pedro Felzenszwalb
ICLR 2015
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We present experiments with both HOG (Dalal & Triggs (2005)) and CNN (Krizhevsky et al. (2012)) features and improve the state-of-the-art results on the MIT-indoor dataset (Quattoni & Torralba (2009)) using CNN features. |
| Researcher Affiliation | Academia | Brown University; University of Oxford; Brown University |
| Pseudocode | Yes | Algorithm 1: Joint training of model parameters by optimizing O(u, w) in Equation 6. Algorithm 2: Fast optimization of the convex bound B_u(w, w_old) using hard example mining. Algorithm 3: Fast QP solver for optimizing B_C. (A generic hard-example-mining sketch appears after the table.) |
| Open Source Code | No | The paper mentions using a third-party tool, Caffe, but it does not provide an explicit statement about releasing its own source code for the described methodology, nor does it provide a link to a code repository. |
| Open Datasets | Yes | We evaluate our methods on the MIT-indoor dataset (Quattoni & Torralba (2009)). The hybrid network is pre-trained on images from ImageNet (Deng et al. (2009)) and PLACES (Zhou et al. (2014)) datasets. |
| Dataset Splits | No | The paper states: "The dataset has 67 indoor scene classes. There are about 80 training and 20 test images per class." While it mentions training and test sets, it does not specify a separate validation split or explicit percentages/counts for data partitioning, nor does it refer to predefined splits with citations. |
| Hardware Specification | Yes | In our current implementation it takes 5 days to do joint training with 120 shared parts on the full MIT-indoor dataset on a 16-core machine using HOG features. It takes 2.5 days to do joint training with 372 parts on the full dataset on an 8-core machine using 60-dimensional PCA-reduced CNN features. |
| Software Dependencies | No | The paper states: "We extract CNN features using Caffe (Jia et al. (2014))." It mentions Caffe, but does not provide a specific version number for this or any other software dependency. |
| Experiment Setup | Yes | HOG features: We resize images (maintaining aspect ratio) to have about 2.5M pixels. We extract 32-dimensional HOG features... at multiple scales. Our HOG pyramid has 3 scales per octave... Each part filter wj models a 6×6 grid of HOG features... CNN features: We extract CNN features at multiple scales from overlapping patches of fixed size 256×256 and with stride value 256/3 = 85. We resize images (maintaining aspect ratio) to have about 5M pixels in the largest scale. We use a scale pyramid with 2 scales per octave. (A minimal geometry sketch follows the table.) |
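
The Pseudocode row names a cache-based optimization with hard example mining (Algorithm 2). The sketch below illustrates only the generic hard-example-mining pattern, using scikit-learn's `LinearSVC` as a stand-in solver; it is not the paper's Algorithm 2 or its fast QP solver, and all function names are illustrative.

```python
# Generic hard-example-mining loop: fit on a cache, rescan for margin
# violations, grow the cache, repeat. Stand-in for a cache-based solver.
import numpy as np
from sklearn.svm import LinearSVC

def mine_hard_examples(X, y, w, b, margin=1.0):
    """Indices of examples violating the margin under the current model (y in {-1, +1})."""
    scores = X @ w + b
    return np.where(y * scores < margin)[0]

def train_with_mining(X, y, init_idx, rounds=5):
    """Alternate between fitting on a cache and adding newly violated examples."""
    cache = set(int(i) for i in init_idx)      # init_idx should cover both classes
    clf = LinearSVC(C=1.0)
    for _ in range(rounds):
        idx = np.array(sorted(cache))
        clf.fit(X[idx], y[idx])                # optimize on the cached examples only
        hard = mine_hard_examples(X, y, clf.coef_.ravel(), clf.intercept_[0])
        if set(hard.tolist()) <= cache:        # no new margin violations: converged
            break
        cache |= set(hard.tolist())            # grow the cache with hard examples
    return clf
```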
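
The Experiment Setup row quotes the feature-extraction geometry: resizing to a target pixel count while keeping the aspect ratio, a geometric scale pyramid with a fixed number of scales per octave, and overlapping 256×256 patches at stride 85. The following is a minimal reconstruction of that geometry under the stated numbers, not the authors' code; the function names are illustrative.

```python
# Sketch of the patch-extraction geometry: target pixel count, scale pyramid,
# and overlapping 256x256 patches at stride 256/3 = 85.
import math

def resize_to_pixel_count(width, height, target_pixels):
    """Return new (width, height) with roughly target_pixels, keeping aspect ratio."""
    scale = math.sqrt(target_pixels / (width * height))
    return int(round(width * scale)), int(round(height * scale))

def pyramid_scales(scales_per_octave, num_levels):
    """Geometric scale factors; each octave halves the image size."""
    step = 2.0 ** (-1.0 / scales_per_octave)
    return [step ** i for i in range(num_levels)]

def patch_grid(width, height, patch=256, stride=85):
    """Top-left corners of overlapping patches covering the image."""
    xs = range(0, max(width - patch, 0) + 1, stride)
    ys = range(0, max(height - patch, 0) + 1, stride)
    return [(x, y) for y in ys for x in xs]

# Example: a 3000x2000 image resized to about 5M pixels, 2 scales per octave.
w, h = resize_to_pixel_count(3000, 2000, 5_000_000)
for s in pyramid_scales(scales_per_octave=2, num_levels=6):
    sw, sh = int(round(w * s)), int(round(h * s))
    if min(sw, sh) < 256:                      # stop once patches no longer fit
        break
    print(f"scale {s:.3f}: {sw}x{sh}, {len(patch_grid(sw, sh))} patches")
```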