Do Convnets Learn Correspondence?

Authors: Jonathan L Long, Ning Zhang, Trevor Darrell

NeurIPS 2014

Reproducibility assessment (variable — result — supporting LLM response):
- Research Type — Experimental: "In this paper, we study the effectiveness of convnet activation features for tasks requiring correspondence. We present evidence that convnet features localize at a much finer scale than their receptive field sizes, that they can be used to perform intraclass alignment as well as conventional hand-engineered features, and that they outperform conventional features in keypoint prediction on objects from PASCAL VOC 2011 [4]."
- Researcher Affiliation — Academia: Jonathan Long, Ning Zhang, Trevor Darrell; University of California, Berkeley ({jonlong, nzhang, trevor}@cs.berkeley.edu).
- Pseudocode — No: No structured pseudocode or algorithm blocks (clearly labeled algorithm sections or code-like formatted procedures) were found in the paper.
- Open Source Code — No: The paper states that "our network is the publicly available caffe reference model", referring to a third-party tool, but does not provide concrete access to source code for the methodology developed in the paper.
- Open Datasets — Yes: "We perform experiments using a network architecture almost identical to that popularized by Krizhevsky et al. [2] and trained for classification using the 1.2 million images of the ILSVRC 2012 challenge dataset [1]. All experiments are implemented using caffe [27], and our network is the publicly available caffe reference model."
- Dataset Splits — Yes: "We set the SVM parameter C = 10⁻⁶ for all experiments based on five-fold cross validation on the training set (see Figure 5)."
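The parameter selection quoted above is plain five-fold cross validation over a grid of C values. A minimal pure-Python sketch of that procedure, where the candidate grid and the `train_and_score` callback are illustrative stand-ins (the paper trains a linear SVM at each C; nothing below comes from its code):

```python
from statistics import mean

def five_fold_indices(n, k=5):
    """Split range(n) into k contiguous folds of near-equal size."""
    sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    folds, start = [], 0
    for size in sizes:
        folds.append(list(range(start, start + size)))
        start += size
    return folds

def select_C(candidates, train_and_score, n_samples, k=5):
    """Return the C with the best mean held-out score over k folds.

    train_and_score(C, train_idx, val_idx) -> validation score; it
    stands in for training a classifier with parameter C on train_idx
    and scoring it on val_idx.
    """
    folds = five_fold_indices(n_samples, k)
    best_C, best_score = None, float("-inf")
    for C in candidates:
        scores = [
            train_and_score(
                C,
                [j for f in folds[:i] + folds[i + 1:] for j in f],  # train folds
                folds[i],                                           # held-out fold
            )
            for i in range(k)
        ]
        if mean(scores) > best_score:
            best_C, best_score = C, mean(scores)
    return best_C
```

For example, `select_C([1e-4, 1e-6, 1e-8], scorer, 100)` returns whichever candidate earns the highest mean validation score from `scorer`.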
- Hardware Specification — No: No specific hardware details (exact GPU/CPU models, processor types, or memory amounts) used for running the experiments were provided in the paper.
- Software Dependencies — No: The paper states "All experiments are implemented using caffe [27]", but does not provide a specific version number for Caffe or any other software dependency.
- Experiment Setup — Yes: "We set the SVM parameter C = 10⁻⁶ for all experiments based on five-fold cross validation on the training set (see Figure 5)... We rescale each bounding box to 500×500 and compute conv5 (with a stride of 16 pixels)... For each keypoint, we train a linear SVM with hard negative mining... We combine these to yield a final score f(Xᵢ) = s(Xᵢ)^(1−η) p(Xᵢ)^η, where η ∈ [0, 1] is a tradeoff parameter. In our experiments, we set η = 0.1 by cross validation."
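The final score quoted above is a geometric tradeoff between the SVM appearance score and the keypoint prior, with η sliding between pure appearance (η = 0) and pure prior (η = 1). A hedged sketch of that combination, assuming both inputs have already been mapped to positive values (the excerpt does not specify this mapping):

```python
def combined_score(s, p, eta=0.1):
    """Geometric combination f = s^(1 - eta) * p^eta of an appearance
    score s and a prior p; eta = 0.1 is the paper's cross-validated
    setting. Requires positive inputs (an assumption, not stated in
    the quoted excerpt)."""
    if s <= 0 or p <= 0:
        raise ValueError("s and p must be positive for the power combination")
    return (s ** (1.0 - eta)) * (p ** eta)
```

With η = 0 the function returns s unchanged, and with η = 1 it returns p, which is why η is described as a tradeoff parameter.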