Learning Local Invariant Mahalanobis Distances

Authors: Ethan Fetaya, Shimon Ullman

ICML 2015 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We compared running the optimization with an SVM solver (Chang & Lin, 2011) to solving it as a semidefinite problem and as a quadratic problem (relaxing the semidefinite constraint). The MNIST dataset is a well-known digit recognition dataset, comprising 28×28 grayscale images on which we perform deskewing preprocessing. For each of the 60,000 training images we computed a local Mahalanobis distance and a local invariant Mahalanobis distance, by training each image separately using all the non-class training images as negative examples (ignoring all same-class images). As can be seen in Table 1, we perform much better than exemplar SVM and are comparable with MLMNN. (A sketch of this optimization appears after the table.)
Researcher Affiliation | Academia | Ethan Fetaya (ETHAN.FETAYA@WEIZMANN.AC.IL), Weizmann Institute of Science; Shimon Ullman (SHIMON.ULLMAN@WEIZMANN.AC.IL), Weizmann Institute of Science
Pseudocode | No | The paper describes algorithms mathematically and through prose, but does not include structured pseudocode or an algorithm block.
Open Source Code | No | The paper does not provide any concrete access to source code for the described methodology. There are no repository links or explicit statements about code availability.
Open Datasets | Yes | The MNIST dataset is a well-known digit recognition dataset, comprising 28×28 grayscale images on which we perform deskewing preprocessing. LFW is a challenging dataset containing 13,233 face images of 5,749 different individuals with a high level of variability. We used the aligned images (Huang et al., 2012), which we represented using HOG features (Dalal & Triggs, 2005).
Dataset Splits | No | For each of the 60,000 training images we computed a local Mahalanobis distance and a local invariant Mahalanobis distance, by training each image separately using all the non-class training images as negative examples (ignoring all same-class images). The LFW dataset is divided into 10 subsets, where the task is to classify 600 pairs of images from one subset as same/not-same using the other 9 subsets as training data.
Hardware Specification | No | The paper discusses memory usage (e.g., '24.6Gb') and mentions running solvers on machines, but does not provide specific hardware details such as CPU/GPU models or memory specifications of the experimental machines.
Software Dependencies | No | The paper mentions software such as 'LIBSVM (Chang & Lin, 2011)', 'YALMIP', and 'SCS (O'Donoghue et al., 2013)', but does not provide specific version numbers for these tools.
Experiment Setup | Yes | For each of the 60,000 training images we computed a local Mahalanobis distance and a local invariant Mahalanobis distance, by training each image separately using all the non-class training images as negative examples (ignoring all same-class images). For local invariance, we used 8 one-pixel translations and 6 small rotations as our transformations. At test time we performed k-NN classification with k = 3 using the local metrics. We used the aligned images (Huang et al., 2012), which we represented using HOG features (Dalal & Triggs, 2005). For each test pair (x1, x2) we compute their local Mahalanobis matrices, M1 and M2, using the training set and use (x2 − x1)^T M1 (x2 − x1) + (x1 − x2)^T M2 (x1 − x2) as their similarity score. (See the scoring sketch after the table.)
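
The optimization comparison quoted under Research Type (SVM solver vs. semidefinite program vs. quadratic relaxation) relies on the fact that a squared Mahalanobis distance (x_i − x)^T M (x_i − x) is linear in the matrix M, so margin constraints over negative examples yield an SVM-like quadratic program. The following is a minimal sketch of that idea, assuming a hinge-loss objective with a Frobenius-norm regularizer; the function name `local_mahalanobis`, the parameter `C`, and the exact objective are illustrative assumptions, not the paper's verbatim formulation.

```python
import numpy as np
import cvxpy as cp

def local_mahalanobis(x, negatives, C=1.0, enforce_psd=True):
    """Sketch: learn a local Mahalanobis matrix M around an exemplar x.

    Assumed (hypothetical) objective: every negative example x_i should have
    squared distance (x_i - x)^T M (x_i - x) >= 1, up to hinge slack, plus a
    Frobenius-norm regularizer on M. Because the distance is linear in M,
    this is an SVM-like problem; enforce_psd=True gives the semidefinite
    version, enforce_psd=False the QP relaxation mentioned in the paper.
    """
    d = x.shape[0]
    M = cp.Variable((d, d), symmetric=True)
    diffs = negatives - x                                   # (n, d), one row per negative
    dists = cp.sum(cp.multiply(diffs @ M, diffs), axis=1)   # (x_i - x)^T M (x_i - x)
    hinge = cp.sum(cp.pos(1 - dists))                       # margin violations
    objective = cp.Minimize(0.5 * cp.sum_squares(M) + C * hinge)
    constraints = [M >> 0] if enforce_psd else []
    cp.Problem(objective, constraints).solve()
    return M.value
```

Because the distance is linear in M, the relaxed problem can equivalently be handed to a linear SVM solver over the features vec((x_i − x)(x_i − x)^T), which presumably is how the LIBSVM comparison in the quote was run.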
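
The test-time procedures quoted under Experiment Setup (k-NN with k = 3 under a local metric, and the symmetric pair score used on LFW) are straightforward to express in code. The sketch below assumes NumPy arrays and that the local matrices have already been learned; the helper names `pair_score` and `local_knn_predict`, and the majority-vote tie-breaking, are assumptions for illustration.

```python
import numpy as np

def pair_score(x1, x2, M1, M2):
    """LFW verification score from the quote:
    (x2 - x1)^T M1 (x2 - x1) + (x1 - x2)^T M2 (x1 - x2)."""
    d12 = x2 - x1
    d21 = x1 - x2
    return d12 @ M1 @ d12 + d21 @ M2 @ d21

def local_knn_predict(x_test, M_test, X_train, y_train, k=3):
    """k-NN (k = 3 in the paper), assuming distances to the training points
    are measured with the test point's own local Mahalanobis matrix M_test."""
    diffs = X_train - x_test                                 # (n, d)
    dists = np.einsum('ij,jk,ik->i', diffs, M_test, diffs)   # squared local distances
    nearest = np.argsort(dists)[:k]
    labels, counts = np.unique(y_train[nearest], return_counts=True)
    return labels[np.argmax(counts)]                         # majority vote (assumption)
```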