Sparse Compositional Metric Learning

Authors: Yuan Shi, Aurélien Bellet, Fei Sha

AAAI 2014

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Empirically, we evaluate our algorithms on several datasets against state-of-the-art metric learning methods. The results are consistent with our theoretical findings and demonstrate the superiority of our approach in terms of classification performance and scalability."
Researcher Affiliation | Academia | "Yuan Shi, Aurélien Bellet, and Fei Sha, Department of Computer Science, University of Southern California, Los Angeles, CA 90089, USA ({yuanshi,bellet,feisha}@usc.edu)"
Pseudocode | No | No pseudocode or clearly labeled algorithm blocks were found.
Open Source Code | Yes | "The MATLAB code for our methods is available at http://www-bcf.usc.edu/~bellet/."
Open Datasets | Yes | "We use 6 datasets from UCI and BBC (see Table 1). The dimensionality of USPS and BBC is reduced to 100 and 200 using PCA to speed up computation. We normalize the data as in (Wang, Woznica, and Kalousis 2012) and split into train/validation/test (60%/20%/20%), except for Letters and USPS where we use 3,000/1,000/1,000." Also: "Sentiment Analysis (Blitzer, Dredze, and Pereira 2007) is a popular dataset for multi-task learning that consists of Amazon reviews on four product types: kitchen appliances, DVDs, books and electronics."
Dataset Splits | Yes | "We normalize the data as in (Wang, Woznica, and Kalousis 2012) and split into train/validation/test (60%/20%/20%), except for Letters and USPS where we use 3,000/1,000/1,000." Also: "We randomly split the dataset into training (800 samples), validation (400 samples) and testing (400 samples) sets."
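
The quoted protocol (per-dataset PCA to 100 or 200 dimensions, normalization, and a 60%/20%/20% train/validation/test split) is concrete enough to sketch. Below is a minimal Python/scikit-learn illustration; the authors' released code is MATLAB, and StandardScaler is a stand-in assumption for the normalization of (Wang, Woznica, and Kalousis 2012), whose exact form the report does not specify.

```python
# Minimal sketch of the preprocessing/splitting protocol quoted above.
# Illustrative only: the authors' implementation is MATLAB, and
# StandardScaler is an assumed stand-in for the paper's normalization.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

def prepare(X, y, pca_dim=None, seed=0):
    if pca_dim is not None:              # e.g. 100 for USPS, 200 for BBC
        X = PCA(n_components=pca_dim).fit_transform(X)
    # 60% train, then split the remaining 40% evenly into validation/test.
    X_tr, X_rest, y_tr, y_rest = train_test_split(
        X, y, test_size=0.4, random_state=seed, stratify=y)
    X_val, X_te, y_val, y_te = train_test_split(
        X_rest, y_rest, test_size=0.5, random_state=seed, stratify=y_rest)
    scaler = StandardScaler().fit(X_tr)  # fit normalization on train only
    return ((scaler.transform(X_tr), y_tr),
            (scaler.transform(X_val), y_val),
            (scaler.transform(X_te), y_te))
```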
Hardware Specification | No | No hardware details (CPU/GPU models, memory, or machine specifications) were given for running the experiments.
Software Dependencies | No | The paper mentions its MATLAB code but provides no version numbers for MATLAB or any other software dependency used in the experiments.
Experiment Setup | Yes | "We use a 3-nearest neighbor classifier in all experiments. To generate a set of locally discriminative rank-one metrics, we first divide data into regions via clustering. For each region center, we select J nearest neighbors from each class (for J = {10, 20, 50} to account for different scales), and apply Fisher discriminant analysis followed by eigenvalue decomposition to obtain the basis elements. We tune the regularization parameter on the validation data. For SCML-Global, we use a basis set of 400 elements for Vehicle, Vowel, Segment and BBC, and 1,000 elements for Letters and USPS. The number of target neighbors and imposters for our methods are set to 3 and 10 respectively. For SCML-Local... embedding dimension D is set to 40 for Vehicle, Vowel, Segment and BBC, and 100 for Letters and USPS."
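
The basis-generation recipe quoted above (cluster the data into regions, gather J nearest neighbors per class around each region center, run Fisher discriminant analysis there, and keep leading eigenvectors as rank-one elements) can be sketched as follows. This is an illustration under stated assumptions, not the authors' MATLAB code: the exact FDA variant, regularization, and number of directions kept per region are guesses, and the paper pools bases over J in {10, 20, 50} while this sketch takes a single J.

```python
# Sketch of the locally discriminative rank-one basis construction described
# above. Each unit vector v below induces a rank-one metric v v^T; in the
# paper, the learned metric is a sparse nonnegative combination of such
# elements. Parameter choices here (per_region, reg) are illustrative.
import numpy as np
from scipy.linalg import eigh
from sklearn.cluster import KMeans

def local_rank_one_bases(X, y, n_regions=20, J=10, per_region=2, reg=1e-3):
    centers = KMeans(n_clusters=n_regions, n_init=10).fit(X).cluster_centers_
    bases = []
    for c in centers:
        # J nearest neighbors of the region center from each class.
        local = [X[y == cls][np.argsort(
                     np.linalg.norm(X[y == cls] - c, axis=1))[:J]]
                 for cls in np.unique(y)]
        mu = np.vstack(local).mean(axis=0)
        # Between-class and (regularized) within-class scatter matrices.
        Sb = sum(len(Xc) * np.outer(Xc.mean(0) - mu, Xc.mean(0) - mu)
                 for Xc in local)
        Sw = sum((Xc - Xc.mean(0)).T @ (Xc - Xc.mean(0)) for Xc in local)
        Sw = Sw + reg * np.eye(X.shape[1])
        # Fisher directions from the generalized eigenproblem Sb v = lam Sw v.
        vals, vecs = eigh(Sb, Sw)
        order = np.argsort(vals)[::-1][:per_region]
        for v in vecs[:, order].T:
            bases.append(v / np.linalg.norm(v))
    return np.array(bases)
```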