Stochastic Neighbor Compression

Authors: Matt Kusner, Stephen Tyree, Kilian Weinberger, Kunal Agrawal

ICML 2014

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "on 4 of 7 data sets it yields lower test error than kNN on the entire training set, even at compression ratios as low as 2%; finally, the SNC compression leads to impressive speed-ups over kNN even when kNN and SNC are both used with ball-tree data structures, hashing, and LMNN dimensionality reduction, demonstrating that it is complementary to existing state-of-the-art algorithms to speed up kNN classification and leads to substantial further improvements."
Researcher Affiliation | Academia | Matt J. Kusner (MKUSNER@WUSTL.EDU), Stephen Tyree (SWTYREE@WUSTL.EDU), Kilian Weinberger (KILIAN@WUSTL.EDU), Kunal Agrawal (KUNAL@WUSTL.EDU); Washington University in St. Louis, 1 Brookings Dr., St. Louis, MO 63130
Pseudocode | Yes | "Algorithm 1 SNC in pseudo-code." (a sketch of the SNC objective and its optimization follows this table)
Open Source Code | Yes | "Implementation. We optimize Z by minimizing (5) with conjugate gradient descent (we use a freely-available Matlab implementation) and provide our implementation of SNC as open source, available for download at http://tinyurl.com/msovcfu."
Open Datasets | Yes | "Dataset descriptions. We evaluate SNC and other training set reduction baselines on seven classification datasets detailed in Table 1. Yale Faces (Georghiades et al., 2001)... Isolet [1]... Letters [1]... Adult [1]... W8a [2]... MNIST [3]... Forest [1]..." Footnote URLs: [1] http://tinyurl.com/uci-ml-data, [2] http://tinyurl.com/libsvm-data, [3] http://tinyurl.com/mnist-data, [4] http://tinyurl.com/usps-data
Dataset Splits | Yes | "Neither Yale Faces nor Forest have predefined test sets and so we report the average and standard deviations in performance over 5 and 10 splits, respectively. ... For LSH we cross-validate over the number of tables and hash functions and select the fastest setting that has equal or less leave-one-out error compared to kNN without LSH (for larger datasets, we performed the LSH cross-validation on class-balanced subsamples of the training set: 10% subsamples of Adult, W8a and MNIST, and 5% of Forest)." (see the LSH selection sketch after this table)
Hardware Specification | Yes | "All experiments were performed on an 8-core Intel L5520 CPU with 2.27GHz clock frequency."
Software Dependencies | No | The paper mentions using a 'Matlab implementation' for conjugate gradient descent, but does not provide specific version numbers for Matlab or any other software dependencies used in the experiments.
Experiment Setup | Yes | "In our experiments, we initialize γ² with cross-validation and optimize it prior to learning. We pick the initialization that yields minimal training error." (see the γ² selection sketch after this table)
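
The quotes above outline how SNC works: a small set of synthetic prototypes Z (with labels taken from a subsample of the training set) is learned by minimizing objective (5) with conjugate gradient descent, where γ² acts as a squared inverse kernel width. The sketch below is a minimal NumPy/SciPy reading of that setup, assuming the objective is the negative log-probability that each training point is labeled correctly by a stochastic nearest-neighbor rule over the prototypes; scipy.optimize.minimize with method="CG" stands in for the authors' Matlab conjugate-gradient routine, and the prototype count, fixed γ², and class-balanced subsampling initialization are illustrative assumptions, not the released code.

```python
import numpy as np
from scipy.optimize import minimize

def snc_objective_and_grad(z_flat, X, y, y_z, gamma2, m, d):
    """Negative log-likelihood of correct stochastic-neighbor labeling
    (an SNC-style objective in the spirit of Eq. (5)) and its gradient."""
    Z = z_flat.reshape(m, d)
    # squared distances between every training point and every prototype
    D2 = ((X[:, None, :] - Z[None, :, :]) ** 2).sum(-1)      # n x m
    A = np.exp(-gamma2 * D2)                                  # unnormalized affinities
    A_sum = A.sum(axis=1, keepdims=True) + 1e-12
    Q = A / A_sum                                             # stochastic-neighbor probabilities
    S = (y[:, None] == y_z[None, :]).astype(float)            # 1 if prototype shares the label
    p = (Q * S).sum(axis=1) + 1e-12                           # prob. each point is labeled correctly
    loss = -np.log(p).sum()
    # gradient w.r.t. prototype z_j:
    #   dL/dz_j = -sum_i (2*gamma2/p_i) * q_ij * (s_ij - p_i) * (x_i - z_j)
    W = (2.0 * gamma2 / p)[:, None] * Q * (S - p[:, None])    # n x m weights
    grad = -(W.T @ X - W.sum(axis=0)[:, None] * Z)            # m x d
    return loss, grad.ravel()

def snc_compress(X, y, m_per_class=10, gamma2=1.0, seed=0):
    """Learn a compressed prototype set (Z, y_z) for a 1-NN classifier."""
    rng = np.random.default_rng(seed)
    classes = np.unique(y)
    # initialize prototypes by class-balanced subsampling of the training inputs (an assumption)
    idx = np.concatenate([rng.choice(np.where(y == c)[0], m_per_class, replace=False)
                          for c in classes])
    Z0, y_z = X[idx].copy(), y[idx].copy()
    m, d = Z0.shape
    res = minimize(snc_objective_and_grad, Z0.ravel(), jac=True, method="CG",
                   args=(X, y, y_z, gamma2, m, d))
    return res.x.reshape(m, d), y_z
```

At test time the learned (Z, y_z) replaces the full training set in a standard 1-NN classifier, which is where the compression-driven speed-ups reported above come from.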
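The "Experiment Setup" row states that γ² is initialized by cross-validation and that the initialization yielding minimal training error is kept. Below is a hedged sketch of one way to realize that selection, reusing snc_compress from the sketch above; the candidate grid, prototype count, and use of 1-NN training error on the compressed set are assumptions about the protocol.

```python
import numpy as np

def one_nn_error(X, y, Z, y_z):
    """Training error of a 1-NN classifier that uses the compressed set (Z, y_z)."""
    D2 = ((X[:, None, :] - Z[None, :, :]) ** 2).sum(-1)
    pred = y_z[D2.argmin(axis=1)]
    return float((pred != y).mean())

def pick_gamma2(X, y, candidates=2.0 ** np.arange(-4, 5), m_per_class=10):
    """Try several gamma^2 initializations, compress with each, and keep the one
    with minimal training error (grid and error measure are assumptions)."""
    best = None
    for g in candidates:
        Z, y_z = snc_compress(X, y, m_per_class=m_per_class, gamma2=g)
        err = one_nn_error(X, y, Z, y_z)
        if best is None or err < best[0]:
            best = (err, g)
    return best[1]
```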
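The "Dataset Splits" row also describes how the LSH baseline was tuned: grid-search over the number of tables and hash functions, keeping the fastest setting whose leave-one-out error is no worse than exact kNN. The sketch below illustrates that selection rule with a toy random-hyperplane LSH; the parameter grids, hashing scheme, and no-candidate fallback are assumptions, not the paper's LSH implementation.

```python
import time
from collections import defaultdict
import numpy as np

def lsh_knn_loo(X, y, num_tables, num_hashes, seed=0):
    """Leave-one-out error of approximate 1-NN with random-hyperplane LSH
    (a toy stand-in for the LSH implementation used in the paper)."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    tables = []
    for _ in range(num_tables):
        H = rng.standard_normal((d, num_hashes))   # random hyperplanes
        codes = (X @ H > 0)                        # n x num_hashes binary codes
        buckets = defaultdict(list)
        for i, code in enumerate(map(tuple, codes)):
            buckets[code].append(i)
        tables.append((H, buckets))
    errors = 0
    for i in range(n):
        cand = set()
        for H, buckets in tables:
            cand.update(buckets[tuple(X[i] @ H > 0)])
        cand.discard(i)                            # leave-one-out: drop the query itself
        if not cand:
            errors += 1                            # no candidates: count as an error (assumption)
            continue
        cand = np.fromiter(cand, dtype=int)
        j = cand[((X[cand] - X[i]) ** 2).sum(1).argmin()]
        errors += int(y[j] != y[i])
    return errors / n

def select_lsh_params(X, y, exact_loo_error, table_grid=(5, 10, 20), hash_grid=(8, 12, 16)):
    """Keep the fastest (tables, hashes) setting whose LOO error is no worse than exact kNN."""
    best = None  # (elapsed_seconds, num_tables, num_hashes)
    for t in table_grid:
        for h in hash_grid:
            start = time.perf_counter()
            err = lsh_knn_loo(X, y, num_tables=t, num_hashes=h)
            elapsed = time.perf_counter() - start
            if err <= exact_loo_error and (best is None or elapsed < best[0]):
                best = (elapsed, t, h)
    return best
```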