Exemplar-centered Supervised Shallow Parametric Data Embedding

Authors: Martin Renqiang Min, Hongyu Guo, Dongjin Song

IJCAI 2017

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We also empirically demonstrate, using several benchmark datasets, that for classification in two-dimensional embedding space, our approach not only gains speedup of kNN by hundreds of times, but also outperforms state-of-the-art supervised embedding approaches. In this section, we evaluate the effectiveness of HOPE and en-HOPE by comparing them against several baseline methods based upon three datasets, i.e., MNIST, USPS, and 20 Newsgroups." (a kNN sketch illustrating the reported speedup follows this table)
Researcher Affiliation | Collaboration | Martin Renqiang Min, NEC Labs America, Princeton, NJ 08540, renqiang@nec-labs.com; Hongyu Guo, National Research Council Canada, Ottawa, ON K1A 0R6, hongyu.guo@nrc-cnrc.gc.ca; Dongjin Song, NEC Labs America, Princeton, NJ 08540, dosong@nec-labs.com
Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks.
Open Source Code | No | The paper does not provide any statement or link indicating that the source code for the methodology is openly available.
Open Datasets | Yes | "The MNIST dataset contains 60,000 training and 10,000 test gray-level 784-dimensional images. The USPS dataset contains 11,000 256-pixel gray-level images, with 8,000 for training and 3,000 for test. The 20 Newsgroups dataset is a collection of 16,242 newsgroup documents, among which we use 15,000 for training and the rest for test as in [van der Maaten, 2009]."
Dataset Splits | Yes | "We used 10% of training data as a validation set to tune the number of factors (F), the number of high-order units (m), and batch size." (a split sketch follows this table)
Hardware Specification | Yes | "Table 4 shows the experimentally observed computational speedup of en-HOPE over standard kNN on our desktop with an Intel Xeon 2.60GHz CPU and 48GB memory on different datasets."
Software Dependencies | No | The paper does not provide specific software names with version numbers for its dependencies.
Experiment Setup | Yes | "When 10 exemplars are used, k = 1 for kNN; otherwise, k = 5. For HOPE and en-HOPE, we set F = 800 and m = 400 for all the datasets used. In practice, we find that the feature interaction order O = 2 often works best for all applications. The parameters for all baseline methods were carefully tuned to achieve the best results." (these settings are collected in the configuration sketch after this table)
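
To make the reported speedup concrete, here is a minimal sketch, not the authors' code, of kNN classification in a learned two-dimensional embedding space. It assumes the embedding has already been computed; the function name, array shapes, and comments are our own illustration. Standard kNN compares each test point against all training embeddings (e.g., 60,000 for MNIST), while en-HOPE compares against only a small set of learned exemplars, which is where a speedup of hundreds of times would come from.

```python
import numpy as np

def knn_predict(test_emb, ref_emb, ref_labels, k):
    """Majority-vote kNN in the 2-D embedding space.

    test_emb:   (n_test, 2) embedded test points
    ref_emb:    (n_ref, 2) embedded reference points
    ref_labels: (n_ref,) integer class labels
    """
    # Pairwise squared Euclidean distances, shape (n_test, n_ref).
    d2 = ((test_emb[:, None, :] - ref_emb[None, :, :]) ** 2).sum(axis=-1)
    nearest = np.argsort(d2, axis=1)[:, :k]   # indices of the k nearest references
    votes = ref_labels[nearest]               # (n_test, k) neighbor labels
    return np.array([np.bincount(row).argmax() for row in votes])

# Standard kNN: ref_emb is the full training embedding (60,000 points for MNIST).
# en-HOPE:      ref_emb is a handful of learned exemplars, so the distance
#               computation shrinks by roughly n_train / n_exemplars.
```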
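The quoted split protocol (10% of training data held out as a validation set for tuning F, m, and batch size) can be restated as code. This is a generic sketch under that stated protocol, not the authors' implementation; the function name, random seed, and shuffling strategy are our own assumptions.

```python
import numpy as np

def make_validation_split(X_train, y_train, val_fraction=0.10, seed=0):
    """Hold out a fraction of the training set as a validation set."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(X_train))       # shuffle training indices
    n_val = int(val_fraction * len(X_train))  # 10% held out by default
    val_idx, train_idx = idx[:n_val], idx[n_val:]
    return (X_train[train_idx], y_train[train_idx],
            X_train[val_idx], y_train[val_idx])

# With the quoted MNIST figures, the 60,000 training images would yield
# 54,000 for training and 6,000 for validation; the 10,000 test images
# stay untouched.
```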
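Finally, the quoted hyperparameters can be collected into a single configuration object. The values (F = 800, m = 400, O = 2, and the k rule) are those reported in the paper; the dataclass wrapper and method names are our own convenience.

```python
from dataclasses import dataclass

@dataclass
class HOPEConfig:
    F: int = 800  # number of factors (same setting for all datasets)
    m: int = 400  # number of high-order units (same setting for all datasets)
    O: int = 2    # feature-interaction order; the paper reports O = 2 works best

    @staticmethod
    def knn_k(n_exemplars: int) -> int:
        # Per the paper's protocol: k = 1 when 10 exemplars are used, else k = 5.
        return 1 if n_exemplars == 10 else 5
```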