Nonlinear Feature Extraction with Max-Margin Data Shifting
Authors: Jianqiao Wangni, Ning Chen
AAAI 2016
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | The empirical results on multiple linear and nonlinear models demonstrate that MMDS can efficiently improve the performance of unsupervised extractors. |
| Researcher Affiliation | Academia | MOE Key lab of Bioinformatics, Bioinformatics Division and Center for Synthetic & Systems Biology, TNList, Department of Electronic Engineering, Tsinghua University, Beijing, 100084, China |
| Pseudocode | Yes | The procedure of training an extractor on the MMDS data is summarized as: Training Data {x} → Shift T[x] → Train Extractor. |
| Open Source Code | No | The paper does not provide any concrete access information for source code. |
| Open Datasets | Yes | The Yale dataset contains 165 images of 15 individuals. The Yale B (the extended Yale Face Database B) dataset includes 38 individuals and about 64 near frontal face images. The ORL dataset contains 10 different varying lighting and facial detail images for each of 40 distinct subjects. The 11 Tumor dataset contains 174 gene samples of 11 different class... The TRECVID2003 dataset contains 1078 video shots of 5 categories... The Digits dataset is within OpenCV. We extract 64-dimensional HOG features (Dalal and Triggs 2005)... The Letters dataset (Ben, Carlos, and Daphne 2004)... |
| Dataset Splits | Yes | We use 5 (or the number of minimal category) folds cross-validation to find proper parameters. We randomly choose 500 samples from the MNIST dataset, and use 10,000 samples for testing. The Digits dataset... evenly split them to train/test sets. The Letters dataset... use 5,375 samples for training while the other 46,777 for testing. |
| Hardware Specification | No | The paper does not provide specific hardware details used for running its experiments. |
| Software Dependencies | No | The paper mentions software such as the 'LIBLINEAR package', 'OpenCV', and an 'L-BFGS solver', but does not give version numbers for these or any other software dependencies. |
| Experiment Setup | Yes | For all models, the data are projected into 10-dimensional space (K=10). We use 5 (or the number of minimal category) folds cross-validation to find proper parameters. The dimensions of RFF is set to 500 for Digits and 1,000 for the other datasets. We use an L-BFGS solver with a fixed maximum number of iterations, and normalize the data to (0.1, 0.9) before sending to the autoencoder. Figure 3 presents the performance of various extractors under different shifting scales (i.e., σ² in MMDS). |
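
The setup values quoted in the Experiment Setup row can be collected into a small configuration sketch. This is an illustration only, not the authors' code: the function names (`choose_n_folds`, `rescale`, `rff_dim`) are hypothetical. Reading "5 (or the number of minimal category) folds" as "5 folds, capped at the size of the smallest class" is an assumption, and so is applying the (0.1, 0.9) normalization as a per-feature min-max rescale.

```python
from collections import Counter

K = 10  # target dimensionality quoted in the table (K=10)


def choose_n_folds(labels, default=5):
    """Pick the number of cross-validation folds.

    Assumed reading of the quoted setup: use 5 folds unless the
    smallest class has fewer than 5 samples, in which case use
    that class size instead.
    """
    smallest_class = min(Counter(labels).values())
    return min(default, smallest_class)


def rescale(column, lo=0.1, hi=0.9):
    """Min-max rescale one feature column into the range [lo, hi].

    Assumption: the (0.1, 0.9) normalization is applied per feature.
    A constant column is mapped to the midpoint of the range.
    """
    cmin, cmax = min(column), max(column)
    if cmax == cmin:
        return [(lo + hi) / 2.0 for _ in column]
    scale = (hi - lo) / (cmax - cmin)
    return [lo + (v - cmin) * scale for v in column]


def rff_dim(dataset_name):
    """RFF dimensionality quoted in the table: 500 for Digits, 1,000 otherwise."""
    return 500 if dataset_name == "Digits" else 1000
```

For example, a label list whose rarest class has three samples would yield three folds under this reading, while any dataset whose classes all have at least five samples would use the default of five.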