Spectral Feature Scaling Method for Supervised Dimensionality Reduction

Authors: Momo Matsuda, Keiichi Morikuni, Tetsuya Sakurai

IJCAI 2018

Reproducibility Variable | Result | LLM Response

Research Type | Experimental
LLM Response: Numerical experiments show that the proposed methods outperform well-established supervised methods on toy problems with more samples than features, and are more robust for clustering than existing methods. The proposed methods also outperform existing methods for classification on real-world problems with more features than samples, namely gene expression profiles of cancer diseases.

Researcher Affiliation | Academia
LLM Response: Momo Matsuda (1), Keiichi Morikuni (1), Tetsuya Sakurai (1,2); (1) University of Tsukuba, (2) JST/CREST. Emails: matsuda@mma.cs.tsukuba.ac.jp, morikuni@cs.tsukuba.ac.jp, sakurai@cs.tsukuba.ac.jp

Pseudocode | Yes
LLM Response: We summarize the procedures of the proposed methods in Algorithm 1 (Algorithm 1: Spectral clustering/classification methods supervised using the feature scaling).

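The review does not reproduce Algorithm 1 itself; for reference, the unsupervised spectral backbone that the proposed methods supervise via feature scaling follows the standard pipeline (similarity matrix, normalized Laplacian, eigenvectors, k-means). A minimal Python sketch of that backbone, with the paper's feature-scaling step deliberately omitted and all function names illustrative:

```python
import numpy as np
from scipy.linalg import eigh
from sklearn.cluster import KMeans

def spectral_clustering(W, n_clusters):
    """Unsupervised spectral clustering on a precomputed similarity matrix W.

    The paper's Algorithm 1 additionally learns a supervised feature
    scaling before W is formed; that step is omitted here.
    """
    d = W.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(np.maximum(d, 1e-12))
    # Symmetric normalized Laplacian: L = I - D^{-1/2} W D^{-1/2}.
    L = np.eye(len(W)) - (d_inv_sqrt[:, None] * W) * d_inv_sqrt[None, :]
    # Eigenvectors for the smallest eigenvalues give the spectral embedding.
    _, vecs = eigh(L)
    Z = vecs[:, :n_clusters]
    # Row-normalize the embedding (Ng-Jordan-Weiss style) before k-means.
    Z /= np.linalg.norm(Z, axis=1, keepdims=True) + 1e-12
    return KMeans(n_clusters=n_clusters, n_init=20).fit_predict(Z)
```
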
Open Source Code | No
LLM Response: The paper links to third-party code (LPP, LFDA, KLFDA) used for comparison, but gives no concrete access to, or availability statement for, the source code of the proposed method itself.

Open Datasets | Yes
LLM Response: In this subsection, we use four data sets in the Gene Expression Omnibus (https://www.ncbi.nlm.nih.gov/) from real-world problems with more features than samples.

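For readers who want to pull such data programmatically, a sketch using the third-party GEOparse package; the accession string is a placeholder, since the specific series used in the paper are not listed in this summary:

```python
import GEOparse  # third-party package for NCBI GEO access

# Placeholder accession: the review does not list the actual series used.
gse = GEOparse.get_GEO(geo="GSE12345", destdir="./geo_cache")

# Expression table: probes in rows, samples (GSMs) in columns; transpose
# to the samples-by-features layout used for dimensionality reduction.
X = gse.pivot_samples("VALUE").T.to_numpy()
print(X.shape)  # (n_samples, n_features) with n_features >> n_samples
```
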
Dataset Splits | Yes
LLM Response: Table 1 gives the means and standard deviations of RI through the leave-one-out cross-validation for the classification problems for reduced dimensions ℓ = 1, 2, and 3.

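Assuming RI denotes the Rand index, the leave-one-out protocol for the classification experiments can be sketched as follows (a reconstruction of the evaluation loop, not the authors' code):

```python
import numpy as np
from sklearn.model_selection import LeaveOneOut
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import rand_score

def loo_rand_index(Z, y):
    """Leave-one-out protocol: fit 1-NN on all samples but one, predict the
    held-out label, then score the collected predictions with the Rand index."""
    y = np.asarray(y)
    y_pred = np.empty_like(y)
    for train_idx, test_idx in LeaveOneOut().split(Z):
        clf = KNeighborsClassifier(n_neighbors=1).fit(Z[train_idx], y[train_idx])
        y_pred[test_idx] = clf.predict(Z[test_idx])
    return rand_score(y, y_pred)
```
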
Hardware Specification | No
LLM Response: The paper does not provide specific hardware details such as GPU or CPU models, memory, or cloud computing instance types used to run the experiments.

Software Dependencies | Yes
LLM Response: All programs were coded and run in Matlab 2016b.

Experiment Setup | Yes
LLM Response: In the compared methods except for LFDA, we chose the optimal value of the parameter σ among 1, 10^1, and 10^2 to achieve the best accuracy. We formed the similarity matrix in (9) by using the seven-nearest neighbors and taking the symmetric part. We set the dimension of the reduced space to one for clustering problems, and to one, two, and three for classification problems. The data samples reduced to a low dimension were clustered using the k-means algorithm and classified using the one-nearest-neighbor method. We repeated clustering the same reduced data samples 20 times with the k-means algorithm and the one-nearest-neighbor method, starting from different random initial guesses.

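A minimal sketch of this setup in Python, assuming a binary seven-nearest-neighbor graph, max-symmetrization (the paper's exact rule, e.g. (W + W^T)/2, may differ), and the Rand index as the clustering score; function names are illustrative:

```python
import numpy as np
from sklearn.neighbors import kneighbors_graph
from sklearn.cluster import KMeans
from sklearn.metrics import rand_score

def knn_similarity(X, k=7):
    """Binary k-nearest-neighbor graph, symmetrized by taking the max of
    the graph and its transpose (one common reading of 'symmetric part')."""
    A = kneighbors_graph(X, n_neighbors=k, mode="connectivity").toarray()
    return np.maximum(A, A.T)

def repeated_kmeans_ri(Z, y, n_clusters, n_repeats=20):
    """Cluster the reduced samples Z from 20 random initial guesses and
    report the mean and standard deviation of the Rand index."""
    scores = [
        rand_score(y, KMeans(n_clusters=n_clusters, n_init=1,
                             init="random").fit_predict(Z))
        for _ in range(n_repeats)
    ]
    return float(np.mean(scores)), float(np.std(scores))
```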