Spectral Feature Scaling Method for Supervised Dimensionality Reduction
Authors: Momo Matsuda, Keiichi Morikuni, Tetsuya Sakurai
IJCAI 2018
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Numerical experiments show that the proposed methods outperform well-established supervised methods on toy problems with more samples than features, and are more robust with respect to clustering than existing methods. The proposed methods also outperform existing methods in classification on real-world problems with more features than samples (gene expression profiles of cancer diseases). |
| Researcher Affiliation | Academia | Momo Matsuda¹, Keiichi Morikuni¹, Tetsuya Sakurai¹,² (¹University of Tsukuba, ²JST/CREST); matsuda@mma.cs.tsukuba.ac.jp, morikuni@cs.tsukuba.ac.jp, sakurai@cs.tsukuba.ac.jp |
| Pseudocode | Yes | We summarize the procedures of the proposed methods in Algorithm 1, "Spectral clustering/classification methods supervised using the feature scaling" (a hedged MATLAB sketch of such a pipeline follows this table). |
| Open Source Code | No | The paper provides links to third-party code (LPP, LFDA, KLFDA) used for comparison, but does not provide concrete access or an explicit statement about the availability of the source code for their own proposed method. |
| Open Datasets | Yes | In this subsection, we use four data sets in the Gene Expression Omnibus (https://www.ncbi.nlm.nih.gov/) from real-world problems with more features than samples. |
| Dataset Splits | Yes | Table 1 gives the means and standard deviations of the Rand index (RI) through leave-one-out cross-validation for the classification problems for reduced dimensions ℓ = 1, 2, and 3 (the LOOCV protocol is sketched after this table). |
| Hardware Specification | No | The paper does not provide any specific hardware details such as GPU or CPU models, memory, or cloud computing instance types used for running the experiments. |
| Software Dependencies | Yes | All programs were coded and run in Matlab 2016b. |
| Experiment Setup | Yes | In the compared methods except for LFDA, we chose the optimal value of the parameter σ among 1, 10¹, and 10² to achieve the best accuracy. We formed the similarity matrix in (9) by using the seven nearest neighbors and taking the symmetric part. We set the dimension of the reduced space to one for the clustering problems, and to one, two, and three for the classification problems. The data samples reduced to a low dimension were clustered using the k-means algorithm and classified using the one-nearest-neighbor method. We repeated clustering the same reduced data samples 20 times using the k-means algorithm and the one-nearest-neighbor method, starting from different random initial guesses (see the protocol sketch after this table). |
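
The paper's Algorithm 1 is not reproduced in this report. As a point of reference, below is a minimal MATLAB sketch (assuming the Statistics and Machine Learning Toolbox) of the generic spectral-embedding pipeline that Algorithm 1 builds on: Gaussian similarities, a generalized Laplacian eigenproblem, and k-means on the embedded samples. The feature-scaling vector `w`, which is the paper's actual contribution, is treated here as a given input; the function name and all parameter choices are illustrative assumptions, not the authors' code.

```matlab
% Hypothetical sketch: spectral embedding with a given per-feature
% scaling vector w (the paper computes w; here it is an input).
function labels = spectral_cluster_scaled(X, w, k, ell, sigma)
% X     : n-by-d data matrix (rows are samples)
% w     : 1-by-d feature-scaling weights (assumed given)
% k     : number of clusters
% ell   : reduced dimension
% sigma : Gaussian similarity width
    Xs = X .* w;                          % per-feature scaling (implicit expansion, R2016b+)
    D2 = pdist2(Xs, Xs).^2;               % pairwise squared Euclidean distances
    W  = exp(-D2 ./ (2*sigma^2));         % Gaussian similarity matrix
    W(1:size(W,1)+1:end) = 0;             % remove self-similarities on the diagonal
    D  = diag(sum(W, 2));                 % degree matrix
    L  = D - W;                           % unnormalized graph Laplacian
    [V, E] = eig(L, D);                   % generalized eigenproblem L v = lambda D v
    [~, idx] = sort(diag(E));             % ascending eigenvalues
    Y = V(:, idx(2:ell+1));               % skip the trivial constant eigenvector
    labels = kmeans(Y, k, 'Replicates', 20); % 20 random restarts, matching the setup row
end
```

With `w = ones(1, size(X,2))` this degenerates to plain spectral clustering, so the sketch is useful mainly as a baseline against which the learned scaling would be compared.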
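
The Experiment Setup row mentions two protocol details that are concrete enough to pin down in code: a seven-nearest-neighbor similarity matrix with its symmetric part taken, and leave-one-out evaluation with a one-nearest-neighbor classifier. The sketch below is one hedged reading of that description; the function name, the Gaussian weighting, and the use of (A + A')/2 for "the symmetric part" are illustrative assumptions.

```matlab
% Hypothetical sketch of the evaluation protocol in the setup row:
% a 7-nearest-neighbor Gaussian similarity matrix (symmetrized)
% and leave-one-out 1-NN classification.
function [S, acc] = knn_similarity_and_loocv(X, y, sigma)
% X : n-by-d data matrix, y : n-by-1 class labels
    n = size(X, 1);
    idx = knnsearch(X, X, 'K', 8);        % column 1 is each point itself, so ask for 8
    A = zeros(n);
    for i = 1:n
        for j = idx(i, 2:end)             % the 7 nearest neighbors of sample i
            A(i, j) = exp(-norm(X(i,:) - X(j,:))^2 / (2*sigma^2));
        end
    end
    S = (A + A') / 2;                     % "symmetric part"; max(A, A') is another common choice
    % leave-one-out cross-validation with a one-nearest-neighbor classifier
    correct = 0;
    for i = 1:n
        train = setdiff(1:n, i);          % hold out sample i
        j = knnsearch(X(train, :), X(i, :), 'K', 1);
        correct = correct + (y(train(j)) == y(i));
    end
    acc = correct / n;                    % leave-one-out 1-NN accuracy
end
```

In the paper's pipeline the 1-NN step would run on the reduced samples `Y` rather than on the raw `X`; the loop structure is the same either way.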