Dependence Guided Unsupervised Feature Selection
Authors: Jun Guo, Wenwu Zhu
AAAI 2018 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments on different datasets consistently demonstrate that our proposed method significantly outperforms state-of-the-art baselines. In this section, we compare our proposed DGUFS approach with state-of-the-art methods on several benchmark datasets. |
| Researcher Affiliation | Academia | Jun Guo¹, Wenwu Zhu¹,² — ¹Tsinghua-Berkeley Shenzhen Institute, Tsinghua University, Shenzhen 518055, China; ²Department of Computer Science and Technology, Tsinghua University, Beijing 100084, China |
| Pseudocode | Yes | Algorithm 1 Solve the optimization problem in Theorem 1 and Algorithm 2 The proposed DGUFS method |
| Open Source Code | No | The paper does not provide any statement or link regarding open-source code for the described methodology. |
| Open Datasets | Yes | Datasets: one mass spectrometry dataset ALLAML (Fodor 1997), one microarray dataset Prostate-GE (Singh et al. 2002), one cancer dataset LUNG (Bhattacharjee et al. 2001), and three face image datasets UMIST (Graham and Allinson 1998), PIX 10P (http://peipa.essex.ac.uk/ipa/pix/faces/), and PIE 10P (Gross et al. 2008). Detailed information is listed in Table 1. |
| Dataset Splits | No | The paper does not explicitly provide training/validation/test dataset splits for reproduction. It mentions the evaluation metrics (ACC, NMI) but not how the data was partitioned for training and evaluation; a sketch of the standard ACC/NMI clustering evaluation appears after this table. |
| Hardware Specification | No | The paper does not provide any specific details about the hardware (e.g., GPU/CPU models, memory) used for running the experiments. |
| Software Dependencies | No | The paper mentions algorithms and methods (e.g., ADMM, K-means) but does not list any specific software or library names with version numbers. |
| Experiment Setup | Yes | Parameter k is set to 5 for all datasets to specify the size of the neighbourhood. We set the numbers of selected features as {50, 100, ..., 300} for all datasets. There are two major parameters to tune in our algorithm, i.e., β and α. We set β from 0.1 to 0.9 with 0.2 as the interval. Generally, c ≪ n. The rank of L ∈ R^{n×n}, i.e., rank(L) = c, is usually very small. Hence, we set the corresponding parameter α larger, as 10^1, 10^2, ..., 10^5. ... ρ = 1.1, μ_max = 10^10, and μ = 10^-6. These settings are collected in the first sketch after this table. |
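
The experiment-setup row above quotes the paper's hyperparameter ranges. As a convenience, a minimal sketch collecting those values in Python follows; the dictionary layout, the variable names, and the assumption that the selected-feature counts step by 50 are ours, not part of the paper or any released code.

```python
# Hyperparameter settings quoted in the "Experiment Setup" row above.
# Names and structure are illustrative assumptions; the paper releases no code.
dgufs_settings = {
    "k_neighbors": 5,                                 # neighbourhood size k, all datasets
    "n_selected_features": list(range(50, 301, 50)),  # {50, 100, ..., 300}, assuming a step of 50
    "beta": [0.1, 0.3, 0.5, 0.7, 0.9],                # beta from 0.1 to 0.9, interval 0.2
    "alpha": [10.0 ** e for e in range(1, 6)],        # alpha in {10^1, ..., 10^5}
}

# ADMM penalty schedule reported for the solver: mu starts at 1e-6 and is
# increased by a factor rho = 1.1 per iteration, capped at mu_max = 1e10.
mu, rho, mu_max = 1e-6, 1.1, 1e10
mu = min(rho * mu, mu_max)  # one penalty-update step
```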
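
The dataset-splits row notes that clustering ACC and NMI are the reported metrics. The sketch below shows the protocol commonly used to compute them (K-means on the selected features, ACC via Hungarian matching, NMI from scikit-learn); the function names and library choices are assumptions for illustration, not the authors' evaluation code.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment
from sklearn.cluster import KMeans
from sklearn.metrics import normalized_mutual_info_score


def clustering_accuracy(y_true, y_pred):
    """Clustering ACC: best one-to-one cluster-to-class matching (Hungarian algorithm)."""
    y_true = np.unique(np.asarray(y_true), return_inverse=True)[1]
    y_pred = np.unique(np.asarray(y_pred), return_inverse=True)[1]
    n = max(y_true.max(), y_pred.max()) + 1
    cost = np.zeros((n, n), dtype=np.int64)
    for t, p in zip(y_true, y_pred):
        cost[t, p] += 1
    rows, cols = linear_sum_assignment(-cost)  # maximise matched counts
    return cost[rows, cols].sum() / y_true.size


def evaluate_selected_features(X_selected, y_true, n_clusters):
    """Run K-means on the selected features and report (ACC, NMI) against ground truth."""
    y_pred = KMeans(n_clusters=n_clusters, n_init=10).fit_predict(X_selected)
    return clustering_accuracy(y_true, y_pred), normalized_mutual_info_score(y_true, y_pred)
```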