Embedded Unsupervised Feature Selection
Authors: Suhang Wang, Jiliang Tang, Huan Liu
AAAI 2015

| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental results on various benchmark datasets demonstrate the effectiveness of the proposed framework EUFS. |
| Researcher Affiliation | Academia | Suhang Wang, Jiliang Tang, Huan Liu; School of Computing, Informatics, and Decision Systems Engineering, Arizona State University, USA; {suhang.wang, jiliang.tang, huan.liu}@asu.edu |
| Pseudocode | Yes | Algorithm 1 Embedded Unsupervised Feature Selection |
| Open Source Code | Yes | The implementation of EUFS can be found from http://www.public.asu.edu/~swang187/ |
| Open Datasets | Yes | ALLAML and Prostate-GE are publicly available from https://sites.google.com/site/feipingnie/file; TOX-171, PIX10P and PIE10P are publicly available from http://featureselection.asu.edu/datasets.php; COIL20 is publicly available from http://www.cad.zju.edu.cn/home/dengcai/Data/MLData.html |
| Dataset Splits | No | The paper mentions tuning parameters via grid-search and repeating K-means experiments, but does not specify explicit train/validation/test dataset splits for model training and evaluation. |
| Hardware Specification | No | The paper does not provide any specific hardware details used for running the experiments. |
| Software Dependencies | No | The paper does not list specific software dependencies with version numbers (e.g., programming languages, libraries, or frameworks). |
| Experiment Setup | Yes | For LS, MCFS, NDFS, RUFS and EUFS, we fix the neighborhood size to be 5 for all the datasets. To fairly compare different unsupervised feature selection methods, we tune the parameters for all methods by a grid-search strategy from {10^-6, 10^-4, ..., 10^4, 10^6}. For EUFS, we set the latent dimension as the number of clusters. How to determine the optimal number of selected features is still an open problem (Tang and Liu 2012a); we set the number of selected features as {50, 100, 150, ..., 300} for all datasets. |
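
The search space in the Experiment Setup row can be written out concretely. The following is a minimal sketch, not the authors' code: it assumes the grid exponents advance in steps of 2 (read off the first two listed values), and the helper `experiment_configs` and its `n_params` argument are hypothetical names introduced only for illustration.

```python
from itertools import product

# Regularization grid from the setup row: 10^-6, 10^-4, ..., 10^4, 10^6.
# The step of 2 in the exponent is an assumption inferred from the listed values.
param_grid = [10.0 ** e for e in range(-6, 7, 2)]

# Numbers of selected features: 50, 100, 150, ..., 300.
feature_counts = list(range(50, 301, 50))

# Neighborhood size fixed to 5 for LS, MCFS, NDFS, RUFS and EUFS.
NEIGHBORHOOD_SIZE = 5


def experiment_configs(n_clusters, n_params=2):
    """Yield one configuration per grid point (hypothetical helper).

    n_params is the number of tunable regularization parameters of the
    method under test; the default of 2 is an assumption for illustration.
    """
    for params in product(param_grid, repeat=n_params):
        for n_selected in feature_counts:
            yield {
                "params": params,
                "n_selected": n_selected,
                # For EUFS, the latent dimension equals the number of clusters.
                "latent_dim": n_clusters,
                "neighborhood_size": NEIGHBORHOOD_SIZE,
            }
```

With two tunable parameters, this enumerates 7 × 7 × 6 = 294 configurations per dataset, each of which would then be scored by the repeated K-means evaluation the table mentions.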