Algorithmic stability and generalization of an unsupervised feature selection algorithm
Authors: Xinxing Wu, Qiang Cheng
NeurIPS 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experimental results on real-world datasets demonstrate superior generalization performance of our proposed algorithm to strong baseline methods. |
| Researcher Affiliation | Academia | Xinxing Wu, Qiang Cheng; University of Kentucky, Lexington, Kentucky, USA |
| Pseudocode | No | The paper describes the algorithm conceptually and mathematically, but does not provide a formal pseudocode block or algorithm section. |
| Open Source Code | Yes | The main codes related to our proposed algorithm are publicly available, and the implementation details of baseline algorithms are provided in Supplementary Material. Footnote: They can be found at https://github.com/xinxingwu-uk/UFS |
| Open Datasets | Yes | The benchmarking datasets and their statistics are summarized in Table 1. Footnote: Datasets 1 and 4 are downloaded from http://archive.ics.uci.edu/ml/datasets/. Datasets 6-10 are from the scikit-feature feature selection repository [24]. |
| Dataset Splits | Yes | For dataset 5, we randomly choose 6,000 samples from the training set for training and validating and 4,000 samples from the test set for testing. We then randomly split the 6,000 samples into training and validation sets by a ratio of 90:10. For the other datasets, we randomly split the samples into training, validation, and test sets by a ratio of 72:8:20, and we tune hyperparameters on the validation set. (A hedged sketch of this split protocol appears after the table.) |
| Hardware Specification | No | The methods discussed in our paper do not require too many computational resources. For example, an Intel Core i5 and 16GB RAM are sufficient to run our algorithm, so we do not specifically explain the computing resources used in our paper. |
| Software Dependencies | No | The paper mentions software like the 'Adam optimizer' but does not specify version numbers for any key software components or libraries. |
| Experiment Setup | Yes | We set the maximum number of epochs to 200. We initialize the weights of the feature selection layer by sampling uniformly from U[0.999999, 0.9999999] and the other layers with the Xavier normal initializer. We adopt the Adam optimizer [20] with a learning rate of 0.001. We set λ1 to 1/27. We take k = 10 for dataset 1 and k = 50 for datasets 2-6 following CAE [1], and k = 64 for high-dimensional datasets 7-10. The dimension of the latent space is consistently set to k. (A hedged configuration sketch follows the table.) |
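
The dataset-split protocol quoted in the Dataset Splits row can be illustrated with a short sketch. This is a hedged illustration only: the use of scikit-learn's `train_test_split`, the placeholder data, and the variable names are assumptions rather than the authors' released code; only the 72:8:20 and 90:10 ratios and the 6,000/4,000 sample counts come from the quote.

```python
# Hedged sketch of the quoted split protocol; scikit-learn usage and all names
# here are assumptions, not the authors' released code.
import numpy as np
from sklearn.model_selection import train_test_split

rng = np.random.RandomState(0)
X = rng.randn(1000, 64)  # placeholder features standing in for one benchmark dataset

# General case: split into training/validation/test by a 72:8:20 ratio.
X_trainval, X_test = train_test_split(X, test_size=0.20, random_state=0)
X_train, X_val = train_test_split(X_trainval, test_size=0.10, random_state=0)  # 10% of 80% = 8% overall

print(len(X_train), len(X_val), len(X_test))  # -> 720 80 200

# Dataset 5 (as quoted): 6,000 samples drawn from the training set are split 90:10
# into training/validation, and 4,000 samples drawn from the test set are used for testing.
```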
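The Experiment Setup row can likewise be summarized as a configuration sketch. The PyTorch framing, the network architecture, and the layer names below are assumptions made for illustration; only the hyperparameter values (200 epochs, uniform initialization in [0.999999, 0.9999999] for the feature selection layer, Xavier normal elsewhere, Adam with learning rate 0.001, and latent dimension equal to k) are taken from the quote, and the paper's full objective, including the λ1 term, is not reproduced here.

```python
# Hedged sketch of the quoted training configuration in PyTorch; the architecture
# is an assumption, not the authors' model. Only the quoted hyperparameters are used.
import torch
import torch.nn as nn

d, k = 784, 50  # input dimension and number of selected features (k = 50 for datasets 2-6, as quoted)

class SelectorAutoencoder(nn.Module):
    def __init__(self, d, k):
        super().__init__()
        self.select = nn.Linear(d, k, bias=False)  # stands in for the feature selection layer
        # Latent dimension set to k, as quoted; the decoder shape is otherwise an assumption.
        self.decode = nn.Sequential(nn.Linear(k, k), nn.ReLU(), nn.Linear(k, d))
        # Feature selection layer: sample weights uniformly from U[0.999999, 0.9999999].
        nn.init.uniform_(self.select.weight, 0.999999, 0.9999999)
        # Other layers: Xavier normal initializer.
        for m in self.decode:
            if isinstance(m, nn.Linear):
                nn.init.xavier_normal_(m.weight)
                nn.init.zeros_(m.bias)

    def forward(self, x):
        return self.decode(self.select(x))

model = SelectorAutoencoder(d, k)
optimizer = torch.optim.Adam(model.parameters(), lr=0.001)  # Adam with learning rate 0.001, as quoted
criterion = nn.MSELoss()

X = torch.randn(256, d)  # placeholder batch standing in for real training data
for epoch in range(200):  # maximum of 200 epochs, as quoted
    optimizer.zero_grad()
    loss = criterion(model(X), X)  # plain reconstruction loss; the paper's λ1 regularization term is omitted
    loss.backward()
    optimizer.step()
```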