Algorithmic stability and generalization of an unsupervised feature selection algorithm
Authors: Xinxing Wu, Qiang Cheng
NeurIPS 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experimental results on real-world datasets demonstrate superior generalization performance of our proposed algorithm to strong baseline methods. |
| Researcher Affiliation | Academia | Xinxing Wu, Qiang Cheng; University of Kentucky, Lexington, Kentucky, USA |
| Pseudocode | No | The paper describes the algorithm conceptually and mathematically, but does not provide a formal pseudocode block or algorithm section. |
| Open Source Code | Yes | The main codes related to our proposed algorithm are publicly available, and the implementation details of baseline algorithms are provided in Supplementary Material. Footnote: They can be found at https://github.com/xinxingwu-uk/UFS |
| Open Datasets | Yes | The benchmarking datasets and their statistics are summarized in Table 1. Footnote: Datasets 1 and 4 are downloaded from http://archive.ics.uci.edu/ml/datasets/. Datasets 6-10 are from the scikit-feature feature selection repository [24]. |
| Dataset Splits | Yes | For dataset 5, we randomly choose 6,000 samples from the training set for training and validating and 4,000 samples from the test set for testing. We then randomly split the 6,000 samples into training and validation sets by a ratio of 90:10. For the other datasets, we randomly split the samples into training, validation, and test sets by a ratio of 72:8:20, and we tune hyperparameters on the validation set. (A hedged sketch of this split protocol appears after the table.) |
| Hardware Specification | No | The methods discussed in our paper do not require too many computational resources. For example, an Intel Core i5 and 16GB RAM are sufficient to run our algorithm, so we do not specifically explain the computing resources used in our paper. |
| Software Dependencies | No | The paper mentions software like the 'Adam optimizer' but does not specify version numbers for any key software components or libraries. |
| Experiment Setup | Yes | We set the maximum number of epochs to 200. We initialize the weights of the feature selection layer by sampling uniformly from U[0.999999, 0.9999999] and the other layers with the Xavier normal initializer. We adopt the Adam optimizer [20] with a learning rate of 0.001. We set λ1 to 1/27. We take k = 10 for dataset 1 and k = 50 for datasets 2-6 following CAE [1], and k = 64 for high-dimensional datasets 7-10. The dimension of the latent space is consistently set to k. (A hedged configuration sketch follows the table.) |
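
The dataset-split protocol quoted in the Dataset Splits row can be illustrated with a short sketch. This is a hedged illustration only: the use of scikit-learn's `train_test_split`, the placeholder data, and the variable names are assumptions rather than the authors' released code; only the 72:8:20 and 90:10 ratios and the 6,000/4,000 sample counts come from the quote.

```python
# Hedged sketch of the quoted split protocol; scikit-learn usage and all names
# here are assumptions, not the authors' released code.
import numpy as np
from sklearn.model_selection import train_test_split

rng = np.random.RandomState(0)
X = rng.randn(1000, 64)  # placeholder features standing in for one benchmark dataset

# General case: split into training/validation/test by a 72:8:20 ratio.
X_trainval, X_test = train_test_split(X, test_size=0.20, random_state=0)
X_train, X_val = train_test_split(X_trainval, test_size=0.10, random_state=0)  # 10% of 80% = 8% overall

print(len(X_train), len(X_val), len(X_test))  # -> 720 80 200

# Dataset 5 (as quoted): 6,000 samples drawn from the training set are split 90:10
# into training/validation, and 4,000 samples drawn from the test set are used for testing.
```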
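The Experiment Setup row can likewise be summarized as a configuration sketch. The PyTorch framing, the network architecture, and the layer names below are assumptions made for illustration; only the hyperparameter values (200 epochs, uniform initialization in [0.999999, 0.9999999] for the feature selection layer, Xavier normal elsewhere, Adam with learning rate 0.001, and latent dimension equal to k) are taken from the quote, and the paper's full objective, including the λ1 term, is not reproduced here.

```python
# Hedged sketch of the quoted training configuration in PyTorch; the architecture
# is an assumption, not the authors' model. Only the quoted hyperparameters are used.
import torch
import torch.nn as nn

d, k = 784, 50  # input dimension and number of selected features (k = 50 for datasets 2-6, as quoted)

class SelectorAutoencoder(nn.Module):
    def __init__(self, d, k):
        super().__init__()
        self.select = nn.Linear(d, k, bias=False)  # stands in for the feature selection layer
        # Latent dimension set to k, as quoted; the decoder shape is otherwise an assumption.
        self.decode = nn.Sequential(nn.Linear(k, k), nn.ReLU(), nn.Linear(k, d))
        # Feature selection layer: sample weights uniformly from U[0.999999, 0.9999999].
        nn.init.uniform_(self.select.weight, 0.999999, 0.9999999)
        # Other layers: Xavier normal initializer.
        for m in self.decode:
            if isinstance(m, nn.Linear):
                nn.init.xavier_normal_(m.weight)
                nn.init.zeros_(m.bias)

    def forward(self, x):
        return self.decode(self.select(x))

model = SelectorAutoencoder(d, k)
optimizer = torch.optim.Adam(model.parameters(), lr=0.001)  # Adam with learning rate 0.001, as quoted
criterion = nn.MSELoss()

X = torch.randn(256, d)  # placeholder batch standing in for real training data
for epoch in range(200):  # maximum of 200 epochs, as quoted
    optimizer.zero_grad()
    loss = criterion(model(X), X)  # plain reconstruction loss; the paper's λ1 regularization term is omitted
    loss.backward()
    optimizer.step()
```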