Selective Deep Autoencoder for Unsupervised Feature Selection

Authors: Wael Hassanieh, Abdallah Chehade

AAAI 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experimental results on three high-dimensional public datasets have shown promising feature selection performance by SDAE in comparison to other existing state-of-the-art unsupervised feature selection methods.
Researcher Affiliation | Academia | Department of Industrial & Manufacturing Systems Engineering, University of Michigan-Dearborn, 4901 Evergreen Road, Dearborn, Michigan 48128, USA; waelh@umich.edu, achehade@umich.edu
Pseudocode | Yes | Algorithm 1: Autoencoder Pre-training. (A hedged Keras sketch of this step follows the table.)
Open Source Code | Yes | The code can be found at https://github.com/irda-lab/SDAE.
Open Datasets | Yes | Mice Protein (Higuera, Gardiner, and Cios 2015) includes measurements of protein expression levels in the cortex of both normal and trisomic mice; dataset size = (1080, 561). ISOLET (Cole and Fanty 1994) comprises preprocessed speech recordings in which individuals speak the names of the English alphabet letters; dataset size = (7797, 617). MNIST (LeCun et al. 1998) contains 28-by-28-pixel grayscale images of handwritten digits; dataset size = (10000, 784). (A loading and splitting sketch follows the table.)
Dataset Splits | Yes | We randomly split it into training and testing sets by a ratio of 80:20. [...] Following (Abid, Balin, and Zou 2019), we randomly choose 6000 samples from the training set for training and validation and 4000 from the testing set for testing. For the other two datasets, we randomly split them into training and testing sets by a ratio of 80:20.
Hardware Specification | No | The paper does not provide specific details about the hardware used for running the experiments (e.g., CPU or GPU models, or memory specifications).
Software Dependencies | Yes | All experiments are done with Python 3.9.13, Tensorflow 2.10.0, Keras 2.10.0, and Scikit-Learn 1.3.0. (A pinned-version check follows the table.)
Experiment Setup | Yes | We set the maximum number of epochs to 600. We initialize the weights of the Selective Layer by sampling uniformly from U[0.999999, 0.9999999] and the other layers with the Xavier normal initializer. We adopt the Adam optimizer (Kingma and Ba 2015) with an initial learning rate of 0.001 and a batch size of 20. The number of hidden layers for each of the Encoder and Decoder is set to 3, with each layer holding 12 nodes. The bottleneck layer holds 6 nodes. We use only the linear activation function in SDAE for simplicity. (An architecture sketch follows the table.)
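
The body of Algorithm 1 is not reproduced on this page. As a rough illustration of what autoencoder pre-training looks like in the reported Keras stack, here is a minimal sketch: the stand-in data, the single hidden layer per side (the full reported depth appears in the architecture sketch further below), and the MSE reconstruction loss are illustrative assumptions, not details quoted from the paper.

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

# Stand-in training matrix; substitute the real feature matrix.
X_train = np.random.rand(864, 561).astype("float32")

# A plain linear autoencoder for the pre-training step. One hidden layer per
# side keeps the sketch short; the reported three-layer depth is used in the
# architecture sketch further below.
inputs = keras.Input(shape=(X_train.shape[1],))
h = layers.Dense(12, activation="linear")(inputs)
code = layers.Dense(6, activation="linear")(h)
h = layers.Dense(12, activation="linear")(code)
outputs = layers.Dense(X_train.shape[1], activation="linear")(h)
autoencoder = keras.Model(inputs, outputs)

# Pre-train by reconstructing the input; optimizer, learning rate, batch size,
# and the 600-epoch cap follow the reported setup. MSE loss is an assumption.
autoencoder.compile(optimizer=keras.optimizers.Adam(learning_rate=0.001),
                    loss="mse")
autoencoder.fit(X_train, X_train, epochs=600, batch_size=20, verbose=0)
```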
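The quoted dataset and split protocols translate into a short loading script. The OpenML names and versions for Mice Protein and ISOLET below are assumptions (the paper cites the original sources, not a download route), and the fixed random seeds and the [0, 1] scaling are ours; the 80:20 and 6000/4000 protocols follow the quoted text.

```python
import numpy as np
from sklearn.datasets import fetch_openml
from sklearn.model_selection import train_test_split
from tensorflow.keras.datasets import mnist

# Mice Protein and ISOLET via OpenML; these names/versions are assumptions.
# Note that Mice Protein contains missing values that need imputation.
mice = fetch_openml("miceprotein", version=4, as_frame=False)
isolet = fetch_openml("isolet", version=1, as_frame=False)

# 80:20 random split, as reported for Mice Protein and ISOLET
# (the fixed seed is our assumption; the paper does not state one).
X_train, X_test = train_test_split(mice.data, test_size=0.2, random_state=0)

# MNIST protocol (after Abid, Balin, and Zou 2019): 6000 samples from the
# training pool for training/validation and 4000 from the test pool,
# flattened to 784 features to match the reported (10000, 784) size.
(x_tr, _), (x_te, _) = mnist.load_data()
rng = np.random.default_rng(0)  # seed is an assumption
tr_idx = rng.choice(len(x_tr), 6000, replace=False)
te_idx = rng.choice(len(x_te), 4000, replace=False)
X_train_mnist = x_tr[tr_idx].reshape(-1, 784).astype("float32") / 255.0
X_test_mnist = x_te[te_idx].reshape(-1, 784).astype("float32") / 255.0
```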
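Because the Software Dependencies row pins exact versions, a small sanity check can confirm that a rebuilt environment matches the reported stack. This checking script is our addition, not something from the paper.

```python
# Verify the environment matches the versions reported in the paper.
import sys

import keras
import sklearn
import tensorflow as tf

assert sys.version_info[:3] == (3, 9, 13), f"Python {sys.version.split()[0]}"
assert tf.__version__ == "2.10.0", tf.__version__
assert keras.__version__ == "2.10.0", keras.__version__
assert sklearn.__version__ == "1.3.0", sklearn.__version__
print("Environment matches the reported stack.")
```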
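The Experiment Setup row pins down enough detail for an architecture sketch. Two caveats: treating the Selective Layer as a one-to-one multiplicative gate over the input features is our reading of the paper (the near-one uniform initialization suggests it), and how the output layer relates to the three decoder layers is our guess. The name build_sdae is hypothetical; the authors' repository remains the authoritative definition.

```python
from tensorflow import keras
from tensorflow.keras import layers

class SelectiveLayer(layers.Layer):
    """One trainable weight per input feature, applied multiplicatively.

    The near-one uniform initialization follows the reported setup; the
    one-to-one gating form is our reading of the Selective Layer.
    """
    def build(self, input_shape):
        self.w = self.add_weight(
            name="selective_weights",
            shape=(input_shape[-1],),
            initializer=keras.initializers.RandomUniform(
                minval=0.999999, maxval=0.9999999),
            trainable=True,
        )

    def call(self, inputs):
        return inputs * self.w

def build_sdae(n_features):
    """Assemble the reported architecture: three 12-node linear layers per
    side around a 6-node bottleneck, Xavier-normal initialized."""
    init = keras.initializers.GlorotNormal()  # "Xavier normal"
    inputs = keras.Input(shape=(n_features,))
    x = SelectiveLayer()(inputs)
    for _ in range(3):  # encoder
        x = layers.Dense(12, activation="linear", kernel_initializer=init)(x)
    x = layers.Dense(6, activation="linear", kernel_initializer=init)(x)  # bottleneck
    for _ in range(3):  # decoder
        x = layers.Dense(12, activation="linear", kernel_initializer=init)(x)
    outputs = layers.Dense(n_features, activation="linear",
                           kernel_initializer=init)(x)
    model = keras.Model(inputs, outputs)
    model.compile(optimizer=keras.optimizers.Adam(learning_rate=0.001),
                  loss="mse")  # MSE reconstruction loss is an assumption
    return model
```

With the remaining reported settings, training then runs as model.fit(X_train, X_train, epochs=600, batch_size=20).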