Where to Pay Attention in Sparse Training for Feature Selection?
Authors: Ghada Sokar, Zahra Atashgahi, Mykola Pechenizkiy, Decebal Constantin Mocanu
NeurIPS 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We performed extensive experiments on 10 datasets of different types, including image, speech, text, artificial, and biological. They cover a wide range of characteristics, such as low and high-dimensional feature spaces, and few and large training samples. |
| Researcher Affiliation | Academia | Ghada Sokar, Eindhoven University of Technology, g.a.z.n.sokar@tue.nl; Zahra Atashgahi, University of Twente, z.atashgahi@utwente.nl; Mykola Pechenizkiy, Eindhoven University of Technology, m.pechenizkiy@tue.nl; Decebal Constantin Mocanu, University of Twente / Eindhoven University of Technology, d.c.mocanu@utwente.nl |
| Pseudocode | Yes | Algorithm 1 WAST |
| Open Source Code | Yes | Code is available at https://github.com/GhadaSokar/WAST. |
| Open Datasets | Yes | We evaluate our method on 10 publicly available datasets, including image, speech, text, time series, biological, and artificial data. They have a variety of characteristics, such as low and high-dimensional features and a small and large number of training samples. Details are in Table 1. (Table 1 lists datasets like Madelon [24], USPS [32], MNIST [36] with citations). |
| Dataset Splits | No | The paper provides train and test splits in Table 1 (e.g., 'Train' and 'Test' columns). However, it does not explicitly mention or quantify a separate 'validation' split used for hyperparameter tuning or model selection, which is distinct from the training and testing sets. |
| Hardware Specification | No | NN-based and classical methods are trained on Nvidia GPUs and CPUs, respectively. |
| Software Dependencies | No | We implemented WAST and QS [4] with PyTorch [58] |
| Experiment Setup | Yes | For all NN-based methods except CAE [5], we use a single hidden layer of 200 neurons. The architecture of CAE consists of two layers. The size of the hidden layers is dependent on the chosen K; [K, (3/2)K]. For WAST and QS, we use a sparsity level of 0.8. Following [4], we report the accuracy of NN-based baselines after 100 epochs unless stated otherwise. ... For WAST, we train the model for 10 epochs. Following [4], we add a Gaussian noise with a factor of 0.2 to the input in WAST and QS [4]. Details of the hyperparameters are in Appendix A.1. |
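
The "Experiment Setup" row above gives enough detail to sketch the reported training configuration. The snippet below is a minimal PyTorch illustration of those hyperparameters only (a single 200-unit hidden layer, Gaussian input noise with factor 0.2, 10 training epochs). The input dimensionality, the reconstruction-style objective, and the optimizer are assumptions made for illustration; the sparse connectivity (sparsity level 0.8) and the attention-guided weight redistribution that define WAST (Algorithm 1) are not reproduced here.

```python
import torch
import torch.nn as nn

# Hypothetical stand-in values; the paper evaluates 10 datasets with
# different input dimensionalities (see Table 1 of the paper).
NUM_FEATURES = 784   # e.g., a flattened MNIST image (assumption)
HIDDEN_UNITS = 200   # single hidden layer of 200 neurons (reported)
NOISE_FACTOR = 0.2   # Gaussian input noise factor (reported)
EPOCHS = 10          # WAST is trained for 10 epochs (reported)

# Dense counterpart of the reported one-hidden-layer network. The actual
# WAST model keeps only ~20% of these weights (sparsity level 0.8) and
# redistributes them during training per Algorithm 1, which this sketch
# does not reproduce.
model = nn.Sequential(
    nn.Linear(NUM_FEATURES, HIDDEN_UNITS),
    nn.ReLU(),
    nn.Linear(HIDDEN_UNITS, NUM_FEATURES),  # assumed reconstruction head
)

optimizer = torch.optim.Adam(model.parameters())
criterion = nn.MSELoss()  # assumed denoising-style reconstruction loss

def training_step(x: torch.Tensor) -> float:
    """One optimization step on a mini-batch x, with input noise as reported."""
    noisy_x = x + NOISE_FACTOR * torch.randn_like(x)
    optimizer.zero_grad()
    loss = criterion(model(noisy_x), x)
    loss.backward()
    optimizer.step()
    return loss.item()
```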