Feature Selection using Stochastic Gates

Authors: Yutaro Yamada, Ofir Lindenbaum, Sahand Negahban, Yuval Kluger

ICML 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Furthermore, we evaluate our method using synthetic and real-life data to demonstrate that our approach outperforms other commonly used methods in both predictive performance and feature selection.
Researcher Affiliation | Academia | 1 Department of Statistics and Data Science, Yale University, Connecticut, USA; 2 Program in Applied Mathematics, Yale University, Connecticut, USA; 3 School of Medicine, Yale University, Connecticut, USA.
Pseudocode | Yes | Algorithm 1 STG: Feature selection using stochastic gates (a hedged sketch of the gating mechanism follows the table)
Open Source Code | Yes | We implemented it using both the STG and HC (Louizos et al., 2017) distributions and tested on several artificial and real datasets. Code: https://github.com/runopti/stg
Open Datasets | Yes | We begin with the MADELON dataset, a hard classification problem suggested in the NIPS 2003 feature selection challenge (Guyon et al., 2005). ... Next, we present a regression scenario based on a modification of the Friedman regression dataset (Friedman, 1991). ... Most of the datasets are collected from the ASU feature selection database available online. ... MNIST (LeCun et al., 1998), RCV-1 (Lewis et al., 2004) and the PBMC (Zheng et al., 2017) ... METABRIC (Curtis et al., 2012)
Dataset Splits | Yes | For the above synthetic classification and regression datasets, we generate 600 samples of which we use 450 for training, 50 for validation and 100 for a test set. The hyper-parameter (controlling the number of selected features) for each method is optimized based on the validation performance. (A split sketch follows the table.)
Hardware Specification | No | No specific hardware details (GPU/CPU models, memory, etc.) are mentioned for running the experiments.
Software Dependencies | No | No specific software dependencies with version numbers are mentioned in the paper.
Experiment Setup | Yes | The hyper-parameter (controlling the number of selected features) for each method is optimized based on the validation performance. The experiment is repeated 20 times, and the accuracy/root mean squared error (RMSE), median rank and F1-score that measures feature selection performance are presented in Fig. 2. See the supplementary material for details on the hyper-parameters of all methods. (Metric sketches follow the table.)
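
On the Pseudocode row: the paper's Algorithm 1 trains a model with a stochastic gate per input feature. Below is a minimal PyTorch sketch of that gating idea: each feature d is multiplied by a clipped-Gaussian gate z_d = max(0, min(1, mu_d + eps_d)) with eps_d ~ N(0, sigma^2), and the regularizer penalizes the expected number of open gates, sum_d Phi(mu_d / sigma). The class name StochasticGates and the defaults sigma=0.5 and lam=0.1 are illustrative assumptions, not the authors' released implementation (see https://github.com/runopti/stg for that).

import torch

class StochasticGates(torch.nn.Module):
    """Sketch of a gating layer: z_d = clamp(mu_d + eps_d, 0, 1), eps_d ~ N(0, sigma^2)."""

    def __init__(self, n_features, sigma=0.5, lam=0.1):  # sigma/lam defaults are assumptions
        super().__init__()
        self.mu = torch.nn.Parameter(0.5 * torch.ones(n_features))
        self.sigma = sigma
        self.lam = lam

    def forward(self, x):
        # Reparameterization: inject Gaussian noise during training, use mu at test time.
        eps = self.sigma * torch.randn_like(self.mu) if self.training else 0.0
        z = torch.clamp(self.mu + eps, 0.0, 1.0)  # hard-thresholded Gaussian gate
        return x * z

    def regularizer(self):
        # Expected number of open gates: sum_d P(z_d > 0) = sum_d Phi(mu_d / sigma)
        std_normal = torch.distributions.Normal(0.0, 1.0)
        return self.lam * std_normal.cdf(self.mu / self.sigma).sum()

In this reading, the training loss would be the task loss plus gates.regularizer(), and after training the selected features are those with clamp(mu_d, 0, 1) > 0.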
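
On the Dataset Splits row: the paper gives the 450/50/100 split of the 600 synthetic samples but not the splitting procedure; a random permutation, as sketched below, is one plausible reading (the seed is arbitrary).

import numpy as np

rng = np.random.default_rng(0)  # arbitrary seed, chosen here for illustration
idx = rng.permutation(600)      # 600 synthetic samples, per the paper
train_idx, val_idx, test_idx = idx[:450], idx[450:500], idx[500:]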
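
On the Experiment Setup row: the paper reports median rank and an F1-score for feature selection over 20 repetitions, with exact definitions deferred to the supplementary material. The helpers below show one standard reading of those two metrics on synthetic data where the informative feature indices are known; the function names and signatures are assumptions for illustration.

import numpy as np

def selection_f1(selected, informative, n_features):
    # F1 between the selected-feature set and the known informative set.
    sel = np.zeros(n_features, dtype=bool)
    sel[list(selected)] = True
    true = np.zeros(n_features, dtype=bool)
    true[list(informative)] = True
    tp = (sel & true).sum()
    precision = tp / max(sel.sum(), 1)
    recall = tp / max(true.sum(), 1)
    return 2 * precision * recall / max(precision + recall, 1e-12)

def median_rank(scores, informative):
    # Median rank (1 = most important) of the informative features under a score vector.
    order = np.argsort(-np.asarray(scores))  # sort feature indices by descending importance
    ranks = np.empty(len(order), dtype=int)
    ranks[order] = np.arange(1, len(order) + 1)
    return float(np.median(ranks[list(informative)]))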