Variational Information Maximization for Feature Selection

Authors: Shuyang Gao, Greg Ver Steeg, Aram Galstyan

NeurIPS 2016 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our experiments demonstrate that the proposed method strongly outperforms existing information-theoretic feature selection approaches. We also conduct empirical validation on various datasets and demonstrate that the proposed approach outperforms state-of-the-art information-theoretic feature selection methods.
Researcher Affiliation | Academia | Shuyang Gao, Greg Ver Steeg, Aram Galstyan. University of Southern California, Information Sciences Institute. gaos@usc.edu, gregv@isi.edu, galstyan@isi.edu
Pseudocode | No | The paper mentions that
Open Source Code | Yes | Shuyang Gao. Variational feature selection code. http://github.com/BiuBiuBiLL/InfoFeatureSelection
Open Datasets | Yes | We use 17 well-known datasets from previous feature selection studies [5, 12] (all data are discretized). The dataset summaries are given in supplementary Sec. C. We use the average cross-validation error rate over the range of 10 to 100 features to compare different algorithms under the same setting as [12]. Tenfold cross-validation is employed for datasets with number of samples N >= 100 and leave-one-out cross-validation otherwise. The 3-nearest-neighbor classifier is used for Gisette and Madelon, following [5]. For the remaining datasets, the chosen classifier is linear SVM, following [11, 12]. [26] Kevin Bache and Moshe Lichman. UCI Machine Learning Repository, 2013.
Dataset Splits | Yes | Tenfold cross-validation is employed for datasets with number of samples N >= 100 and leave-one-out cross-validation otherwise. The 3-nearest-neighbor classifier is used for Gisette and Madelon, following [5]. For the remaining datasets, the chosen classifier is linear SVM, following [11, 12].
Hardware Specification | No | No specific hardware details (GPU, CPU models, memory, etc.) used for running the experiments are mentioned in the paper.
Software Dependencies | No | No specific software dependencies with version numbers were provided.
Experiment Setup | Yes | We use the average cross-validation error rate over the range of 10 to 100 features to compare different algorithms under the same setting as [12]. Tenfold cross-validation is employed for datasets with number of samples N >= 100 and leave-one-out cross-validation otherwise. The 3-nearest-neighbor classifier is used for Gisette and Madelon, following [5]. For the remaining datasets, the chosen classifier is linear SVM, following [11, 12].
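
As a concrete illustration of the evaluation protocol described above, the sketch below shows how it could be wired up with scikit-learn. This is a minimal sketch, not the authors' released code: the feature ranking is assumed to be precomputed (by the paper's variational method or any other information-theoretic scorer), the 10-feature step size is an assumption (the report only states the 10 to 100 range), and the dataset name is used only to pick the classifier.

```python
# Sketch of the reported evaluation protocol (assumptions: `ranking` is a
# precomputed list of feature indices ordered from best to worst, and the
# step size of 10 selected-feature counts is illustrative only).
import numpy as np
from sklearn.model_selection import KFold, LeaveOneOut, cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import LinearSVC


def average_cv_error(X, y, ranking, dataset_name):
    """Average cross-validation error over the top 10..100 ranked features."""
    n_samples = X.shape[0]
    # Tenfold CV for datasets with N >= 100 samples, leave-one-out otherwise.
    if n_samples >= 100:
        cv = KFold(n_splits=10, shuffle=True, random_state=0)
    else:
        cv = LeaveOneOut()
    # 3-nearest-neighbor classifier for Gisette and Madelon, linear SVM otherwise.
    if dataset_name.lower() in {"gisette", "madelon"}:
        clf = KNeighborsClassifier(n_neighbors=3)
    else:
        clf = LinearSVC()
    errors = []
    for k in range(10, 101, 10):
        top_k = ranking[:k]  # indices of the k highest-ranked features
        accuracy = cross_val_score(clf, X[:, top_k], y, cv=cv).mean()
        errors.append(1.0 - accuracy)
    return float(np.mean(errors))
```

A full run in the spirit of the reported setup would simply loop this function over the (X, y, ranking) triples for each of the 17 datasets and compare the resulting average error rates across feature selection methods.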