IVFS: Simple and Efficient Feature Selection for High Dimensional Topology Preservation
Authors: Xiaoyun Li, Chenxi Wu, Ping Li
AAAI 2020, pp. 4747-4754
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We conduct extensive experiments on high dimensional datasets to justify the effectiveness of IVFS. The results suggest that IVFS is capable of preserving the exact topological signatures (as well as the pairwise distances), making it a suitable tool for TDA and many other applications. Each method is compared using widely adopted metrics that evaluate the quality of the selected feature set; Table 1 summarizes the results. |
| Researcher Affiliation | Industry | Xiaoyun Li, Chenxi Wu, Ping Li, Cognitive Computing Lab, Baidu Research, 10900 NE 8th St., Bellevue, WA 98004, USA. {lixiaoyun996, wuchenxi2013, pingli98}@gmail.com |
| Pseudocode | Yes | Algorithm 1: IVFS scheme for feature selection (a hedged sketch of this scheme appears after the table). |
| Open Source Code | No | The paper does not provide any explicit statements about releasing source code, nor does it include links to a code repository. |
| Open Datasets | Yes | Following many previous works on feature selection (e.g., Zhao and Liu 2007; Liu et al. 2014; Li et al. 2018b), we carry out extensive experiments on popular high-dimensional datasets from the UCI repository (Asuncion and Newman 2007) and the ASU feature selection database (Li et al. 2018a). |
| Dataset Splits | No | The paper specifies a train/test split: 'Each dataset is randomly split into 80% training sets and 20% test set'. However, it does not explicitly describe a validation split or model-selection methodology. (A minimal split sketch follows the table.) |
| Hardware Specification | No | The paper does not provide any specific hardware details such as GPU or CPU models, memory specifications, or types of computing environments used for the experiments. |
| Software Dependencies | No | The paper discusses various algorithms and methods implemented but does not specify any software dependencies with version numbers (e.g., programming languages, libraries, or frameworks). |
| Experiment Setup | Yes | We try the following combinations: d̃ = {0.1 : 0.1 : 0.5}·d, ñ = {100, 0.1n, 0.3n, 0.5n}. We run experiments with k = 1000, 3000, 5000. For the number of neighbors, we adopt K ∈ {1, 3, 5, 10} and report the highest mean accuracy. (A sketch enumerating this sweep follows the table.) |
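
Since no source code is released, the following is a minimal Python sketch of the random-subset inclusion-value scheme that Algorithm 1 describes, under our reading of the paper: score random feature subsets by how well they preserve pairwise distances on a random subsample, then average each feature's score over the subsets containing it. The function name, the sup-norm distance-gap loss, and all defaults are assumptions, not the authors' implementation.

```python
import numpy as np
from scipy.spatial.distance import pdist

def ivfs_select(X, d_target, k=1000, n_sub=100, d_sub=None, seed=None):
    """Hypothetical sketch of the IVFS random-subset scheme (Algorithm 1)."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    d_sub = d_sub or max(1, int(0.3 * d))  # assumed default: 30% of features
    score_sum = np.zeros(d)
    count = np.zeros(d)
    for _ in range(k):
        rows = rng.choice(n, size=min(n_sub, n), replace=False)
        cols = rng.choice(d, size=d_sub, replace=False)
        full = pdist(X[rows])                 # distances using all features
        part = pdist(X[np.ix_(rows, cols)])   # distances using the subset only
        loss = np.abs(full - part).max()      # sup-norm gap (one possible loss)
        score_sum[cols] += -loss              # lower loss -> higher inclusion value
        count[cols] += 1
    # Features never sampled keep a score of 0; with k in the thousands
    # every feature is sampled many times, so this rarely matters.
    inclusion_value = score_sum / np.maximum(count, 1)
    return np.argsort(-inclusion_value)[:d_target]
```

With k on the order of thousands, each feature's inclusion value is averaged over many subsets, which is what makes the per-subset loss a usable per-feature score.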
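The quoted 80/20 random split (with no validation set described) could be reproduced along these lines; the placeholder data and `random_state` are illustrative, not from the paper.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Placeholder data standing in for a high-dimensional dataset.
X, y = np.random.rand(100, 500), np.random.randint(0, 2, 100)

# 80% training / 20% test, randomly split as the paper states.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)
```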
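For the experiment-setup row, a small sketch enumerating the swept settings may be easier to scan than the inline notation; all names are illustrative, and the separation of the subset sweep (d̃, ñ, k) from the kNN evaluation sweep (K) follows the quoted text.

```python
import itertools

def ivfs_grid(n, d):
    """Hypothetical enumeration of the reported sweep (names illustrative)."""
    d_subs = [int(f * d) for f in (0.1, 0.2, 0.3, 0.4, 0.5)]  # d̃ = {0.1:0.1:0.5}·d
    n_subs = [100, int(0.1 * n), int(0.3 * n), int(0.5 * n)]  # ñ
    ks = [1000, 3000, 5000]                                   # number of random subsets k
    return list(itertools.product(d_subs, n_subs, ks))

# K ∈ {1, 3, 5, 10} neighbors are then tried in the kNN evaluation,
# reporting the highest mean accuracy, per the quoted setup.
```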