Error-Based Knockoffs Inference for Controlled Feature Selection

Authors: Xuebin Zhao, Hong Chen, Yingjie Wang, Weifu Li, Tieliang Gong, Yulong Wang, Feng Zheng
Pages: 9190-9198

AAAI 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Empirical evaluations demonstrate the competitive performance of our approach on both simulated and real data.
Researcher Affiliation | Academia | 1 College of Science, Huazhong Agricultural University, Wuhan 430062, China; 2 College of Informatics, Huazhong Agricultural University, Wuhan 430062, China; 3 School of Computer Science and Technology, Xi'an Jiaotong University, Xi'an 710049, China; 4 Department of Computer Science and Engineering, Southern University of Science and Technology, China
Pseudocode | Yes | Algorithm 1: Construct feature importance statistic W_j
Open Source Code | No | The paper states: "Full version of the paper (including the supplementary material) is at https://arxiv.org/abs/2203.04483", which is a link to the arXiv paper itself, not to source code. No other explicit statement or link for open-source code is provided.
Open Datasets | Yes | HIV dataset (Rhee et al. 2006)
Dataset Splits | No | The paper mentions dividing samples into n1 and n2 for its internal method (e.g., "n1 = n2 = 1000" and "n1 = n/3, n2 = 2n/3"), which are for the method's operation, not explicitly for standard train/validation/test dataset splits. It does not specify percentages or counts for a validation set.
Hardware Specification | No | The paper does not provide specific hardware details (e.g., CPU/GPU models, memory) used for running its experiments.
Software Dependencies | No | The paper mentions using
Experiment Setup | Yes | We set the target FDR level q = 0.2 for all FDR controlled methods, set q = 0.2 and α = 0.2 for FDP control version of E-Knockoff (E-Knockoff (FDP)), and set k = 2 and α = 0.1 for k-FWER control version of E-Knockoff (E-Knockoff (k-FWER)). We use Lasso as the base estimator of our E-Knockoff inference with n1 = n2 = 1000. We set p = 800, n1 = 1000 and select n2 from {200, 400, 600, 800, ..., 2000}.
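
The settings quoted in the Experiment Setup and Dataset Splits rows can be collected into a small configuration sketch. This is an illustration only, not the authors' code: the class name `EKnockoffSetup`, its field names, and the helpers `n2_grid` and `split_one_third` are hypothetical, the step of 200 in the n2 grid is inferred from the listed values, and rounding n1 down when n is not divisible by 3 is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EKnockoffSetup:
    """Values quoted in the Experiment Setup row; all names here are hypothetical."""
    fdr_q: float = 0.2            # target FDR level q for all FDR-controlled methods
    fdp_q: float = 0.2            # q for E-Knockoff (FDP)
    fdp_alpha: float = 0.2        # alpha for E-Knockoff (FDP)
    kfwer_k: int = 2              # k for E-Knockoff (k-FWER)
    kfwer_alpha: float = 0.1      # alpha for E-Knockoff (k-FWER)
    base_estimator: str = "lasso" # base estimator named in the paper
    n1: int = 1000                # first sample subset size
    n2: int = 1000                # second sample subset size


def n2_grid(start=200, stop=2000, step=200):
    """n2 values for the run with p = 800 and n1 = 1000 (step of 200 is inferred)."""
    return list(range(start, stop + 1, step))


def split_one_third(n):
    """Return (n1, n2) with n1 = n/3 and n2 = 2n/3; n1 is rounded down (assumption)."""
    n1 = n // 3
    return n1, n - n1


if __name__ == "__main__":
    print(EKnockoffSetup())
    print(n2_grid())
    print(split_one_third(3000))
```

Keeping the FDR, FDP, and k-FWER parameters as separate fields mirrors the way the row lists them per control variant, so one variant's level can be changed without touching the others.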