DeepPINK: reproducible feature selection in deep neural networks

Authors: Yang Lu, Yingying Fan, Jinchi Lv, William Stafford Noble

NeurIPS 2018 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | In this paper, we apply DeepPINK (Deep feature selection using Paired-Input Nonlinear Knockoffs) to both simulated and real data sets to demonstrate its empirical utility.
Researcher Affiliation | Academia | Yang Young Lu, Department of Genome Sciences, University of Washington, Seattle, WA 98195, ylu465@uw.edu; Yingying Fan, Data Sciences and Operations Department, Marshall School of Business, University of Southern California, Los Angeles, CA 90089, fanyingy@marshall.usc.edu; Jinchi Lv, Data Sciences and Operations Department, Marshall School of Business, University of Southern California, Los Angeles, CA 90089, jinchilv@marshall.usc.edu; William Stafford Noble, Department of Genome Sciences and Department of Computer Science and Engineering, University of Washington, Seattle, WA 98195, william-noble@uw.edu
Pseudocode | No | The paper describes the DeepPINK architecture and its components but does not include structured pseudocode or algorithm blocks.
Open Source Code | Yes | All code and data will be available here: github.com/younglululu/DeepPINK.
Open Datasets | Yes | We use synthetic data to compare the performance of DeepPINK to existing methods in the literature. We also apply DeepPINK to two real data sets to demonstrate its empirical utility. We first apply DeepPINK to the task of identifying mutations associated with drug resistance in HIV-1 [32]. We use a cross-sectional study of n = 98 healthy volunteers to investigate the dietary effect on the human gut microbiome [12, 26, 45].
Dataset Splits | No | The paper does not provide specific dataset split information (exact percentages, sample counts, citations to predefined splits, or detailed splitting methodology) needed to reproduce the data partitioning for training, validation, or testing.
Hardware Specification | No | The paper does not provide specific hardware details (exact GPU/CPU models, processor types with speeds, memory amounts, or detailed computer specifications) used for running its experiments.
Software Dependencies | No | The paper mentions using 'Adam [24]' for training, but it does not provide specific ancillary software details such as library or solver names with version numbers (e.g., Python, PyTorch, TensorFlow, CUDA versions).
Experiment Setup | Yes | In this work, we use an MLP with two hidden layers, each containing p neurons. We use L1 regularization in the MLP with the regularization parameter set to O(√(log p / n)). We use Adam [24] to train the deep learning model with respect to the mean squared error loss, using an initial learning rate of 0.001 and batch size 10.
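
To make the quoted setup concrete, below is a minimal sketch of that MLP in Python. The excerpt does not name a framework, so the use of tensorflow.keras, the helper name build_mlp, the input layout (original features concatenated with their knockoff copies), and the exact regularization constant are illustrative assumptions rather than the authors' implementation.

    # Minimal sketch (assumed framework: tensorflow.keras) of the MLP described
    # in the Experiment Setup row: two hidden layers of p neurons each, an L1
    # penalty on the order of sqrt(log p / n), Adam with learning rate 0.001,
    # and mean squared error loss.
    import numpy as np
    import tensorflow as tf
    from tensorflow.keras import layers, regularizers

    def build_mlp(p, n):
        l1 = np.sqrt(np.log(p) / n)  # O(sqrt(log p / n)); the constant is an assumption
        model = tf.keras.Sequential([
            layers.Input(shape=(2 * p,)),  # assumed input: p features plus p knockoffs
            layers.Dense(p, activation="relu", kernel_regularizer=regularizers.l1(l1)),
            layers.Dense(p, activation="relu", kernel_regularizer=regularizers.l1(l1)),
            layers.Dense(1),  # scalar response for the regression setting
        ])
        model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=0.001), loss="mse")
        return model

    # Usage with assumed arrays X_aug (n x 2p) and y (length n); batch size 10 as stated.
    # model = build_mlp(p=X_aug.shape[1] // 2, n=X_aug.shape[0])
    # model.fit(X_aug, y, batch_size=10, epochs=100)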