Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
DeepDRK: Deep Dependency Regularized Knockoff for Feature Selection
Authors: Hongyu Shen, Yici Yan, Zhizhen Jane Zhao
NeurIPS 2024 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | The experiments conducted on synthetic, semi-synthetic, and real-world datasets demonstrate that our pipeline outperforms existing methods across various scenarios. |
| Researcher Affiliation | Academia | Department of Electrical and Computer Engineering, University of Illinois at Urbana-Champaign. Department of Statistics, University of Illinois at Urbana-Champaign. |
| Pseudocode | Yes | In Algorithm 1, we provide pseudo code for training the Knockoff Transformer and the swappers (i.e., the first stage shown in Figure 1). |
| Open Source Code | Yes | DeepDRK is implemented in PyTorch [45] and is accessible at: https://github.com/nowonder2000/DeepDRK. |
| Open Datasets | Yes | The first dataset contains single-cell RNA sequencing (scRNA-seq) data from 10x Genomics. ... The second publicly available dataset is from a real case study entitled Longitudinal Metabolomics of the Human Microbiome in Inflammatory Bowel Disease (IBD) [35]. |
| Dataset Splits | Yes | To fit models, we first split datasets of X into training and validation sets with an 8:2 ratio. The training sets are used for model optimization, and the validation sets are used for early stopping based on the validation loss, with a patience period of 6. |
| Hardware Specification | Yes | Experiments are conducted on a single NVIDIA V100 16GB GPU. |
| Software Dependencies | No | The paper states that 'DeepDRK is implemented in PyTorch [45]' but does not provide a specific version number for PyTorch or any other software dependencies. |
| Experiment Setup | Yes | We follow the model configuration in Table 3 to optimize DeepDRK. The architecture of the swappers Sω is based on [55]. Both the swappers and Xθ are trained using the AdamW optimizer [36]. During training, we alternately optimize Xθ and the swappers Sω, updating weights θ three times for each update of weights ω. This training scheme is similar to GAN training [21], though without discriminators. We apply early stopping to prevent overfitting. A pseudocode of the optimization is provided in Appendix B.2. In experiments, we set αn = 0.5 universally as the dependency regularization coefficient due to its consistent performance. A discussion on the effect of αn is provided in Appendix G. ... Table 3 parameters: Sω learning rate 1 × 10⁻³; Xθ learning rate 1 × 10⁻⁵; dropout rate 0.1; number of epochs 200; batch size 64; λ1 = 30.0; λ2 = 1.0; λ3 = 20.0; early stop tolerance 6; αn = 0.5. |
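
The split and training schedule quoted in the Dataset Splits and Experiment Setup rows can be illustrated with a small, framework-free sketch. It is not the authors' code: the function and class names are hypothetical, the shuffling seed is arbitrary, and two details are assumptions rather than statements from the paper, namely that the three θ updates come before each ω update within a cycle, and that the patience of 6 counts consecutive epochs without validation-loss improvement.

```python
import random


def split_indices(n_samples, train_fraction=0.8, seed=0):
    """Shuffle sample indices and split them 8:2 into training and validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # seed is an arbitrary illustrative choice
    cut = int(n_samples * train_fraction)
    return indices[:cut], indices[cut:]


def update_schedule(n_steps, theta_per_omega=3):
    """Label each optimization step with the weights it updates.

    Assumed ordering: three 'theta' steps followed by one 'omega' step.
    """
    cycle = ["theta"] * theta_per_omega + ["omega"]
    return [cycle[i % len(cycle)] for i in range(n_steps)]


class EarlyStopping:
    """Signal a stop after `patience` consecutive epochs without improvement."""

    def __init__(self, patience=6):
        self.patience = patience
        self.best = float("inf")
        self.bad_epochs = 0

    def step(self, val_loss):
        """Record one epoch's validation loss; return True when training should stop."""
        if val_loss < self.best:
            self.best = val_loss
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience
```

In a real training loop, each "theta" step would call the optimizer for the knockoff model and each "omega" step the optimizer for the swappers, with `EarlyStopping.step` called once per epoch on the validation loss.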