Deep Direct Likelihood Knockoffs
Authors: Mukund Sudarshan, Wesley Tansey, Rajesh Ranganath
NeurIPS 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We study the performance of DDLK on several synthetic, semi-synthetic, and real-world datasets. We compare DDLK with several non-Gaussian knockoff generation methods: Auto-Encoding Knockoffs (AEK) [15], Knockoff GAN [10], and Deep Knockoffs [20]. |
| Researcher Affiliation | Academia | Mukund Sudarshan, Courant Institute of Mathematical Sciences, New York University (sudarshan@cims.nyu.edu); Wesley Tansey, Department of Epidemiology and Biostatistics, Memorial Sloan Kettering Cancer Center (tanseyw@mskcc.org); Rajesh Ranganath, Courant Institute of Mathematical Sciences and Center for Data Science, New York University (rajeshr@cims.nyu.edu) |
| Pseudocode | Yes | Algorithm 1 DDLK |
| Open Source Code | No | The paper mentions downloading code for comparison methods and refers to Appendix D for hyperparameters, but it does not explicitly state that the code for DDLK itself is open-source or provide a link to it. |
| Open Datasets | Yes | We use RNA expression data of 963 cancer cell lines from the Genomics of Drug Sensitivity in Cancer study [24]. |
| Dataset Splits | Yes | We split the data into a training set (70%) to fit each knockoff method, a validation set (15%) used to tune the hyperparameters of each method, and a test set (15%) for evaluating knockoff statistics. (A split sketch appears after this table.) |
| Hardware Specification | Yes | We run each experiment on a single CPU with 4GB of memory. |
| Software Dependencies | No | The paper mentions using frameworks and models like PyTorch Lightning, mixture density networks, MADE, and gradient boosted regression trees, but it does not specify concrete version numbers for any of these software dependencies. |
| Experiment Setup | Yes | Across each benchmark involving DDLK, we vary only the λ entropy regularization parameter based on the amount of dependence among covariates. The number of parameters, learning rate, and all other hyperparameters are kept constant. ... We let the DDLK entropy regularization parameter λ = 0.1. In this experiment, we use gradient boosted regression trees [6, 11] as our q̂_response(y | x; γ) model, and expected log-likelihood as a knockoff statistic. (A knockoff-statistic sketch appears after this table.) |
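The Dataset Splits row above describes a 70%/15%/15% train/validation/test split. Below is a minimal sketch of such a split; the use of scikit-learn, the random seed, and the placeholder array `X` are assumptions for illustration, not details taken from the paper.

```python
# Minimal sketch of the 70/15/15 split described in the Dataset Splits row.
# Assumptions: scikit-learn's train_test_split and a placeholder covariate
# matrix X; the paper does not specify the splitting tool or random seed.
import numpy as np
from sklearn.model_selection import train_test_split

X = np.random.randn(963, 100)  # placeholder: 963 rows, as in the GDSC example

# 70% for fitting each knockoff method.
X_train, X_holdout = train_test_split(X, train_size=0.70, random_state=0)

# Split the remaining 30% evenly: 15% validation (hyperparameter tuning)
# and 15% test (evaluating knockoff statistics).
X_val, X_test = train_test_split(X_holdout, test_size=0.50, random_state=0)

print(len(X_train), len(X_val), len(X_test))  # roughly 70% / 15% / 15% of the rows
```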
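The Experiment Setup row mentions a gradient boosted regression response model and an expected-log-likelihood knockoff statistic. The sketch below shows one way such a statistic could be computed: swap each feature for its knockoff and measure the drop in held-out log-likelihood. The Gaussian noise model, the swap-one-feature construction, and all function and variable names are assumptions for illustration and may not match the paper's exact procedure.

```python
# Hedged sketch of a log-likelihood-based knockoff statistic with a gradient
# boosted regression model. Assumes a Gaussian likelihood around the model's
# prediction; this is an illustration, not the paper's exact construction.
import numpy as np
from scipy.stats import norm
from sklearn.ensemble import GradientBoostingRegressor

def gaussian_loglik(model, X, y, sigma):
    """Average log-likelihood of y under N(model(X), sigma^2)."""
    return norm.logpdf(y, loc=model.predict(X), scale=sigma).mean()

def knockoff_statistics(model, X_test, X_knockoff_test, y_test, sigma):
    """w_j = log-likelihood with the real feature j minus log-likelihood with
    feature j swapped for its knockoff; larger w_j suggests real signal."""
    base = gaussian_loglik(model, X_test, y_test, sigma)
    w = np.zeros(X_test.shape[1])
    for j in range(X_test.shape[1]):
        X_swapped = X_test.copy()
        X_swapped[:, j] = X_knockoff_test[:, j]
        w[j] = base - gaussian_loglik(model, X_swapped, y_test, sigma)
    return w

# Usage with placeholder data; in practice X_knockoff would be sampled from a
# fitted knockoff generator such as DDLK.
rng = np.random.default_rng(0)
X, y = rng.normal(size=(200, 10)), rng.normal(size=200)
X_knockoff = rng.normal(size=(200, 10))  # placeholder knockoffs
model = GradientBoostingRegressor().fit(X, y)
sigma = np.std(y - model.predict(X)) + 1e-6  # residual-based noise estimate
w = knockoff_statistics(model, X, X_knockoff, y, sigma)
```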