False Discovery Proportion control for aggregated Knockoffs
Authors: Alexandre Blain, Bertrand Thirion, Olivier Grisel, Pierre Neuvial
NeurIPS 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We demonstrate FDP control and substantial power gains over existing Knockoff-based methods in various simulation settings and achieve good sensitivity/specificity tradeoffs on brain imaging and genomic data. |
| Researcher Affiliation | Academia | Alexandre Blain (INRIA, Université Paris-Saclay; alexandre.blain@inria.fr); Bertrand Thirion (INRIA, CEA; bertrand.thirion@inria.fr); Olivier Grisel (INRIA; olivier.grisel@inria.fr); Pierre Neuvial (Institut de Mathématiques de Toulouse, Université de Toulouse; pierre.neuvial@math.univ-toulouse.fr) |
| Pseudocode | Yes | Theorem 1 states that the upper bound JER0(t) depends only on the π0 statistics and the threshold family t, not on the original data. It can therefore be estimated with arbitrary precision for any given t using Monte-Carlo simulation, as explained in the next section and described in Algorithm 1 in Supp. Mat. |
| Open Source Code | Yes | We provide a Python package containing the code for KOPI available at https://github.com/alexblnn/KOPI. |
| Open Datasets | Yes | We use the Human Connectome Project (HCP900) dataset that contains brain images of healthy young adults performing different tasks while inside an MRI scanner. Details about this dataset and empirical results can be found in Appendix E. In addition to the brain data application, we compared KOPI to other Knockoffs-based methods on gene-expression data [5] containing 79 samples and 90 genes. |
| Dataset Splits | No | We consider all 42 possible train/test pairs: the train contrast is used to obtain a ground truth, while the test contrast is used to generate the response. |
| Hardware Specification | No | No specific hardware details (e.g., GPU/CPU models, memory) used for running the experiments were provided in the paper. |
| Software Dependencies | No | The paper mentions providing a 'Python package' for KOPI, but no specific version numbers for Python or any other software dependencies (e.g., libraries, frameworks) were provided. |
| Experiment Setup | Yes | For methods that support aggregation, we use D = 50 Knockoff draws. We choose the central setting n = 500, p = 500, ρ = 0.5, sp = 0.1, SNR = 2. For each parameter, we explore a range of possible values to benchmark the methods across varied settings. We use the Human Connectome Project (HCP900) dataset... Inference is performed using a Lasso estimator... we set σ so that SNR = 4. |
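
The central simulation setting quoted in the Experiment Setup row (n = 500, p = 500, ρ = 0.5, sp = 0.1, SNR = 2) can be illustrated with a minimal Python sketch. This is not the authors' code and does not use the KOPI package. Several details are assumptions made only for illustration: a Toeplitz covariance with entries ρ^|i-j|, unit-magnitude nonzero coefficients, and an SNR defined as the ratio of signal energy to expected noise energy. The function names `simulate_data` and `false_discovery_proportion` are hypothetical.

```python
import numpy as np


def simulate_data(n=500, p=500, rho=0.5, sparsity=0.1, snr=2.0, seed=0):
    """Draw (X, y, beta) for a sparse linear model.

    Assumptions (not taken from the paper): Toeplitz covariance
    Sigma[i, j] = rho ** |i - j|, nonzero coefficients equal to 1, and
    SNR = ||X beta||^2 / (n * noise_std^2).
    """
    rng = np.random.default_rng(seed)

    # Correlated Gaussian design obtained through a Cholesky factor of Sigma.
    idx = np.arange(p)
    cov = rho ** np.abs(idx[:, None] - idx[None, :])
    chol = np.linalg.cholesky(cov)
    X = rng.standard_normal((n, p)) @ chol.T

    # Sparse coefficient vector with round(sparsity * p) active variables.
    n_active = int(round(sparsity * p))
    support = rng.choice(p, size=n_active, replace=False)
    beta = np.zeros(p)
    beta[support] = 1.0

    # Scale the noise so that the assumed SNR definition is met.
    signal = X @ beta
    noise_std = np.linalg.norm(signal) / np.sqrt(snr * n)
    y = signal + noise_std * rng.standard_normal(n)
    return X, y, beta


def false_discovery_proportion(selected, beta):
    """Fraction of selected variables whose true coefficient is zero."""
    selected = np.asarray(selected, dtype=int)
    if selected.size == 0:
        return 0.0  # convention: no selection means no false discovery
    return float(np.mean(beta[selected] == 0))
```

Given a selection returned by any Knockoff-based procedure, `false_discovery_proportion(selected, beta)` yields the quantity that FDP control is meant to bound. Varying one argument of `simulate_data` at a time mirrors the benchmark design described in the table.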