kernelPSI: a Post-Selection Inference Framework for Nonlinear Variable Selection
Authors: Lotfi Slim, Clément Chatelain, Chloé-Agathe Azencott, Jean-Philippe Vert
ICML 2019 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We first demonstrate the statistical validity of our PSI procedure, which we refer to as kernelPSI. We simulate a design matrix X of n = 100 samples and p = 50 features, partitioned into S = 10 disjoint and mutually independent subgroups of 5 features, drawn from a normal distribution centered at 0 and with covariance matrix V_ij = ρ^|i−j|, i, j ∈ {1, …, p}. We set the correlation parameter ρ to 0.6. To each group corresponds a local Gaussian kernel K_i of variance σ² = 5. The outcome Y is drawn as Y = θ K_{1:3} U_1 + ε, where K_{1:3} = K_1 + K_2 + K_3, U_1 is the eigenvector corresponding to the largest eigenvalue of K_{1:3}, and ε is Gaussian noise centered at 0. We vary the effect size θ across the range θ ∈ {0.0, 0.1, 0.2, 0.3, 0.4, 0.5}, and resample Y 1,000 times to create 1,000 simulations. (A simulation sketch follows the table.) |
| Researcher Affiliation | Collaboration | ¹Translational Sciences, SANOFI R&D, France. ²MINES ParisTech, PSL Research University, CBIO Centre for Computational Biology, F-75006 Paris, France. ³Institut Curie, PSL Research University, INSERM, U900, F-75005 Paris, France. ⁴Google Brain, F-75009 Paris, France. |
| Pseudocode | Yes | Algorithm 1: Forward stepwise kernel selection (see the hedged selection sketch after this table). |
| Open Source Code | No | The paper does not provide a specific link or explicit statement about the release of its source code. |
| Open Datasets | Yes | Here we study the flowering time phenotype FT GH of the Arabidopsis thaliana dataset of Atwell et al. (2010). |
| Dataset Splits | No | The paper mentions simulating data and splitting it, but does not specify explicit training/validation/test splits with percentages or counts, or cross-validation details for reproduction. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory) used for running experiments. |
| Software Dependencies | No | The paper does not specify software dependencies with version numbers (e.g., Python 3.x, PyTorch 1.x). |
| Experiment Setup | No | The paper describes the experimental design and setup for simulations and a case study, but does not provide specific hyperparameter values (e.g., learning rate, batch size, epochs, optimizer settings) needed for reproduction. |
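
The simulation protocol quoted in the Research Type row can be reproduced in outline as follows. This is a minimal sketch, assuming a Gaussian (RBF) kernel of the form exp(−‖x − x′‖² / (2σ²)) with σ² = 5 and an AR(1) covariance ρ^|i−j| within each group; all function and variable names are illustrative and not taken from the authors' code.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p, n_groups, group_size = 100, 50, 10, 5
rho, sigma2, theta = 0.6, 5.0, 0.3  # theta varies over {0.0, ..., 0.5} in the paper

# Within-group AR(1) covariance V_ij = rho^|i-j|; the 10 groups are independent.
idx = np.arange(group_size)
V = rho ** np.abs(idx[:, None] - idx[None, :])
X = np.hstack([rng.multivariate_normal(np.zeros(group_size), V, size=n)
               for _ in range(n_groups)])

def gaussian_kernel(Z, sigma2):
    """RBF kernel K(x, x') = exp(-||x - x'||^2 / (2 * sigma2))."""
    sq = np.sum(Z ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * Z @ Z.T
    return np.exp(-np.maximum(d2, 0.0) / (2.0 * sigma2))

# One local Gaussian kernel per group of 5 features.
kernels = [gaussian_kernel(X[:, g * group_size:(g + 1) * group_size], sigma2)
           for g in range(n_groups)]

# Outcome: Y = theta * K_{1:3} U_1 + eps, with U_1 the leading eigenvector of K_{1:3}.
K13 = kernels[0] + kernels[1] + kernels[2]
eigvals, eigvecs = np.linalg.eigh(K13)
U1 = eigvecs[:, -1]                  # eigenvector of the largest eigenvalue
Y = theta * K13 @ U1 + rng.normal(size=n)
```

Resampling the noise (and hence Y) 1,000 times for each value of θ reproduces the 1,000 simulations described above.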
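
The "Algorithm 1: Forward stepwise kernel selection" referenced in the Pseudocode row is not reproduced in this report; the sketch below only illustrates the general idea of greedy forward selection over candidate kernels, assuming an HSIC-style association score between a kernel and the outcome. The scoring statistic, its normalization, and the stopping rule of the paper's actual algorithm may differ.

```python
import numpy as np

def hsic_score(K, L):
    """Biased HSIC-style estimate between two n x n kernel matrices K and L."""
    n = K.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n          # centering matrix
    return np.trace(K @ H @ L @ H) / (n - 1) ** 2

def forward_kernel_selection(kernels, Y, n_select):
    """Greedily add the kernel whose inclusion most increases the score."""
    L = np.outer(Y, Y)                           # linear kernel on the outcome
    selected, remaining = [], list(range(len(kernels)))
    K_sum = np.zeros_like(kernels[0])
    for _ in range(n_select):
        scores = [hsic_score(K_sum + kernels[j], L) for j in remaining]
        best = remaining[int(np.argmax(scores))]
        selected.append(best)
        remaining.remove(best)
        K_sum = K_sum + kernels[best]
    return selected

# With the simulated kernels and Y from the sketch above, one would call, e.g.:
# forward_kernel_selection(kernels, Y, n_select=3)
```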