kernelPSI: a Post-Selection Inference Framework for Nonlinear Variable Selection

Authors: Lotfi Slim, Clément Chatelain, Chloé-Agathe Azencott, Jean-Philippe Vert

ICML 2019

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We first demonstrate the statistical validity of our PSI procedure, which we refer to as kernelPSI. We simulate a design matrix X of n = 100 samples and p = 50 features, partitioned into S = 10 disjoint and mutually independent subgroups of 5 features each, drawn from a normal distribution centered at 0 with covariance matrix V_ij = ρ^|i−j|, i, j ∈ {1, …, 5}. We set the correlation parameter ρ to 0.6. To each group corresponds a local Gaussian kernel K_i of variance σ² = 5. The outcome Y is drawn as Y = θ K_{1:3} U_1 + ε, where K_{1:3} = K_1 + K_2 + K_3, U_1 is the eigenvector corresponding to the largest eigenvalue of K_{1:3}, and ε is Gaussian noise centered at 0. We vary the effect size θ over θ ∈ {0.0, 0.1, 0.2, 0.3, 0.4, 0.5}, and resample Y 1,000 times to create 1,000 simulations.
Researcher Affiliation | Collaboration | (1) Translational Sciences, SANOFI R&D, France; (2) MINES ParisTech, PSL Research University, CBIO Centre for Computational Biology, F-75006 Paris, France; (3) Institut Curie, PSL Research University, INSERM, U900, F-75005 Paris, France; (4) Google Brain, F-75009 Paris, France.
Pseudocode | Yes | Algorithm 1: Forward stepwise kernel selection.
Open Source Code | No | The paper provides neither a link to its source code nor an explicit statement that the code has been released.
Open Datasets | Yes | Here we study the flowering time phenotype FT GH of the Arabidopsis thaliana dataset of Atwell et al. (2010).
Dataset Splits | No | The paper mentions simulating and resampling data, but does not specify explicit training/validation/test splits (percentages or counts) or cross-validation details needed for reproduction.
Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory) used to run the experiments.
Software Dependencies | No | The paper does not specify software dependencies with version numbers (e.g., Python 3.x, PyTorch 1.x).
Experiment Setup | No | The paper describes the experimental design for the simulations and the case study, but does not report the specific hyperparameter values (e.g., learning rate, batch size, epochs, optimizer settings) needed for reproduction.
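The simulation protocol quoted under Research Type is concrete enough to sketch in code. The NumPy sketch below is an illustration under stated assumptions, not the authors' implementation: the Gaussian-kernel parameterisation exp(−‖x − x′‖² / (2σ²)), unit noise variance (the paper only says ε is centered at 0), the random seed, and the choice θ = 0.3 are all assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)  # assumed seed, for reproducibility of the sketch

n, S, ps = 100, 10, 5           # samples, groups, features per group (p = S * ps = 50)
rho, sigma2, theta = 0.6, 5.0, 0.3  # theta is one of the effect sizes {0.0, ..., 0.5}

# Within-group covariance V_ij = rho^|i - j|; groups are drawn independently.
idx = np.arange(ps)
V = rho ** np.abs(idx[:, None] - idx[None, :])
X = np.hstack([rng.multivariate_normal(np.zeros(ps), V, size=n) for _ in range(S)])

def gaussian_kernel(Z, sigma2):
    # Assumed form: K(x, x') = exp(-||x - x'||^2 / (2 * sigma2))
    sq = np.sum(Z ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * Z @ Z.T
    return np.exp(-np.maximum(d2, 0.0) / (2.0 * sigma2))

# One local Gaussian kernel per group of 5 features.
Ks = [gaussian_kernel(X[:, g * ps:(g + 1) * ps], sigma2) for g in range(S)]

# Outcome: Y = theta * K_{1:3} U_1 + eps, with U_1 the top eigenvector of K1 + K2 + K3.
K13 = Ks[0] + Ks[1] + Ks[2]
eigvals, eigvecs = np.linalg.eigh(K13)   # eigenvalues in ascending order
u1 = eigvecs[:, -1]                      # eigenvector of the largest eigenvalue
Y = theta * K13 @ u1 + rng.standard_normal(n)  # assumed unit-variance noise
```

Resampling Y 1,000 times (redrawing only the noise term for each replicate) then reproduces the 1,000 simulations per effect size described in the quote.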