Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).

$\text{G}^2\text{M}$: A Generalized Gaussian Mirror Method to Boost Feature Selection Power

Authors: Hongyu Shen, Zhizhen Jane Zhao

NeurIPS 2025 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	We demonstrate both theoretically and empirically that the proposed test statistics achieve higher power than those of Gaussian mirror and data splitting. Comparisons with other FDR-controlled frameworks on synthetic, semi-synthetic, and real datasets highlight the superior performance of the G2M method in achieving higher power while maintaining FDR control. These findings suggest the potential for the G2M method for practical applications in real-world problems.
Researcher Affiliation	Academia	Hongyu Shen¹, Zhizhen Zhao¹; ¹Department of Electrical and Computer Engineering, University of Illinois Urbana-Champaign; EMAIL
Pseudocode	Yes	Algorithm 1 (Exact algorithm). 1: Input: Design matrix X, response y, nominal FDR level q, and true scale δ or δj for the βj, j ∈ S1. 2: Output: Ŝ1 = {j | wj τq}. Algorithm 2 (Estimation algorithm). 1: Input: Design matrix X, response y, nominal FDR level q, and the number of modes for δ: k p. 2: Output: Algorithm 1(X, y, {δ̂_j}_{j=1}^p, q)
Open Source Code	Yes	Code is available at: https://github.com/skyve2012/G2M.
Open Datasets	Yes	Inflammatory Bowel Disease (IBD): The second dataset, publicly available from the Metabolomics Workbench 11, originates from a real-world study titled Longitudinal Metabolomics of the Human Microbiome in Inflammatory Bowel Disease (IBD) [20].
Dataset Splits	No	The paper discusses how synthetic data is generated and the total size (n,p) of datasets used, and runs experiments over 100 independent repetitions, but it does not specify explicit training/validation/test splits for any of the datasets used for evaluating the methods.
Hardware Specification	Yes	All experiments were carried out on an IBM AC922 server with 2x 20 core IBM POWER9 CPU @ 2.4GHz.
Software Dependencies	No	The paper does not explicitly list specific software libraries or tools with their version numbers that were used for implementation or experimentation.
Experiment Setup	Yes	Results are presented for FDR and power with (n, p) = (200, 100), at the FDR nominal level 0.1. All reported values are averaged over 100 independent repetitions.
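
The Experiment Setup row reports FDR and power averaged over 100 independent repetitions. As a rough illustration of how such averages are usually computed, the sketch below uses the standard definitions: the false discovery proportion is the number of false selections divided by the number of selections, and power is the fraction of true features recovered. This is not the authors' code. The function names, `toy_run_once`, the count of 10 true signals, and the selection probabilities are all hypothetical placeholders standing in for an actual selection procedure.

```python
import random
from statistics import mean


def fdp_and_power(selected, true_support):
    """Return (false discovery proportion, power) for one repetition."""
    selected = set(selected)
    true_support = set(true_support)
    true_pos = len(selected & true_support)
    false_pos = len(selected) - true_pos
    # Guard against division by zero when nothing is selected.
    fdp = false_pos / max(len(selected), 1)
    power = true_pos / max(len(true_support), 1)
    return fdp, power


def average_over_repetitions(run_once, n_reps=100, seed=0):
    """Average FDP and power over independent repetitions.

    `run_once(rng)` must return (selected, true_support) for one repetition.
    The mean FDP over repetitions is the empirical FDR estimate.
    """
    rng = random.Random(seed)
    pairs = [fdp_and_power(*run_once(rng)) for _ in range(n_reps)]
    return mean(f for f, _ in pairs), mean(p for _, p in pairs)


def toy_run_once(rng, p=100, n_signals=10):
    """Hypothetical placeholder selector (NOT the G2M method).

    Assumes the first `n_signals` of `p` features are true signals, keeps
    each true signal with probability 0.8 and each null with probability 0.01.
    """
    true_support = set(range(n_signals))
    selected = {j for j in true_support if rng.random() < 0.8}
    selected |= {j for j in range(n_signals, p) if rng.random() < 0.01}
    return selected, true_support


if __name__ == "__main__":
    fdr_hat, power_hat = average_over_repetitions(toy_run_once, n_reps=100)
    print(f"empirical FDR = {fdr_hat:.3f}, empirical power = {power_hat:.3f}")
```

To evaluate a real method, `toy_run_once` would be replaced by a function that simulates data with (n, p) = (200, 100), runs the selection procedure at nominal level 0.1, and returns the selected set together with the true support.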