reproducibilityindex.ai

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

High-dimensional Varying Index Coefficient Models via Stein's Identity

Authors: Sen Na, Zhuoran Yang, Zhaoran Wang, Mladen Kolar

JMLR 2019

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Finally, we conduct extensive numerical experiments to corroborate the theoretical results.
Researcher Affiliation	Academia	Sen Na, Department of Statistics, University of Chicago, Chicago, IL 60637, USA; Zhuoran Yang, Department of Operations Research and Financial Engineering, Princeton University, Princeton, NJ 08544, USA; Zhaoran Wang, Department of Industrial Engineering and Management Sciences, Northwestern University, Evanston, IL 60208, USA; Mladen Kolar, The University of Chicago Booth School of Business, Chicago, IL 60637, USA
Pseudocode	No	The paper describes mathematical models and estimation procedures using equations, but does not present any structured pseudocode or algorithm blocks.
Open Source Code	Yes	Our code is available for download at: https://github.com/senna1128/Varying-Index-Coefficient-Models.
Open Datasets	Yes	The data set was obtained from Dryad at https://datadryad.org/resource/doi:10.5061/dryad.1139fm7.
Dataset Splits	Yes	The data set has $n = 215$ samples evaluated at two locations in total, where $n_1 = 119$ of them are collected from the first population group with 45748 SNPs measured, while $n_2 = 96$ of them are collected from the second population group with 59332 SNPs measured. There are 38106 SNPs in common and we select $d_2 = 250$ from them uniformly at random. To make individuals independent from each other, in each group we only use the data evaluated at the first location for the first half of individuals and the data evaluated at the second location for the second half of individuals.
Hardware Specification	No	This work was completed in part with resources provided by the University of Chicago Research Computing Center.
Software Dependencies	No	We use default settings in CVX package (Grant and Boyd, 2008, 2012) to solve (A.1) efficiently.
Experiment Setup	Yes	According to Theorem 3 and 7, we set $\lambda_k = 30\sqrt{\log(d_1 d_2)/n}$ and $\tau = 2(n/\log(d_1 d_2))^{1/6}$. According to Theorem 9, we […] $\log(d_1 + d_2)/(n(d_1 + d_2))$ and $\lambda = 12\sqrt{(d_1 + d_2)\log(d_1 + d_2)/n}$. The precision matrix estimator we use is defined in (14) with $\kappa_2 = 2\sqrt{\log d_2/(n d_2)}$, suggested by Lemma 12. According to Theorem 13, we set $\tau = 2(n/\log(d_1 d_2))^{1/6}$ and $\lambda = 10\sqrt{\log(d_1 d_2)/n}$. The sparse precision matrix estimator is defined in (A.1) and (A.2) with truncation threshold $2(n/\log d_2)^{1/4}$ and $\gamma = 10\sqrt{\log d_2/n}$. We compute the sparse matrix estimator via (16) with $\tau = (n/\log(d_1 d_2))^{1/6}$ and $\lambda = \sqrt{\log(d_1 d_2)/n}$. The sparse precision matrix is estimated by conducting CLIME procedure with $\gamma = 5\sqrt{\log d_2/n}$.
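
The tuning parameters quoted in the Experiment Setup row all follow one pattern: a constant times a power of log(dimension)/n or its inverse. The sketch below shows how such values might be computed for given n, d1 and d2. It reads the extracted expressions as λ = c·sqrt(log(d1·d2)/n), τ = c·(n/log(d1·d2))^(1/6) and γ = c·sqrt(log(d2)/n), which is an interpretation of the garbled text and not something confirmed against the paper. The function names and default constants are hypothetical and are not taken from the authors' repository.

```python
import math


def sparse_tuning_parameters(n, d1, d2, lambda_const=30.0, tau_const=2.0):
    """Return (lam, tau) under the assumed reading of the quoted formulas.

    lam = lambda_const * sqrt(log(d1 * d2) / n)
    tau = tau_const * (n / log(d1 * d2)) ** (1 / 6)

    The names and the default constants are illustrative assumptions.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    if d1 * d2 <= 1:
        raise ValueError("d1 * d2 must exceed 1 so that the log is positive")
    log_dim = math.log(d1 * d2)
    lam = lambda_const * math.sqrt(log_dim / n)
    tau = tau_const * (n / log_dim) ** (1.0 / 6.0)
    return lam, tau


def precision_gamma(n, d2, gamma_const=10.0):
    """Return gamma = gamma_const * sqrt(log(d2) / n) (assumed reading)."""
    if n <= 0 or d2 <= 1:
        raise ValueError("need n > 0 and d2 > 1")
    return gamma_const * math.sqrt(math.log(d2) / n)


if __name__ == "__main__":
    # Hypothetical sizes chosen only for illustration.
    lam, tau = sparse_tuning_parameters(n=1000, d1=20, d2=250)
    gamma = precision_gamma(n=1000, d2=250)
    print(f"lambda={lam:.4f}, tau={tau:.4f}, gamma={gamma:.4f}")
```

Under this reading, the regularization levels λ and γ shrink as n grows, while the truncation level τ grows slowly with n.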