reproducibilityindex.ai

Elementary Estimators for High-Dimensional Linear Regression

Authors: Eunho Yang, Aurelie Lozano, Pradeep Ravikumar

ICML 2014

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	We analyze our estimators in the high-dimensional setting, and moreover provide empirical corroboration of its performance on simulated as well as real world microarray data. We demonstrate the performance of our elementary estimators on simulated as well as real-world datasets.
Researcher Affiliation	Collaboration	Eunho Yang EUNHO@CS.UTEXAS.EDU Department of Computer Science, The University of Texas, Austin, TX 78712, USA Aurélie C. Lozano ACLOZANO@US.IBM.COM IBM T.J. Watson Research Center, Yorktown Heights, NY 10598, USA Pradeep Ravikumar PRADEEPR@CS.UTEXAS.EDU Department of Computer Science, The University of Texas, Austin, TX 78712, USA
Pseudocode	No	The paper does not contain structured pseudocode or algorithm blocks.
Open Source Code	No	The paper does not provide any concrete access information (e.g., repository link, explicit statement of code release) for the source code of the methodology described.
Open Datasets	Yes	We used microarray data pertaining to isoprenoid biosynthesis in Arabidopsis thaliana (A. thaliana) provided by Wille et al. (2004).
Dataset Splits	Yes	Thus, as is standard with high-dimensional regularized convex programs, we set the tuning parameters in a holdout-validated fashion, as those that minimize the average squared error on an independent validation set of sample size n. The tuning parameters were selected using 5 fold cross-validation.
Hardware Specification	No	The paper does not provide specific details about the hardware used for running the experiments (e.g., exact GPU/CPU models, processor types, or memory amounts).
Software Dependencies	No	The paper does not provide specific ancillary software details with version numbers (e.g., library or solver names with versions) needed to replicate the experiment.
Experiment Setup	Yes	We set the number of samples to n = 1000, and the number of covariates among p ∈ {1000, 2000}. For each simulation, the entries of the true model coefficient vector θ are set to be 0 everywhere, except for a randomly chosen subset of 10 coefficients, which are chosen independently and uniformly in the interval (1, 3). There are 131 samples. All variables are log transformed. We evaluate the predictive accuracy of the methods by randomly partitioning the data into training and test sets, using 90 observations for training and the remainder for testing. The tuning parameters were selected using 5 fold cross-validation.
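
The simulated setup and the 90-observation training split quoted in the Experiment Setup row can be sketched roughly as follows. This is a minimal illustration, not the authors' code: the function names `simulate_sparse_regression` and `split_train_test` are hypothetical, and the standard Gaussian design matrix, the unit noise level, and the seeds are assumptions, since the quoted text does not specify them.

```python
import numpy as np


def simulate_sparse_regression(n=1000, p=1000, k=10, noise_std=1.0, seed=0):
    """Draw one synthetic data set with a k-sparse coefficient vector.

    Only n, p, k and the (1, 3) coefficient range come from the quoted setup;
    the Gaussian design and noise_std are illustrative assumptions.
    """
    rng = np.random.default_rng(seed)
    theta = np.zeros(p)
    # Randomly chosen support of size k, without repetition.
    support = rng.choice(p, size=k, replace=False)
    # rng.uniform samples from [1, 3); hitting exactly 1 has negligible probability.
    theta[support] = rng.uniform(1.0, 3.0, size=k)
    X = rng.standard_normal((n, p))  # assumed design distribution
    y = X @ theta + noise_std * rng.standard_normal(n)  # assumed noise model
    return X, y, theta, support


def split_train_test(n_samples, n_train, seed=0):
    """Randomly partition sample indices into training and test sets."""
    rng = np.random.default_rng(seed)
    perm = rng.permutation(n_samples)
    return perm[:n_train], perm[n_train:]


if __name__ == "__main__":
    for p in (1000, 2000):
        X, y, theta, support = simulate_sparse_regression(n=1000, p=p)
        print(p, X.shape, y.shape, np.count_nonzero(theta))

    # Real-data split sizes from the quoted setup: 131 samples, 90 for training.
    train_idx, test_idx = split_train_test(131, 90)
    print(len(train_idx), len(test_idx))
```

Tuning-parameter selection (holdout validation for the simulations, 5 fold cross-validation for the microarray data) would sit on top of these helpers and is not shown here.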