Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Multi-task learning with summary statistics

Authors: Parker Knight, Rui Duan

NeurIPS 2023

Reproducibility Variable Result LLM Response
Research Type Experimental We demonstrate our theoretical findings and the performance of the method through extensive simulations.
Researcher Affiliation Academia Parker Knight Department of Biostatistics Harvard University Boston, MA EMAIL Rui Duan Department of Biostatistics Harvard University Boston, MA EMAIL
Pseudocode No The paper describes the methods through mathematical formulations and textual explanations but does not include any pseudocode or algorithm blocks.
Open Source Code Yes The code, further implementation details, and additional simulations which explore the use of our adaptive tuning procedure are also available in the supplement.
Open Datasets Yes We use multi-site data obtained from the electronic Medical Records and Genomics (eMERGE) network [28], which includes individual-level genotype data from multiple research sites in the United States.
Dataset Splits No We split the data (with sample sizes n1 = 3813, n2 = 546, n3 = 2666, n4 = 1435, n5 = 525) at each task into a training and test set (with a test set data size of 100 for each task) and evaluate the performance of our method using the prediction MSE on the test set.
Hardware Specification No The paper does not specify the hardware used for its experiments, such as CPU or GPU models, or cloud computing resources.
Software Dependencies No The paper does not provide specific software dependencies with version numbers (e.g., programming languages, libraries, frameworks) used for implementation or experiments.
Experiment Setup Yes We generate synthetic Gaussian data with nmin = 100, p = 100, per-task sample sizes τ·nmin for τ ∈ {0.5, 1, 2, 5, 10}, and ρq = 0 for each q. The number of tasks was fixed at 8. Furthermore, we generate a row-sparse B matrix with 10 nonzero rows and a B with rank 2 for the sparse and low-rank multi-task estimators, respectively.
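The per-task train/test split quoted in the Dataset Splits row (five tasks with sample sizes n1 = 3813, n2 = 546, n3 = 2666, n4 = 1435, n5 = 525, holding out 100 test observations per task) can be sketched as follows. This is a minimal illustration, not the authors' code; the names `sample_sizes`, `splits`, and the use of a NumPy permutation are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Per-task sample sizes n_1..n_5 from the quoted passage
sample_sizes = [3813, 546, 2666, 1435, 525]
test_size = 100  # held-out test observations per task

# For each task, shuffle indices and carve off the first 100 as the test set
splits = []
for n in sample_sizes:
    idx = rng.permutation(n)
    splits.append((idx[test_size:], idx[:test_size]))  # (train indices, test indices)
```

Prediction MSE on a task would then be computed on the held-out indices only, e.g. `np.mean((y[test] - X[test] @ beta) ** 2)` for a fitted coefficient vector `beta`.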
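The synthetic setup quoted in the Experiment Setup row (Gaussian designs with nmin = 100, p = 100, 8 tasks, a row-sparse B with 10 nonzero rows, and a rank-2 B) can be sketched as below. This is a hedged reconstruction under stated assumptions: the interpretation of τ·nmin as the per-task sample size, the choice τ = 2, and all variable names are illustrative, not taken from the paper's code.

```python
import numpy as np

rng = np.random.default_rng(0)

p, n_min, n_tasks = 100, 100, 8  # dimensions from the quoted setup
tau = 2.0                        # one value from {0.5, 1, 2, 5, 10}
n_q = int(tau * n_min)           # assumed per-task sample size

# Row-sparse coefficient matrix B (p x n_tasks) with 10 nonzero rows
B_sparse = np.zeros((p, n_tasks))
support = rng.choice(p, size=10, replace=False)
B_sparse[support] = rng.normal(size=(10, n_tasks))

# Rank-2 coefficient matrix B (p x n_tasks), built as a product of two factors
B_lowrank = rng.normal(size=(p, 2)) @ rng.normal(size=(2, n_tasks))

# Per-task isotropic Gaussian designs (rho_q = 0) and noisy linear responses
data = []
for q in range(n_tasks):
    X = rng.normal(size=(n_q, p))
    y = X @ B_sparse[:, q] + rng.normal(size=n_q)
    data.append((X, y))
```

The sparse and low-rank coefficient matrices correspond to the two multi-task estimators the paper evaluates; swapping `B_sparse` for `B_lowrank` in the response generation yields the low-rank scenario.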