reproducibilityindex.ai

Task-Agnostic Machine-Learning-Assisted Inference

Authors: Jiacheng Miao, Qiongshi Lu

NeurIPS 2024

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Through extensive experiments, we showcase our method's validity, versatility, and superiority compared to existing approaches.
Researcher Affiliation	Academia	Jiacheng Miao University of Wisconsin-Madison jiacheng.miao@wisc.edu Qiongshi Lu University of Wisconsin-Madison qlu@biostat.wisc.edu
Pseudocode	Yes	Algorithm 1 PSPS for ML-assisted inference
Open Source Code	Yes	Our software is available at https://github.com/qlu-lab/psps.
Open Datasets	Yes	We used data from the UK Biobank [13], which includes 36,971 labeled and 319,548 unlabeled samples with 9,450,880 genetic variants after quality control.
Dataset Splits	Yes	Prediction in the labeled sample was implemented through cross-validation to avoid overfitting. The implementation detail is deferred to Appendix D. We select the predictive variables and train the Soft Impute model using 90% of the labeled data. We then perform predictions on the remaining 10% in each fold and repeat this process 10 times across all folds.
Hardware Specification	Yes	All our simulation is run in R with version 4.2.1 (2022-06-23) in a MacBook Air with an M1 chip.
Software Dependencies	Yes	All our simulation is run in R with version 4.2.1 (2022-06-23) in a MacBook Air with an M1 chip.
Experiment Setup	Yes	A pre-trained random forest with 500 trees to grow is obtained from hold-out data. We bootstrap the labeled data for 200 times for covariance estimation. All simulations are repeated 1000 times.
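
The Dataset Splits and Experiment Setup rows describe two generic resampling steps: predicting each labeled sample from a model trained on the other folds (10 folds, 90%/10%), and bootstrapping the labeled data 200 times to estimate a covariance. The sketch below illustrates both steps in Python using only the standard library. It is not the authors' R implementation and not the PSPS algorithm itself. All function names are hypothetical, the one-variable least-squares fit is a stand-in for the paper's predictors (random forest, Soft Impute), and the choice of bootstrapping the pair of sample means is an assumption made purely for illustration.

```python
import random
from statistics import fmean


def kfold_out_of_fold_predictions(x, y, fit, n_folds=10, seed=0):
    """Predict every labeled sample with a model trained on the other folds."""
    n = len(y)
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    # Interleaved slices give n_folds disjoint folds of near-equal size.
    folds = [idx[k::n_folds] for k in range(n_folds)]
    preds = [None] * n
    for held_out in folds:
        held = set(held_out)
        train = [i for i in idx if i not in held]
        model = fit([x[i] for i in train], [y[i] for i in train])
        for i in held_out:
            preds[i] = model(x[i])
    return preds


def fit_least_squares(xs, ys):
    """Stand-in predictor (assumption): one-variable least squares."""
    mx, my = fmean(xs), fmean(ys)
    sxx = sum((a - mx) ** 2 for a in xs)
    slope = sum((a - mx) * (b - my) for a, b in zip(xs, ys)) / sxx
    return lambda a: my + slope * (a - mx)


def bootstrap_covariance(y, y_hat, n_boot=200, seed=0):
    """Bootstrap covariance between mean(y) and mean(y_hat) on labeled data."""
    rng = random.Random(seed)
    n = len(y)
    stats = []
    for _ in range(n_boot):
        # Resample labeled indices with replacement, keeping (y, y_hat) paired.
        draw = [rng.randrange(n) for _ in range(n)]
        stats.append((fmean(y[i] for i in draw), fmean(y_hat[i] for i in draw)))
    m0 = fmean(s[0] for s in stats)
    m1 = fmean(s[1] for s in stats)
    return sum((s[0] - m0) * (s[1] - m1) for s in stats) / (n_boot - 1)


# Synthetic labeled data (illustrative only, unrelated to the UK Biobank).
data_rng = random.Random(1)
x = [data_rng.gauss(0, 1) for _ in range(200)]
y = [2.0 * a + data_rng.gauss(0, 0.5) for a in x]

y_hat = kfold_out_of_fold_predictions(x, y, fit_least_squares)
cov = bootstrap_covariance(y, y_hat)
```

Because each prediction comes from a model that never saw that sample, the predictions are not inflated by overfitting, which is the reason the table gives for using cross-validation. Resampling the paired values (y, y_hat) together preserves their dependence, so the bootstrap covariance reflects how the two estimates vary jointly.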