Task-Agnostic Machine-Learning-Assisted Inference

Authors: Jiacheng Miao, Qiongshi Lu

NeurIPS 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Through extensive experiments, we showcase our method's validity, versatility, and superiority compared to existing approaches.
Researcher Affiliation | Academia | Jiacheng Miao, University of Wisconsin-Madison, jiacheng.miao@wisc.edu; Qiongshi Lu, University of Wisconsin-Madison, qlu@biostat.wisc.edu
Pseudocode | Yes | Algorithm 1 PSPS for ML-assisted inference
Open Source Code | Yes | Our software is available at https://github.com/qlu-lab/psps.
Open Datasets | Yes | We used data from the UK Biobank [13], which includes 36,971 labeled and 319,548 unlabeled samples with 9,450,880 genetic variants after quality control.
Dataset Splits | Yes | Prediction in the labeled sample was implemented through cross-validation to avoid overfitting. The implementation detail is deferred to Appendix D. We select the predictive variables and train the Soft Impute model using 90% of the labeled data. We then perform predictions on the remaining 10% in each fold and repeat this process 10 times across all folds.
Hardware Specification | Yes | All our simulation is run in R with version 4.2.1 (2022-06-23) in a MacBook Air with an M1 chip.
Software Dependencies | Yes | All our simulation is run in R with version 4.2.1 (2022-06-23) in a MacBook Air with an M1 chip.
Experiment Setup | Yes | A pre-trained random forest with 500 trees to grow is obtained from hold-out data. We bootstrap the labeled data for 200 times for covariance estimation. All simulations are repeated 1000 times.
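
The Dataset Splits and Experiment Setup entries describe two generic steps: predicting each labeled sample from a model trained on the other 90% of the labeled data (10 folds), and bootstrapping the labeled data 200 times for covariance estimation. The sketch below illustrates both steps in Python under stated assumptions. It is not the authors' PSPS implementation, which is in R at the repository listed above. The one-predictor least-squares model stands in for the Soft Impute and random forest predictors, the simulated data are invented for illustration, and every function name here is hypothetical.

```python
import random
import statistics


def kfold_indices(n, k, rng):
    """Shuffle 0..n-1 and split the indices into k nearly equal folds."""
    idx = list(range(n))
    rng.shuffle(idx)
    return [idx[i::k] for i in range(k)]


def fit_line(xs, ys):
    """Least-squares slope and intercept (toy stand-in for the real predictor)."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return slope, my - slope * mx


def out_of_fold_predictions(xs, ys, k=10, seed=0):
    """Predict each labeled point with a model trained on the other k-1 folds."""
    rng = random.Random(seed)
    preds = [None] * len(xs)
    for fold in kfold_indices(len(xs), k, rng):
        held_out = set(fold)
        train = [i for i in range(len(xs)) if i not in held_out]
        slope, intercept = fit_line([xs[i] for i in train], [ys[i] for i in train])
        for i in fold:
            preds[i] = intercept + slope * xs[i]
    return preds


def bootstrap_covariance(ys, preds, n_boot=200, seed=0):
    """Bootstrap covariance between mean(y) and mean(prediction).

    Rows are resampled with replacement so each (y, prediction) pair stays together.
    """
    rng = random.Random(seed)
    n = len(ys)
    mean_y, mean_p = [], []
    for _ in range(n_boot):
        rows = [rng.randrange(n) for _ in range(n)]
        mean_y.append(statistics.fmean(ys[i] for i in rows))
        mean_p.append(statistics.fmean(preds[i] for i in rows))
    my, mp = statistics.fmean(mean_y), statistics.fmean(mean_p)
    return sum((a - my) * (b - mp) for a, b in zip(mean_y, mean_p)) / (n_boot - 1)


# Simulated labeled sample (assumption: one covariate, linear signal plus noise).
data_rng = random.Random(1)
xs = [data_rng.gauss(0, 1) for _ in range(200)]
ys = [2.0 * x + data_rng.gauss(0, 0.5) for x in xs]

preds = out_of_fold_predictions(xs, ys, k=10)
cov = bootstrap_covariance(ys, preds, n_boot=200)
```

Training on nine folds and predicting on the tenth means no labeled sample is predicted by a model that saw its own outcome, which is the overfitting concern the Dataset Splits entry mentions. Resampling whole rows in the bootstrap preserves the dependence between each observed outcome and its prediction, which is what a covariance estimate between the two summaries needs.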