Task-Agnostic Machine-Learning-Assisted Inference
Authors: Jiacheng Miao, Qiongshi Lu
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Through extensive experiments, we showcase our method's validity, versatility, and superiority compared to existing approaches. |
| Researcher Affiliation | Academia | Jiacheng Miao, University of Wisconsin-Madison, jiacheng.miao@wisc.edu; Qiongshi Lu, University of Wisconsin-Madison, qlu@biostat.wisc.edu |
| Pseudocode | Yes | Algorithm 1 PSPS for ML-assisted inference |
| Open Source Code | Yes | Our software is available at https://github.com/qlu-lab/psps. |
| Open Datasets | Yes | We used data from the UK Biobank [13], which includes 36,971 labeled and 319,548 unlabeled samples with 9,450,880 genetic variants after quality control. |
| Dataset Splits | Yes | Prediction in the labeled sample was implemented through cross-validation to avoid overfitting. The implementation detail is deferred to Appendix D. We select the predictive variables and train the Soft Impute model using 90% of the labeled data. We then perform predictions on the remaining 10% in each fold and repeat this process 10 times across all folds. |
| Hardware Specification | Yes | All our simulations are run in R version 4.2.1 (2022-06-23) on a MacBook Air with an M1 chip. |
| Software Dependencies | Yes | All our simulations are run in R version 4.2.1 (2022-06-23) on a MacBook Air with an M1 chip. |
| Experiment Setup | Yes | A pre-trained random forest with 500 trees is obtained from hold-out data. We bootstrap the labeled data 200 times for covariance estimation. All simulations are repeated 1000 times. |
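
The Dataset Splits and Experiment Setup rows describe two resampling steps: 10-fold out-of-fold prediction on the labeled sample (train on 90%, predict the remaining 10%) and a 200-replicate bootstrap of the labeled data for covariance estimation. The Python sketch below illustrates both on toy data. It is not the authors' R implementation: the least-squares line is a stand-in for the paper's predictors, the bootstrapped pair of statistics (mean outcome and mean prediction) is an assumption chosen for illustration, and all function names are hypothetical.

```python
import random
import statistics


def kfold_indices(n, k, rng):
    """Shuffle 0..n-1 and split the indices into k nearly equal folds."""
    idx = list(range(n))
    rng.shuffle(idx)
    return [idx[i::k] for i in range(k)]


def fit_line(xs, ys):
    """Least-squares fit of y = a + b * x (stand-in for the real predictor)."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b


def out_of_fold_predictions(xs, ys, k=10, seed=0):
    """Train on k-1 folds (90% when k=10) and predict the held-out fold."""
    rng = random.Random(seed)
    preds = [None] * len(xs)
    for fold in kfold_indices(len(xs), k, rng):
        held_out = set(fold)
        train = [i for i in range(len(xs)) if i not in held_out]
        a, b = fit_line([xs[i] for i in train], [ys[i] for i in train])
        for i in fold:
            preds[i] = a + b * xs[i]
    return preds


def bootstrap_covariance(ys, preds, n_boot=200, seed=0):
    """Bootstrap covariance between mean(y) and mean(prediction).

    Assumption: these two sample means are illustrative statistics only.
    """
    rng = random.Random(seed)
    n = len(ys)
    mean_y, mean_p = [], []
    for _ in range(n_boot):
        # Resample labeled rows with replacement, keeping (y, prediction) paired.
        draw = [rng.randrange(n) for _ in range(n)]
        mean_y.append(statistics.fmean(ys[i] for i in draw))
        mean_p.append(statistics.fmean(preds[i] for i in draw))
    centre_y, centre_p = statistics.fmean(mean_y), statistics.fmean(mean_p)
    cross = sum((a - centre_y) * (b - centre_p) for a, b in zip(mean_y, mean_p))
    return cross / (n_boot - 1)


# Toy labeled sample (synthetic, not from the paper).
data_rng = random.Random(1)
xs = [data_rng.gauss(0, 1) for _ in range(200)]
ys = [2.0 * x + data_rng.gauss(0, 0.5) for x in xs]

preds = out_of_fold_predictions(xs, ys, k=10)
cov = bootstrap_covariance(ys, preds, n_boot=200)
```

Each observation is predicted by a model that never saw it during training, which is the overfitting safeguard the Dataset Splits row refers to. Resampling whole rows keeps each outcome paired with its prediction, so the bootstrap preserves the dependence between the two statistics.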