Domain constraints improve risk prediction when outcome data is missing
Authors: Sidhika Balachandar, Nikhil Garg, Emma Pierson
ICLR 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We show theoretically and on synthetic data that domain constraints improve parameter inference. We apply our model to a case study of cancer risk prediction, showing that the model's inferred risk predicts cancer diagnoses, its inferred testing policy captures known public health policies, and it can identify suboptimalities in test allocation. Though our case study is in healthcare, our analysis reveals a general class of domain constraints which can improve model estimation in many settings. |
| Researcher Affiliation | Academia | Sidhika Balachandar Cornell Tech Nikhil Garg Cornell Tech Emma Pierson Cornell Tech |
| Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | Details are in Appendix D and the code is at https://github.com/sidhikabalachandar/domain_constraints. |
| Open Datasets | Yes | Our data comes from the UK Biobank (Sudlow et al., 2015), which contains information on health, demographics, and genetics for the UK (see Appendix E for details). |
| Dataset Splits | No | The paper mentions evaluating on a 'test set' in Section 5.2, but does not provide specific percentages, sample counts, or explicit methodology for training, validation, and test splits needed to reproduce the data partitioning. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., exact GPU/CPU models, memory amounts, or detailed computer specifications) used for running its experiments. |
| Software Dependencies | No | The paper mentions using the 'Bayesian inference package Stan' (Carpenter et al., 2017) but does not specify a version number for it or any other software dependencies. |
| Experiment Setup | No | The paper describes the model architecture and general experimental approach in Sections 4 and 5.1, but does not provide specific hyperparameters (e.g., learning rate, batch size, number of epochs) or detailed training configurations. |