Domain constraints improve risk prediction when outcome data is missing

Authors: Sidhika Balachandar, Nikhil Garg, Emma Pierson

ICLR 2024

Each entry below lists a reproducibility variable, the assessed result, and the LLM's supporting response.
Research Type: Experimental. "We show theoretically and on synthetic data that domain constraints improve parameter inference. We apply our model to a case study of cancer risk prediction, showing that the model's inferred risk predicts cancer diagnoses, its inferred testing policy captures known public health policies, and it can identify suboptimalities in test allocation. Though our case study is in healthcare, our analysis reveals a general class of domain constraints which can improve model estimation in many settings."
Researcher Affiliation: Academia. Sidhika Balachandar (Cornell Tech), Nikhil Garg (Cornell Tech), Emma Pierson (Cornell Tech).
Pseudocode: No. The paper does not contain structured pseudocode or algorithm blocks.
Open Source Code: Yes. Details are in Appendix D, and the code is at https://github.com/sidhikabalachandar/domain_constraints.
Open Datasets: Yes. "Our data comes from the UK Biobank (Sudlow et al., 2015), which contains information on health, demographics, and genetics for the UK (see Appendix E for details)."
Dataset Splits: No. The paper mentions evaluating on a "test set" in Section 5.2 but does not give the percentages, sample counts, or splitting methodology needed to reproduce the train/validation/test partition.
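For reference, the kind of reproducible partitioning this check looks for can be stated in a few lines. The sketch below is illustrative only: the 70/15/15 ratios and the seed are assumptions, not values taken from the paper.

```python
import random

def train_val_test_split(ids, val_frac=0.15, test_frac=0.15, seed=0):
    """Deterministically partition record IDs into train/val/test sets."""
    ids = sorted(ids)          # fix the ordering before shuffling
    rng = random.Random(seed)  # seeded RNG makes the split reproducible
    rng.shuffle(ids)
    n = len(ids)
    n_test = int(n * test_frac)
    n_val = int(n * val_frac)
    test = ids[:n_test]
    val = ids[n_test:n_test + n_val]
    train = ids[n_test + n_val:]
    return train, val, test

train, val, test = train_val_test_split(range(1000))
print(len(train), len(val), len(test))  # 700 150 150
```

Reporting the seed, ratios, and shuffle procedure together is what makes a split statement reproducible rather than merely descriptive.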
Hardware Specification: No. The paper does not report the hardware used for its experiments (e.g., GPU/CPU models or memory amounts).
Software Dependencies: No. The paper mentions using the Bayesian inference package Stan (Carpenter et al., 2017) but does not pin a version for it or for any other software dependency.
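Recording pinned versions of the kind this check flags as missing is easy to automate. A minimal sketch using only Python's standard library follows; `cmdstanpy` appears purely as an assumed example of a Stan interface, not a dependency the paper confirms.

```python
import sys
from importlib import metadata

def environment_report(packages):
    """Record interpreter and package versions for a reproducibility appendix."""
    lines = [f"python {sys.version.split()[0]}"]
    for pkg in packages:
        try:
            lines.append(f"{pkg} {metadata.version(pkg)}")
        except metadata.PackageNotFoundError:
            lines.append(f"{pkg} (not installed)")
    return "\n".join(lines)

# 'cmdstanpy' stands in for the Stan interface; substitute the packages actually used.
print(environment_report(["cmdstanpy", "numpy"]))
```

Emitting such a report alongside published code lets readers reconstruct the software environment without guessing at versions.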
Experiment Setup: No. The paper describes the model architecture and general experimental approach in Sections 4 and 5.1 but does not report specific hyperparameters or detailed training/sampling configurations.
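Since the paper's inference runs through Stan, the missing configuration would likely take the form of sampler settings rather than neural-network hyperparameters. The values below are placeholders chosen for illustration; none of them come from the paper.

```python
# Placeholder MCMC sampler configuration of the sort this check looks for.
# Every value here is an assumption, not a setting reported in the paper.
experiment_setup = {
    "chains": 4,             # independent MCMC chains
    "warmup_iters": 1000,    # burn-in draws discarded per chain
    "sampling_iters": 1000,  # retained posterior draws per chain
    "seed": 0,               # RNG seed for reproducibility
}

total_draws = experiment_setup["chains"] * experiment_setup["sampling_iters"]
print(total_draws)  # 4000
```

Publishing such a block, even as a short appendix table, is enough to make the posterior sampling step reproducible.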