HEALNet: Multimodal Fusion for Heterogeneous Biomedical Data

Authors: Konstantin Hemker, Nikola Simidjievski, Mateja Jamnik

NeurIPS 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We conduct multimodal survival analysis on Whole Slide Images and multi-omic data on four cancer datasets from The Cancer Genome Atlas (TCGA). HEALNet achieves state-of-the-art performance compared to other end-to-end trained fusion models, substantially improving over unimodal and multimodal baselines whilst being robust in scenarios with missing modalities.
Researcher Affiliation | Academia | Konstantin Hemker, Department of Computer Science & Technology, University of Cambridge, Cambridge, United Kingdom, konstantin.hemker@cl.cam.ac.uk; Nikola Simidjievski, PBCI, Department of Oncology, University of Cambridge, Cambridge, United Kingdom, ns779@cam.ac.uk; Mateja Jamnik, Department of Computer Science & Technology, University of Cambridge, Cambridge, United Kingdom, mateja.jamnik@cl.cam.ac.uk
Pseudocode | Yes | The HEALNet pseudocode is detailed further in Appendix A.
Open Source Code | Yes | The code is available at https://github.com/konst-int-i/healnet.
Open Datasets | Yes | The results shown in this paper are based upon data generated by the TCGA Research Network: https://www.cancer.gov/tcga. The Cancer Genome Atlas (TCGA) is an open-source genomics program run by the United States National Cancer Institute (NCI) and National Human Genome Research Institute, containing a total of 2.5 petabytes of genomic, epigenomic, transcriptomic, and proteomic data.
Dataset Splits | Yes | For each experiment we employ 5 folds of repeated random sub-sampling (Monte Carlo cross-validation) with a 70-15-15 split for the training, validation, and test sets. (A minimal split sketch is given below the table.)
Hardware Specification | Yes | All experiments were run on a single Nvidia A100 80GB GPU running on an Ubuntu 22.04 virtual machine.
Software Dependencies | No | HEALNet is implemented in the PyTorch framework and available open-source at https://github.com/konst-int-i/healnet. All experiments were run on a single Nvidia A100 80GB GPU running on an Ubuntu 22.04 virtual machine.
Experiment Setup | Yes | For each experiment we employ 5 folds of repeated random sub-sampling (Monte Carlo cross-validation) with a 70-15-15 split for the training, validation, and test sets. All reported results show the model's performance on test data that was not used during training or validation. We re-train all of the baseline models using the code reported in the respective papers. All models have been run under the same conditions and using the same evaluation framework (including data splits and loss weighting). For hyperparameter tuning, we ran a Bayesian hyperparameter search [Bergstra et al., 2013] for all training parameters across models. Model-specific parameters of the baselines were tuned if the optimal parameters on the TCGA datasets were not available. The final set of hyperparameters can be found in Appendix D. (A minimal hyperparameter search sketch is given below the table.)
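
The 5-fold repeated random sub-sampling with a 70-15-15 split described under Dataset Splits and Experiment Setup can be reproduced with a few lines of NumPy. The sketch below is illustrative only: the function name, patient count, and seed are assumptions and are not taken from the HEALNet codebase.

# Minimal sketch of 5-fold Monte Carlo cross-validation (repeated random
# sub-sampling) with a 70-15-15 train/validation/test split.
# The sample count and seed below are illustrative, not the paper's settings.
import numpy as np

def monte_carlo_splits(n_samples, n_folds=5, seed=42):
    """Yield (train, val, test) index arrays for each random sub-sample."""
    rng = np.random.default_rng(seed)
    n_train = int(0.70 * n_samples)
    n_val = int(0.15 * n_samples)
    for _ in range(n_folds):
        perm = rng.permutation(n_samples)
        yield (perm[:n_train],
               perm[n_train:n_train + n_val],
               perm[n_train + n_val:])

# Example: five independent random splits over a hypothetical 1000-patient cohort.
for fold, (train_idx, val_idx, test_idx) in enumerate(monte_carlo_splits(1000)):
    print(fold, len(train_idx), len(val_idx), len(test_idx))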
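
The Bayesian hyperparameter search cited under Experiment Setup references Bergstra et al. [2013], commonly associated with the Tree-structured Parzen Estimator implemented in the hyperopt library. The sketch below shows a minimal hyperopt search loop under that assumption; the search space and the placeholder objective are illustrative and do not reflect the actual HEALNet hyperparameters listed in Appendix D.

# Minimal sketch of a Bayesian (TPE) hyperparameter search with hyperopt.
# The search space and the dummy objective are illustrative assumptions.
import numpy as np
from hyperopt import fmin, tpe, hp, Trials

space = {
    "lr": hp.loguniform("lr", np.log(1e-5), np.log(1e-2)),
    "dropout": hp.uniform("dropout", 0.0, 0.5),
    "batch_size": hp.choice("batch_size", [16, 32, 64]),
}

def objective(params):
    # Placeholder: train the model with `params` on the training split and
    # return a loss, e.g. 1 - validation concordance index for survival models.
    val_c_index = np.random.uniform(0.5, 0.7)  # stand-in for a real training run
    return 1.0 - val_c_index

trials = Trials()
best = fmin(fn=objective, space=space, algo=tpe.suggest,
            max_evals=50, trials=trials)
print("Best hyperparameters:", best)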