Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
Democratizing Clinical Risk Prediction with Cross-Cohort Cross-Modal Knowledge Transfer
Authors: Qiannan Zhang, Manqi Zhou, Zilong Bai, Chang Su, Fei Wang
NeurIPS 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments on real-world clinical data validate the effectiveness of our proposed model. |
| Researcher Affiliation | Academia | Qiannan Zhang, Manqi Zhou, Zilong Bai, Chang Su, Fei Wang. Weill Cornell Medicine, Cornell University. |
| Pseudocode | Yes | Algorithm 1 The Training Procedure of C3M on the Source Cohort |
| Open Source Code | Yes | We release the code in https://github.com/graph-ehr/C3M. |
| Open Datasets | Yes | We leverage the national All of Us Research Platform [39] as the source cohort, and three target cohorts respectively from one local EHR data warehouse and two sub-networks (denoted as INSIGHT-A and INSIGHT-B) from the INSIGHT Clinical Research Network [1] to simulate our setting. |
| Dataset Splits | Yes | For datasets, the All of Us cohort is randomly split into training/validation/testing at a 6:2:2 ratio. |
| Hardware Specification | No | fine-tuning and evaluation on the target cohorts are both performed in CPU-only environments to assess practical deployment feasibility. |
| Software Dependencies | No | The paper mentions software components like 'Adam optimizer', 'GCN', 'Transformer encoder', 'MLP', and various hyperparameters, but does not provide specific version numbers for key software components (e.g., Python, PyTorch, CUDA). |
| Experiment Setup | Yes | To conduct graph-guided finetuning to obtain phenotypical representations, a two-layer GCN is adopted with 16 hidden units, along with a 16-dimensional embedding layer to represent 2634 medical concept nodes, and one transformation layer that transforms the foundation model output to initialize patient nodes. In addition, the transformer encoder consists of two layers with two heads, and we determine the expert number via search in {1, 2, 3, 4}, while the gene decoder is an MLP with one hidden layer. Attention modulation is achieved using a multi-head attention mechanism with two heads. Both the teacher and student models are implemented as multi-layer perceptrons. The trade-off parameter β for gene feature reconstruction is selected via grid search over {0.01, 0.05, 0.1, 0.5, 1} and set as 0.1. The trade-off parameter λ_KD of knowledge distillation for the student model is set as 0.01 by grid search over {0.001, 0.005, 0.01, 0.05, 0.1, 0.5, 1.0}. The learning rate of C3M and baseline models is selected from {0.01, 0.001, 0.0005}. |
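
The split ratio and search ranges quoted in the table can be written down as a small, self-contained sketch. This is not the authors' code: the names `split_622`, `SEARCH_SPACE`, and `iter_configs` are hypothetical, the seed is arbitrary, and the quoted text does not say whether the hyperparameters were searched jointly or one at a time, so the joint enumeration below is an assumption made only to show the size of the full grid.

```python
import itertools
import random


def split_622(ids, seed=0):
    """Randomly split ids into train/validation/test at a 6:2:2 ratio.

    Hypothetical helper; the seed and rounding rule are assumptions.
    """
    ids = list(ids)
    random.Random(seed).shuffle(ids)
    n_train = len(ids) * 6 // 10
    n_val = len(ids) * 2 // 10
    train = ids[:n_train]
    val = ids[n_train:n_train + n_val]
    test = ids[n_train + n_val:]  # remainder goes to the test set
    return train, val, test


# Candidate values as quoted in the "Experiment Setup" row.
SEARCH_SPACE = {
    "num_experts": [1, 2, 3, 4],
    "beta": [0.01, 0.05, 0.1, 0.5, 1],
    "lambda_kd": [0.001, 0.005, 0.01, 0.05, 0.1, 0.5, 1.0],
    "learning_rate": [0.01, 0.001, 0.0005],
}


def iter_configs(space):
    """Yield every combination of the candidate values (joint grid, an assumption)."""
    keys = list(space)
    for values in itertools.product(*(space[k] for k in keys)):
        yield dict(zip(keys, values))


# Placeholder patient ids stand in for the real cohort, which is not public here.
train, val, test = split_622(range(1000))
configs = list(iter_configs(SEARCH_SPACE))
```

With 1000 placeholder ids the split yields 600/200/200 records, and the joint grid contains 4 × 5 × 7 × 3 = 420 configurations. The values reported as selected in the table (β = 0.1, λ_KD = 0.01) are one point in that grid.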