Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Clairvoyance: A Pipeline Toolkit for Medical Time Series

Authors: Daniel Jarrett, Jinsung Yoon, Ioana Bica, Zhaozhi Qian, Ari Ercole, Mihaela van der Schaar

ICLR 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Through illustrative examples on real-world data in outpatient, general wards, and intensive-care settings, we illustrate the applicability of the pipeline paradigm on core tasks in the healthcare journey."
Researcher Affiliation | Collaboration | Daniel Jarrett: University of Cambridge, UK (EMAIL). Ioana Bica: University of Oxford, UK; The Alan Turing Institute, UK (EMAIL). Ari Ercole: University of Cambridge, UK; Cambridge University Hospitals NHS Foundation Trust, UK (EMAIL). Jinsung Yoon: Google Cloud AI, Sunnyvale, USA; University of California, Los Angeles, USA (EMAIL). Zhaozhi Qian: University of Cambridge, UK (EMAIL). Mihaela van der Schaar: University of Cambridge, UK; University of California, Los Angeles, USA; The Alan Turing Institute, UK (EMAIL).
Pseudocode | Yes | Figure 3: Illustrative Usage. "A prototypical structure of API calls for constructing a prediction pathway model. Clairvoyance is modularized to abide by established fit/transform/predict design patterns. (Green) ellipses denote additional configuration; further modules (treatments, sensing, uncertainty, etc.) expose similar interfaces."
Open Source Code | Yes | Python software repository: https://github.com/vanderschaarlab/clairvoyance
Open Datasets | Yes | Table 2: Medical Environments. "We consider the range of settings, incl. outpatient, general wards, and ICU data." Datasets: UKCF [80], WARDS [81], MIMIC [82].
Dataset Splits | Yes | "In all experiments, the entire dataset is first randomly partitioned into training sets (64%), validation sets (16%), and testing sets (20%). The training set is used for model training, the validation set is used for hyperparameter tuning, and the testing set is used for the final evaluation which generates the performance metrics."
Hardware Specification | Yes | "Our computations for the examples included in Section 4 were performed using a single NVIDIA GeForce GTX 1080 Ti GPU, and each experiment took approximately 24-72 hours."
Software Dependencies | No | The paper points to a Python software repository and mentions libraries such as pmdarima and sklearn, but it does not pin exact version numbers for these dependencies (e.g., "Python 3.x", "pmdarima x.y.z").
Experiment Setup | Yes | model_parameters = {'h_dim': 100, 'n_layer': 2, 'n_head': 2, 'batch_size': 128, 'epoch': 20, 'model_type': model_name, 'learning_rate': 0.001, 'static_mode': 'Concatenate', 'time_mode': 'Concatenate', 'verbose': True}
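The Figure 3 row above notes that Clairvoyance follows the established fit/transform/predict design pattern. The snippet below is a minimal sketch of that pattern in plain Python; the class and method names are illustrative toys, not the actual Clairvoyance modules or API.

```python
# Minimal sketch of the fit/transform/predict design pattern referenced
# in the Figure 3 row. The classes below are toy illustrations,
# NOT the actual Clairvoyance modules or their API.

class Imputer:
    """Toy transform step: replace missing values (None) with the training mean."""
    def fit(self, xs):
        observed = [x for x in xs if x is not None]
        self.mean_ = sum(observed) / len(observed)
        return self

    def transform(self, xs):
        return [self.mean_ if x is None else x for x in xs]


class MeanPredictor:
    """Toy prediction step: always predict the mean of the training targets."""
    def fit(self, xs, ys):
        self.mean_ = sum(ys) / len(ys)
        return self

    def predict(self, xs):
        return [self.mean_] * len(xs)


# Chained in the usual fit/transform/predict order:
imputer = Imputer().fit([1.0, None, 3.0])
x_clean = imputer.transform([None, 5.0])      # -> [2.0, 5.0]
model = MeanPredictor().fit(x_clean, [0.0, 1.0])
preds = model.predict(x_clean)                # -> [0.5, 0.5]
```

Fitted state lives on the object (here the trailing-underscore attribute, in the sklearn convention), which is what lets heterogeneous modules be composed into a single pathway.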
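The Dataset Splits row describes a 64% / 16% / 20% train/validation/test partition. The paper does not publish the splitting code itself, so the function below is only a hedged sketch of one way to produce such a partition.

```python
# Hedged sketch of the 64% / 16% / 20% random partition described in the
# Dataset Splits row; illustrative only, not the authors' actual code.
import random

def split_dataset(indices, seed=0):
    idx = list(indices)
    random.Random(seed).shuffle(idx)   # random partition, reproducible via seed
    n = len(idx)
    n_test = n * 20 // 100             # 20% held-out test set
    n_val = n * 16 // 100              # 16% validation set for hyperparameter tuning
    test = idx[:n_test]
    val = idx[n_test:n_test + n_val]
    train = idx[n_test + n_val:]       # remaining 64% used for training
    return train, val, test

train, val, test = split_dataset(range(100))
print(len(train), len(val), len(test))   # 64 16 20
```

Integer arithmetic (rather than float multiplication) keeps the split sizes exact for datasets whose length is a multiple of 100 and avoids off-by-one surprises from floating-point rounding.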