Learning to Select Best Forecast Tasks for Clinical Outcome Prediction
Authors: Yuan Xue, Nan Du, Anne Mottram, Martin Seneviratne, Andrew M. Dai
NeurIPS 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments on a real clinical dataset demonstrate the superior predictive performance of our method compared to direct supervised learning, naive pretraining and simple multitask learning, in particular in low-data scenarios when the primary task has very few examples. With detailed ablation analysis, we further show that the selection rules are interpretable and able to generalize to unseen target tasks with new data. |
| Researcher Affiliation | Industry | Yuan Xue*, Nan Du*, Anne Mottram, Martin Seneviratne, Andrew M. Dai Google {yuanxue, dunan, annemottram, martsen, adai}@google.com |
| Pseudocode | Yes | Algorithm 1: First-Order Automatic Task Selection (a hedged sketch of this pattern follows the table) |
| Open Source Code | Yes | https://github.com/google-health/records-research/meta-learn-forecast-task |
| Open Datasets | Yes | We evaluate our proposed algorithm, referred to as Auto Select, using the openly accessible MIMIC-III dataset [17] |
| Dataset Splits | Yes | For each fold, we split the dataset into train/validation/test according to 80%/10%/10% based on the hash value of the patient ID, and AUC-ROC is used as the evaluation metric by default with standard error reported in the parentheses next to it. (A deterministic hash-based split is sketched after the table.) |
| Hardware Specification | No | The paper states that "All models were implemented in TensorFlow", but it does not specify any hardware details such as GPU models, CPU types, or memory used for the experiments. |
| Software Dependencies | No | The paper mentions that "All models were implemented in TensorFlow [20]", but it does not provide a specific version number for TensorFlow or any other software libraries. |
| Experiment Setup | Yes | The sequence-to-sequence architecture of the trajectory forecast task uses an LSTM for both encoder and decoder, where the hidden state has a dimension of 70. ... we run approximately 5,000 steps during the pretraining stage, followed by 5 epochs for finetuning. ... The learning rates of all training loops were tuned and are 0.001 for supervised learning, 0.005 for self-supervised learning, 0.01 for λ hyper-gradient update. (A minimal sketch of this architecture follows the table.) |
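
The "Pseudocode" row names Algorithm 1 but the table does not reproduce it. Below is a minimal, hypothetical TensorFlow 2 sketch of the general first-order pattern such an algorithm follows: a shared encoder is pretrained on a weighted mixture of self-supervised forecast losses, and the task weights λ are updated with a first-order hyper-gradient derived from the primary (clinical outcome) validation loss. The shapes, toy data, softmax weighting, and gradient-alignment update are assumptions for illustration; only the learning rates (0.005 for self-supervised learning, 0.01 for the λ update) come from the paper, and this is not the authors' Algorithm 1.

```python
# Hypothetical sketch of first-order automatic task selection (NOT the authors' Algorithm 1).
# A shared LSTM encoder is pretrained on a weighted mixture of self-supervised forecast
# losses; the task weights `lam` are then nudged toward tasks whose gradients align with
# the primary-task validation gradient (a standard first-order hyper-gradient approximation).
import tensorflow as tf

NUM_TASKS, HIDDEN, FEATURES, STEPS = 4, 70, 16, 24   # shapes are assumptions

encoder = tf.keras.layers.LSTM(HIDDEN)                                    # shared representation
task_heads = [tf.keras.layers.Dense(FEATURES) for _ in range(NUM_TASKS)]  # forecast heads
primary_head = tf.keras.layers.Dense(1)                                   # clinical-outcome head

lam = tf.Variable(tf.zeros(NUM_TASKS))               # task-selection logits
opt_theta = tf.keras.optimizers.SGD(0.005)           # self-supervised lr reported in the paper
opt_lam = tf.keras.optimizers.SGD(0.01)              # lambda hyper-gradient lr reported in the paper


def task_losses(x, targets):
    """Per-task forecast losses computed on top of the shared encoder."""
    h = encoder(x)
    return tf.stack([tf.reduce_mean(tf.square(task_heads[k](h) - targets[k]))
                     for k in range(NUM_TASKS)])


def primary_loss(x, y):
    """Binary clinical-outcome loss used as the validation signal for task selection."""
    logits = primary_head(encoder(x))
    return tf.reduce_mean(tf.keras.losses.binary_crossentropy(y, logits, from_logits=True))


def pretrain_step(x_train, targets, x_val, y_val):
    # (1) update the shared parameters on the lambda-weighted mixture of forecast losses
    weights = tf.nn.softmax(lam)
    with tf.GradientTape() as tape:
        mixture = tf.reduce_sum(weights * task_losses(x_train, targets))
    theta = encoder.trainable_variables + [v for h in task_heads for v in h.trainable_variables]
    opt_theta.apply_gradients(zip(tape.gradient(mixture, theta), theta))

    # (2) first-order hyper-gradient: increase lam[k] when task k's gradient on the shared
    #     encoder aligns with the primary-task validation gradient
    with tf.GradientTape() as tape:
        val_loss = primary_loss(x_val, y_val)
    g_val = tape.gradient(val_loss, encoder.trainable_variables)

    alignment = []
    for k in range(NUM_TASKS):
        with tf.GradientTape() as tape:
            loss_k = task_losses(x_train, targets)[k]
        g_k = tape.gradient(loss_k, encoder.trainable_variables)
        alignment.append(tf.add_n([tf.reduce_sum(a * b) for a, b in zip(g_val, g_k)]))
    opt_lam.apply_gradients([(-tf.stack(alignment), lam)])  # descend -alignment, i.e. favour aligned tasks


# Toy usage with random data (all shapes are assumptions).
x_tr = tf.random.normal([32, STEPS, FEATURES])
tgts = [tf.random.normal([32, FEATURES]) for _ in range(NUM_TASKS)]
x_va = tf.random.normal([32, STEPS, FEATURES])
y_va = tf.cast(tf.random.uniform([32, 1]) > 0.5, tf.float32)
pretrain_step(x_tr, tgts, x_va, y_va)
print("task weights:", tf.nn.softmax(lam).numpy())
```

The alignment update reflects the usual first-order approximation in which a one-step look-ahead makes the validation gradient of λ proportional to the inner product between each task's gradient and the primary-task gradient; the paper's exact update should be taken from Algorithm 1 itself.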
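
The "Dataset Splits" row says train/validation/test were assigned 80%/10%/10% based on the hash value of the patient ID, which keeps all of a patient's records in one partition. The paper does not name the hash function, so the MD5-based bucketing below is an assumption, chosen only because it is deterministic across processes (Python's built-in hash() is salted per run).

```python
import hashlib

def split_for_patient(patient_id: str) -> str:
    """Deterministically assign a patient to train/validation/test from a hash of the ID."""
    bucket = int(hashlib.md5(patient_id.encode("utf-8")).hexdigest(), 16) % 100
    if bucket < 80:          # 80% of patients
        return "train"
    if bucket < 90:          # next 10%
        return "validation"
    return "test"            # remaining 10%

# Every record of a given patient lands in the same partition.
print(split_for_patient("10006"))
```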
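
The "Experiment Setup" row quotes a sequence-to-sequence trajectory-forecast model with an LSTM encoder and decoder whose hidden state has dimension 70. The Keras sketch below reflects only those quoted facts plus the 0.005 self-supervised learning rate; the feature count, the input/output horizons, the teacher-forced decoder input, and the choice of Adam are assumptions for illustration.

```python
import tensorflow as tf

HIDDEN = 70                                   # hidden-state size reported in the paper
FEATURES, ENC_STEPS, DEC_STEPS = 16, 24, 6    # assumed feature count and horizons

# Encoder: summarize the observed trajectory into the final LSTM state.
enc_in = tf.keras.Input(shape=(ENC_STEPS, FEATURES), name="observed_trajectory")
_, state_h, state_c = tf.keras.layers.LSTM(HIDDEN, return_state=True)(enc_in)

# Decoder: roll out the forecast, initialized from the encoder state (teacher forcing assumed).
dec_in = tf.keras.Input(shape=(DEC_STEPS, FEATURES), name="decoder_input")
dec_seq = tf.keras.layers.LSTM(HIDDEN, return_sequences=True)(
    dec_in, initial_state=[state_h, state_c])
forecast = tf.keras.layers.Dense(FEATURES, name="forecast")(dec_seq)

model = tf.keras.Model([enc_in, dec_in], forecast)
model.compile(optimizer=tf.keras.optimizers.Adam(0.005),  # 0.005 lr from the paper; Adam is an assumption
              loss="mse")
model.summary()
```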