Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Estimating counterfactual treatment outcomes over time through adversarially balanced representations

Authors: Ioana Bica, Ahmed M Alaa, James Jordon, Mihaela van der Schaar

ICLR 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | In our experiments, we evaluate CRN in a realistic set-up using a model of tumour growth (Geng et al., 2017). We show that CRN achieves better performance in predicting counterfactual outcomes, but also in choosing the right treatment and timing of treatment than current state-of-the-art methods.
Researcher Affiliation | Academia | Ioana Bica: Department of Engineering Science, University of Oxford, Oxford, UK; The Alan Turing Institute, London, UK. Ahmed M. Alaa: Department of Electrical Engineering, University of California, Los Angeles, USA. James Jordon: Department of Engineering Science, University of Oxford, Oxford, UK. Mihaela van der Schaar: University of Cambridge, Cambridge, UK; University of California, Los Angeles, USA; The Alan Turing Institute, London, UK.
Pseudocode | Yes | The pseudocode in Algorithm 1 shows the training procedure used for the encoder and decoder networks part of CRN.
Open Source Code | Yes | The implementation of the model can be found at https://bitbucket.org/mvdschaar/mlforhealthlabpub/src/master/alg/counterfactual_recurrent_network/ and at https://github.com/ioanabica/Counterfactual-Recurrent-Network.
Open Datasets | Yes | To validate the CRN, we evaluate it on a Pharmacokinetic-Pharmacodynamic model of tumour growth (Geng et al., 2017), which uses a state-of-the-art bio-mathematical model to simulate the combined effects of chemotherapy and radiotherapy in lung cancer patients. ... Using the Medical Information Mart for Intensive Care (MIMIC III) (Johnson et al., 2016) database consisting of electronic health records from patients in the ICU...
Dataset Splits | Yes | For each γ we simulate 10000 patients for training, 1000 for validation (hyperparameter tuning) and 1000 for out-of-sample testing.
Hardware Specification | Yes | The model was implemented in TensorFlow and trained on an NVIDIA Tesla K80 GPU.
Software Dependencies | No | The paper states 'The model was implemented in TensorFlow' but does not provide a specific version number for TensorFlow or any other software dependencies.
Experiment Setup | Yes | Table 5 shows the hyperparameter search ranges for the encoder and decoder networks in CRN. We selected hyperparameters based on the error of the model on the factual outcomes in the validation dataset. All models are trained for 100 epochs.
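
The split sizes and the validation-based hyperparameter selection quoted above can be illustrated with a minimal sketch. This is not the authors' code: the function names, the candidate settings, and the toy error values are hypothetical, and drawing the three sets from one shuffled pool is an assumption made only for illustration (the quoted text says the patients are simulated, and does not say how they are assigned to sets).

```python
import random

# Split sizes quoted in the Dataset Splits row.
N_TRAIN, N_VAL, N_TEST = 10000, 1000, 1000


def split_indices(n_total, seed=0):
    """Shuffle patient indices and cut them into train/validation/test.

    Assumption: all patients come from one pool of at least 12000.
    """
    if n_total < N_TRAIN + N_VAL + N_TEST:
        raise ValueError("not enough patients for the requested split")
    idx = list(range(n_total))
    random.Random(seed).shuffle(idx)
    train = idx[:N_TRAIN]
    val = idx[N_TRAIN:N_TRAIN + N_VAL]
    test = idx[N_TRAIN + N_VAL:N_TRAIN + N_VAL + N_TEST]
    return train, val, test


def select_hyperparameters(candidates, validation_error):
    """Return the candidate with the lowest validation error.

    `validation_error` is a hypothetical callable standing in for
    training a model and scoring it on factual validation outcomes.
    """
    return min(candidates, key=validation_error)


# Hypothetical candidates and errors; the real search ranges are in
# the paper's Table 5 and are not reproduced here.
candidates = [
    {"hidden_units": 32, "learning_rate": 0.01},
    {"hidden_units": 64, "learning_rate": 0.001},
]
toy_errors = {32: 0.30, 64: 0.25}
best = select_hyperparameters(
    candidates, lambda c: toy_errors[c["hidden_units"]]
)
```

In a real run, the callable passed as `validation_error` would train a model with the given settings and return its error on the factual outcomes of the validation patients, so that the test set stays untouched until the final evaluation.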