Causal Transformer for Estimating Counterfactual Outcomes

Authors: Valentyn Melnychuk, Dennis Frauen, Stefan Feuerriegel

ICML 2022

Reproducibility variables, each with the assessed result and the supporting quote from the paper:

Research Type: Experimental
"We evaluate our Causal Transformer based on synthetic and real-world datasets, where it achieves superior performance over current baselines. To the best of our knowledge, this is the first work proposing a transformer-based architecture for estimating counterfactual outcomes from longitudinal data."

Researcher Affiliation: Academia
"LMU Munich, Munich, Germany. Correspondence to: Valentyn Melnychuk <melnychuk@lmu.de>."

Pseudocode: Yes
"Algorithm 1: Adversarial training in CT via iterative gradient descent." A hedged sketch of this training scheme follows the table.

Open Source Code: Yes
"Code is available online: https://github.com/Valentyn1997/CausalTransformer"

Open Datasets: Yes
"For this, we use the MIMIC-III dataset (Johnson et al., 2016). For each level of confounding γ, we simulate 10,000 patient trajectories for training, 1,000 for validation, and 1,000 trajectories for testing."

Dataset Splits: Yes
"For each level of confounding γ, we simulate 10,000 patient trajectories for training, 1,000 for validation, and 1,000 trajectories for testing." "We split the cohort of 1,000 patients into train/validation/test subsets via a 60%/20%/20% split." "We then split the cohort of 5,000 patients with a ratio of 70%/15%/15% into train/validation/test subsets." An illustrative split is sketched after the table.

Hardware Specification: Yes
"Experiments are carried out on 1 TITAN V GPU."

Software Dependencies: No
"We implemented CT in PyTorch Lightning. We trained CT using Adam (Kingma & Ba, 2015)..." Frameworks are named, but no versioned dependency list is given, hence the "No".

Experiment Setup: Yes
"We trained CT using Adam (Kingma & Ba, 2015) with learning rate η and number of epochs n_e. The dropout rate p was kept the same... For the parameters α and β of adversarial training, we choose values β = 0.99 and α = 0.01... We list the ranges of hyperparameter grids in Table 6. We report additional information on model-specific hyperparameters in Table 7..." A minimal configuration sketch follows the table.
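
Algorithm 1 itself is not reproduced on this page. As a rough illustration of what adversarial training via iterative gradient descent can look like, here is a minimal PyTorch sketch: it alternates between (i) fitting a treatment classifier on the learned representation and (ii) fitting the factual outcome while pushing the representation toward treatment invariance via a uniform "confusion" target. All names (encoder, outcome_head, treatment_clf, opt_main, opt_adv) and the exact loss form are assumptions for illustration, not the authors' Algorithm 1.

```python
# Hedged sketch of iterative adversarial training (not the authors'
# Algorithm 1; module names and loss form are assumptions).
import torch
import torch.nn.functional as F

def train_step(encoder, outcome_head, treatment_clf,
               opt_main, opt_adv, batch, alpha=0.01):
    x, a, y = batch  # covariate history, treatment (class index), outcome

    # Step 1: adversary update. Train the treatment classifier to
    # predict the observed treatment from the (detached) representation.
    with torch.no_grad():
        phi = encoder(x)
    adv_loss = F.cross_entropy(treatment_clf(phi), a)
    opt_adv.zero_grad()
    adv_loss.backward()
    opt_adv.step()

    # Step 2: main update. Fit the factual outcome and add a confusion
    # term (cross-entropy against a uniform distribution over
    # treatments) weighted by alpha, so the representation carries
    # little information about the assigned treatment.
    phi = encoder(x)
    outcome_loss = F.mse_loss(outcome_head(phi, a), y)
    log_probs = F.log_softmax(treatment_clf(phi), dim=-1)
    confusion_loss = -log_probs.mean()  # CE vs. uniform target
    main_loss = outcome_loss + alpha * confusion_loss
    opt_main.zero_grad()
    main_loss.backward()  # opt_main covers encoder + outcome_head only;
    opt_main.step()       # stray classifier grads are zeroed in Step 1
    return outcome_loss.item(), adv_loss.item()
```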
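
The quoted splits are patient-level ratios. A hypothetical NumPy sketch of the 60%/20%/20% split of the 1,000-patient cohort (illustrative only, not the authors' preprocessing code):

```python
# Hypothetical 60%/20%/20% patient-level split; illustrative only.
import numpy as np

rng = np.random.default_rng(seed=0)
patient_ids = rng.permutation(1_000)
n_train, n_val = 600, 200  # 60% / 20% of 1,000 patients
train_ids = patient_ids[:n_train]
val_ids = patient_ids[n_train:n_train + n_val]
test_ids = patient_ids[n_train + n_val:]  # remaining 20%
```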
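
For the training configuration, the paper reports Adam with learning rate η, a dropout rate p, and adversarial parameters α = 0.01 and β = 0.99. Reading β as the smoothing coefficient of an exponential moving average (EMA) over model weights used to stabilize adversarial training, which is an assumption of this sketch, a minimal setup could look as follows; the stand-in network and the concrete values of η and p are placeholders, not values from Tables 6 or 7.

```python
# Minimal sketch: Adam + EMA of model weights (beta read as the EMA
# smoothing coefficient; the network, eta, and p are placeholders).
import copy
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(16, 32), nn.ReLU(),
                      nn.Dropout(p=0.1),  # dropout rate p (placeholder)
                      nn.Linear(32, 1))
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)  # lr = eta (placeholder)

# Frozen EMA copy of the model, updated after every optimizer step.
ema_model = copy.deepcopy(model)
for param in ema_model.parameters():
    param.requires_grad_(False)

@torch.no_grad()
def ema_update(ema, live, beta=0.99):
    # p_ema <- beta * p_ema + (1 - beta) * p_live
    for p_ema, p_live in zip(ema.parameters(), live.parameters()):
        p_ema.mul_(beta).add_(p_live, alpha=1.0 - beta)
```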