Causal Transformer for Estimating Counterfactual Outcomes
Authors: Valentyn Melnychuk, Dennis Frauen, Stefan Feuerriegel
ICML 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate our Causal Transformer based on synthetic and real-world datasets, where it achieves superior performance over current baselines. To the best of our knowledge, this is the first work proposing transformer-based architecture for estimating counterfactual outcomes from longitudinal data. |
| Researcher Affiliation | Academia | 1 LMU Munich, Munich, Germany. Correspondence to: Valentyn Melnychuk <melnychuk@lmu.de>. |
| Pseudocode | Yes | Algorithm 1 Adversarial training in CT via iterative gradient descent |
| Open Source Code | Yes | Code is available online: https://github.com/Valentyn1997/CausalTransformer |
| Open Datasets | Yes | For this, we use the MIMIC-III dataset (Johnson et al., 2016). For each level of confounding γ, we simulate 10,000 patient trajectories for training, 1,000 for validation, and 1,000 trajectories for testing. |
| Dataset Splits | Yes | For each level of confounding γ, we simulate 10,000 patient trajectories for training, 1,000 for validation, and 1,000 trajectories for testing. We split the cohort of 1,000 patients into train/validation/test subsets via a 60%/20%/20% split. We then split the cohort of 5,000 patients with a ratio of 70%/15%/15% into train/validation/test subsets. |
| Hardware Specification | Yes | Experiments are carried out on 1 TITAN V GPU. |
| Software Dependencies | No | We implemented CT in PyTorch Lightning. We trained CT using Adam (Kingma & Ba, 2015)... |
| Experiment Setup | Yes | We trained CT using Adam (Kingma & Ba, 2015) with learning rate η and number of epochs ne. The dropout rate p was kept the same... For the parameters α and β of adversarial training, we choose values β = 0.99 and α = 0.01... We list the ranges of hyperparameter grids in Table 6. We report additional information on model-specific hyperparameters in Table 7... |
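The adversarial-training setup quoted above (Algorithm 1: iterative gradient descent with weights β = 0.99 on the factual outcome loss and α = 0.01 on the adversarial balancing loss) can be illustrated with a minimal sketch. This is not the authors' implementation: the toy linear model, the manual gradients, and all variable names (`w`, `v`, `u`, the scalar data point) are illustrative assumptions. It only shows the iterative pattern: the representation and outcome head descend on β·L_outcome − α·L_adversary (gradient reversal), while the adversary separately descends on its own loss.

```python
# Toy sketch of iterative adversarial gradient descent (Algorithm 1 pattern).
# Representation r = w * x; outcome head predicts y via v * r;
# adversary predicts treatment a via u * r. All values are illustrative.
x, y, a = 1.0, 2.0, 1.0          # single toy data point (assumption)
w, v, u = 0.5, 0.5, 0.5          # representation, outcome head, adversary
lr, alpha, beta = 0.05, 0.01, 0.99  # alpha/beta as reported in the paper

def outcome_loss(w, v):
    return (v * w * x - y) ** 2

init_loss = outcome_loss(w, v)
for _ in range(200):
    r = w * x
    err_y = v * r - y            # outcome residual
    err_a = u * r - a            # adversary residual
    # Manual gradients of the two squared losses.
    d_v = 2 * err_y * r
    d_w_y = 2 * err_y * v * x
    d_u = 2 * err_a * r
    d_w_a = 2 * err_a * u * x
    # Representation descends on beta*L_y - alpha*L_a (gradient reversal);
    # outcome head descends on beta*L_y; adversary descends on its own L_a.
    w -= lr * (beta * d_w_y - alpha * d_w_a)
    v -= lr * beta * d_v
    u -= lr * d_u

final_loss = outcome_loss(w, v)
```

After training, the factual outcome loss should have dropped substantially while the adversary keeps chasing the (deliberately uninformative) representation; in the real model the same min–max loop runs over transformer parameters with Adam rather than plain gradient steps.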