Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).
Causal Transformer for Estimating Counterfactual Outcomes
Authors: Valentyn Melnychuk, Dennis Frauen, Stefan Feuerriegel
ICML 2022 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate our Causal Transformer based on synthetic and real-world datasets, where it achieves superior performance over current baselines. To the best of our knowledge, this is the first work proposing transformer-based architecture for estimating counterfactual outcomes from longitudinal data. |
| Researcher Affiliation | Academia | LMU Munich, Munich, Germany. Correspondence to: Valentyn Melnychuk <EMAIL>. |
| Pseudocode | Yes | Algorithm 1 Adversarial training in CT via iterative gradient descent |
| Open Source Code | Yes | Code is available online: https://github.com/Valentyn1997/CausalTransformer |
| Open Datasets | Yes | For this, we use the MIMIC-III dataset (Johnson et al., 2016). For each level of confounding γ, we simulate 10,000 patient trajectories for training, 1,000 for validation, and 1,000 trajectories for testing. |
| Dataset Splits | Yes | For each level of confounding γ, we simulate 10,000 patient trajectories for training, 1,000 for validation, and 1,000 trajectories for testing. We split the cohort of 1,000 patients into train/validation/test subsets via a 60%/20%/20% split. We then split the cohort of 5,000 patients with a ratio of 70%/15%/15% into train/validation/test subsets. |
| Hardware Specification | Yes | Experiments are carried out on 1 TITAN V GPU. |
| Software Dependencies | No | We implemented CT in PyTorch Lightning. We trained CT using Adam (Kingma & Ba, 2015)... |
| Experiment Setup | Yes | We trained CT using Adam (Kingma & Ba, 2015) with learning rate η and number of epochs n_e. The dropout rate p was kept the same... For the parameters α and β of adversarial training, we choose values β = 0.99 and α = 0.01... We list the ranges of hyperparameter grids in Table 6. We report additional information on model-specific hyperparameters in Table 7... |
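
The split ratios quoted in the Dataset Splits row can be illustrated with a short sketch. This is not the authors' code: the function name `split_cohort`, the fixed seed, and the choice to shuffle patient indices before cutting are all assumptions made for illustration, since the quoted text gives only the cohort sizes and ratios.

```python
import random


def split_cohort(n_patients, ratios, seed=0):
    """Shuffle patient indices and cut them into train/validation/test subsets.

    Hypothetical helper: the shuffling strategy and seed are assumptions,
    not details taken from the paper.
    """
    assert abs(sum(ratios) - 1.0) < 1e-9, "ratios must sum to 1"
    indices = list(range(n_patients))
    random.Random(seed).shuffle(indices)  # local RNG, so the split is repeatable
    n_train = round(n_patients * ratios[0])
    n_val = round(n_patients * ratios[1])
    train = indices[:n_train]
    val = indices[n_train:n_train + n_val]
    test = indices[n_train + n_val:]  # remainder goes to the test subset
    return train, val, test


# Cohort sizes and ratios as quoted in the table above.
small = split_cohort(1000, (0.6, 0.2, 0.2))
large = split_cohort(5000, (0.7, 0.15, 0.15))
print([len(s) for s in small])
print([len(s) for s in large])
```

Under these assumptions the 1,000-patient cohort yields 600/200/200 patients and the 5,000-patient cohort yields 3,500/750/750. Splitting by patient index rather than by time step keeps each trajectory within a single subset.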