High Fidelity Image Counterfactuals with Probabilistic Causal Models
Authors: Fabio De Sousa Ribeiro, Tian Xia, Miguel Monteiro, Nick Pawlowski, Ben Glocker
ICML 2023 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our experiments demonstrate that our proposed mechanisms are capable of accurate abduction and estimation of direct, indirect and total effects as measured by axiomatic soundness of counterfactuals. We present 3 case studies on counterfactual inference of high-dimensional structured variables. To quantitatively evaluate our deep SCMs, we measure effectiveness and composition, which are axiomatic properties of counterfactuals that hold true in all causal models (Pearl, 2009; Monteiro et al., 2023). (See the axiom-check sketch after this table.) |
| Researcher Affiliation | Collaboration | 1Imperial College London 2Microsoft Research Cambridge, UK. |
| Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks. It describes methods in prose. |
| Open Source Code | Yes | We present 3 case studies on counterfactual inference of high-dimensional structured variables. Code: https://github.com/biomedia-mira/causal-gen |
| Open Datasets | Yes | For our Morpho-MNIST experiments, we construct a similar scenario to Pawlowski et al. (2020) using the Morpho-MNIST (Castro et al., 2019) dataset. We randomly split the full dataset into subsets of 19,466 training, 3,500 validation and 3,500 test samples. We further extend the proposed approach to the MIMIC-CXR dataset (Johnson et al., 2019). Finally, we split the dataset into 62,336 subjects for training, 9,968 for validation and 30,535 for testing. |
| Dataset Splits | Yes | We randomly split the full dataset into subsets of 19,466 training, 3,500 validation and 3,500 test samples. We further ensure no overlapping subjects between the training and evaluation datasets exist. Finally, we split the dataset into 62,336 subjects for training, 9,968 for validation and 30,535 for testing. (A hypothetical split sketch follows the table.) |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU models, CPU types, or memory) used for running the experiments. |
| Software Dependencies | No | The paper mentions "Pyro and Pytorch" and "Torchvision", but does not specify their version numbers. |
| Experiment Setup | Yes | We trained our HVAEs for 1M steps using a batch size of 32 and the AdamW optimizer (Loshchilov & Hutter, 2017). We used an initial learning rate of 1e-3 with 100 linear warmup steps, β1 = 0.9, β2 = 0.9 and a weight decay of 0.01. We set gradient clipping to 350 and a gradient update skipping threshold of 500 (based on L2 norm). For data augmentation, we applied zero-padding of 4 on all borders and randomly cropped to 32×32 resolution. Pixel intensities were rescaled to [−1, 1], and validation/test images were zero-padded to 32×32. (A minimal training-configuration sketch follows the table.) |
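The composition and effectiveness metrics quoted in the Research Type row follow the abduct-then-predict pattern of deep SCM counterfactual inference. The sketch below is a minimal illustration of how such checks are typically computed, not the authors' implementation: `scm.abduct`, `scm.predict` and `anti_causal_clf` are hypothetical interfaces, and the exact distance functions used in the paper may differ.

```python
import torch

# Hypothetical interfaces: `scm` exposes abduction and prediction of the image
# mechanism; `anti_causal_clf` predicts a parent attribute from an image.
# Neither name is taken from the causal-gen repository.

def composition(scm, x, parents, cycles=1):
    """Composition: re-applying the observed parents (a null intervention)
    should return the original image; report the mean L1 distance."""
    x_hat = x
    for _ in range(cycles):
        exo = scm.abduct(x_hat, parents)   # infer exogenous noise
        x_hat = scm.predict(exo, parents)  # forward pass with unchanged parents
    return (x - x_hat).abs().mean().item()

def effectiveness(scm, anti_causal_clf, x, parents, intervention):
    """Effectiveness: the counterfactual should exhibit the intervened value,
    as judged by an independently trained anti-causal predictor."""
    cf_parents = {**parents, **intervention}
    exo = scm.abduct(x, parents)
    x_cf = scm.predict(exo, cf_parents)
    pred = anti_causal_clf(x_cf)
    target = torch.as_tensor(list(intervention.values()))
    return (pred - target).abs().mean().item()
```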
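For the Morpho-MNIST split sizes quoted in the Dataset Splits row, a random split with the reported 19,466/3,500/3,500 sizes can be reproduced as below. This is a sketch under assumptions: the stand-in dataset and the seed are not taken from the paper.

```python
import torch
from torch.utils.data import TensorDataset, random_split

# Stand-in dataset with 26,466 items (19,466 + 3,500 + 3,500); substitute the
# actual Morpho-MNIST dataset object here.
morpho_mnist = TensorDataset(torch.zeros(26_466, 1, 32, 32))

# The seed is an assumption; the paper does not report one.
generator = torch.Generator().manual_seed(0)
train_set, val_set, test_set = random_split(
    morpho_mnist, [19_466, 3_500, 3_500], generator=generator
)
print(len(train_set), len(val_set), len(test_set))  # 19466 3500 3500
```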
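The Experiment Setup row reports concrete optimizer, warmup, clipping and augmentation settings. The following PyTorch sketch wires those reported numbers together, assuming a placeholder model and loss; it is not the authors' training code.

```python
import torch
from torch import nn
from torch.optim.lr_scheduler import LambdaLR
from torchvision import transforms

# Placeholder model standing in for the paper's HVAE.
model = nn.Sequential(nn.Flatten(), nn.Linear(32 * 32, 32 * 32))

# AdamW with lr 1e-3, betas (0.9, 0.9), weight decay 0.01, as reported.
optimizer = torch.optim.AdamW(
    model.parameters(), lr=1e-3, betas=(0.9, 0.9), weight_decay=0.01
)
# 100-step linear warmup, then a constant learning rate.
warmup = LambdaLR(optimizer, lr_lambda=lambda step: min(1.0, (step + 1) / 100))

# Training-time augmentation: zero-pad 4 px on all borders, random 32x32 crop,
# then rescale intensities from [0, 1] to [-1, 1].
train_transform = transforms.Compose([
    transforms.Pad(4, fill=0),
    transforms.RandomCrop(32),
    transforms.ToTensor(),
    transforms.Lambda(lambda x: x * 2.0 - 1.0),
])

def training_step(x):
    loss = model(x).pow(2).mean()  # placeholder for the HVAE objective
    optimizer.zero_grad(set_to_none=True)
    loss.backward()
    # Clip gradients to an L2 norm of 350; skip the update entirely if the
    # pre-clipping norm exceeds the 500 skipping threshold.
    grad_norm = nn.utils.clip_grad_norm_(model.parameters(), max_norm=350.0)
    if grad_norm < 500.0:
        optimizer.step()
    warmup.step()
    return loss.item()
```

Batches of size 32 drawn from a `DataLoader` using `train_transform` would then be fed to `training_step` for the reported 1M steps.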