Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Diffusion Counterfactual Generation with Semantic Abduction
Authors: Rajat R Rasal, Avinash Kori, Fabio De Sousa Ribeiro, Tian Xia, Ben Glocker
ICML 2025 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We present three case studies using our mechanisms for counterfactual image generation. We begin with a toy scenario where we control the true causal data-generating process, and progressively scale up our mechanisms for causal face modelling and a novel medical artefact removal problem. We compare our mechanisms against VAE, HVAE and diffusion-based alternatives (Pawlowski et al., 2020; De Sousa Ribeiro et al., 2023; Wu et al., 2024; Sanchez & Tsaftaris, 2022) using counterfactual soundness metrics. Table 1 reports the counterfactual soundness results for simple DSCMs modelling only the mechanism d → x, assessed under random interventions do(d). |
| Researcher Affiliation | Academia | 1Department of Computing, Imperial College London, UK. |
| Pseudocode | Yes | Algorithm 1 Counterfactual Trajectory Alignment |
| Open Source Code | Yes | We present three case studies using our mechanisms for counterfactual image generation. https://github.com/RajatRasal/Diffusion-Counterfactuals |
| Open Datasets | Yes | Morpho-MNIST (Castro et al., 2019) dataset... CelebA-HQ (Karras, 2017)... EMory BrEast imaging Dataset (EMBED) (Jeong et al., 2022). |
| Dataset Splits | Yes | Table 4 (excerpt): training/validation/test splits — Morpho-MNIST: 50000/10000/10000; CMorpho-MNIST: 50000/10000/10000; CelebA: 162770/19867/19962; CelebA-HQ: 24000/3000/3000; EMBED: 13207/3300/5503. |
| Hardware Specification | Yes | dynamic semantic abduction requires 3 minutes per image, compared to 3 minutes and 3.5 minutes for the guided spatial and semantic mechanisms, respectively, using a batch size of 128 on an NVIDIA GeForce RTX 4090. |
| Software Dependencies | No | The paper includes "from torch import nn" and "from torchvision.models import resnet50", indicating the use of PyTorch and torchvision, and mentions the Adam (Kingma, 2014) optimiser. However, no version numbers are provided for these software components, which are necessary for reproducibility. |
| Experiment Setup | Yes | Table 4 (excerpt): batch size 128; epochs 1000; learning rate 1e-4; optimiser Adam (no weight decay); EMA decay factor 0.9999; training T = 1000; diffusion loss: MSE with noise prediction. |
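The Experiment Setup row lists two training details that are easy to get wrong when reproducing diffusion training: the EMA parameter average (decay 0.9999) and the MSE noise-prediction (epsilon-prediction) loss. The sketch below illustrates both in plain Python; it is a minimal illustration of the reported hyperparameters, not the authors' implementation (see their repository for the actual code), and the function names are our own.

```python
def ema_update(ema_params, params, decay=0.9999):
    """One EMA step: blend current weights into the running average.

    Table 4 reports an EMA decay factor of 0.9999, so each step keeps
    99.99% of the average and mixes in 0.01% of the latest weights.
    """
    return [decay * e + (1.0 - decay) * p for e, p in zip(ema_params, params)]

def noise_prediction_mse(predicted_noise, true_noise):
    """Diffusion loss: mean squared error between the network's predicted
    noise and the noise actually added to the image (epsilon-prediction)."""
    n = len(true_noise)
    return sum((p - t) ** 2 for p, t in zip(predicted_noise, true_noise)) / n

# Illustrative usage with scalar "parameters":
ema = ema_update([1.0], [0.0])        # -> [0.9999]
loss = noise_prediction_mse([0.5, -0.5], [0.0, 0.0])  # -> 0.25
```

In practice the EMA copy of the weights, not the raw training weights, is typically used for sampling, which is why the decay factor matters for reproducing reported image quality.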