Diffusion Models for Causal Discovery via Topological Ordering
Authors: Pedro Sanchez, Xiao Liu, Alison Q O'Neil, Sotirios A. Tsaftaris
ICLR 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We show empirically that our method scales exceptionally well to datasets with up to 500 nodes and up to 10^5 samples while still performing on par over small datasets with state-of-the-art causal discovery methods. |
| Researcher Affiliation | Collaboration | ¹The University of Edinburgh, ²Canon Medical Research Europe, ³The Alan Turing Institute |
| Pseudocode | Yes | Algorithm 1: Topological Ordering with DiffAN |
| Open Source Code | Yes | Implementation is available at https://github.com/vios-s/DiffAN. |
| Open Datasets | Yes | We consider two real datasets: (i) Sachs: A protein signaling network based on expression levels of proteins and phospholipids (Sachs et al., 2005). We consider only the observational data (n = 853 samples) since our method targets discovery of causal mechanisms when only observational data is available. The ground truth causal graph given by Sachs et al. (2005) has 11 nodes and 17 edges. (ii) SynTReN: We also evaluate the models on a pseudo-real dataset sampled from the SynTReN generator (Van den Bulcke et al., 2006). |
| Dataset Splits | No | The paper mentions training data and a subsample but does not explicitly provide percentages or counts for train/validation/test splits, nor does it refer to a standard split by citation for the synthetic data. For real data, it states 'n = 853 samples' for Sachs but no split information. |
| Hardware Specification | No | The paper mentions "64GB of RAM" in the context of comparing its method's scalability to SCORE, indicating a limitation of other methods on a specific machine. However, it does not specify the hardware (CPU, GPU, specific RAM configuration) used for its own experiments. |
| Software Dependencies | No | The paper describes the neural network architecture (MLP with Linear layers, Leaky ReLU, Layer Norm, Dropout) and mentions using "functorch" for auto-differentiation, but it does not specify version numbers for any software, libraries (like PyTorch or TensorFlow), or programming languages. |
| Experiment Setup | Yes | D.1 HYPERPARAMETERS OF DPM TRAINING: We use T = 100 time steps; β_t is linearly scheduled between β_min = 0.0001 and β_max = 0.02. The model is trained according to Equation 3, which follows Ho et al. (2020). During sampling, t is sampled from a Uniform distribution. D.2 NEURAL ARCHITECTURE: The neural network follows a simple MLP with 5 Linear layers, LeakyReLU activation function, Layer Normalization and Dropout in the first layer. The full architecture is detailed in Table 1. |
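
For concreteness, here is a minimal sketch of the DPM training setup described in D.1 above (T = 100 steps, linear β schedule from 0.0001 to 0.02, and the noise-prediction objective of Ho et al. (2020) with t drawn uniformly). This is not the released DiffAN code; the `model(x_t, t)` interface and the (batch, nodes) tensor shapes are assumptions for illustration.

```python
import torch

# Hyperparameters from D.1: T = 100 steps, linear beta schedule in [1e-4, 0.02].
T = 100
betas = torch.linspace(1e-4, 0.02, T)
alphas_bar = torch.cumprod(1.0 - betas, dim=0)  # \bar{alpha}_t for closed-form noising

def dpm_loss(model, x0):
    """Noise-prediction objective of Ho et al. (2020): predict the injected noise."""
    t = torch.randint(0, T, (x0.shape[0],))                 # t ~ Uniform{0, ..., T-1}
    noise = torch.randn_like(x0)
    a_bar = alphas_bar[t].unsqueeze(-1)                     # broadcast over the node dimension
    x_t = a_bar.sqrt() * x0 + (1.0 - a_bar).sqrt() * noise  # closed-form forward noising
    return ((model(x_t, t) - noise) ** 2).mean()
```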
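
Similarly, a sketch of the D.2 architecture (5 Linear layers with LeakyReLU, Layer Normalization, and Dropout in the first layer only). The hidden width, dropout rate, and the way the time step is injected are illustrative guesses, since Table 1 of the paper is not reproduced here.

```python
import torch
import torch.nn as nn

class DenoisingMLP(nn.Module):
    """Sketch of the D.2 MLP: 5 Linear layers, LeakyReLU, LayerNorm, Dropout in layer 1.
    Hidden width and time-step conditioning are assumptions, not taken from Table 1."""

    def __init__(self, n_nodes: int, hidden: int = 1024, dropout: float = 0.2):
        super().__init__()
        self.first = nn.Sequential(                      # Linear layer 1 (with Dropout)
            nn.Linear(n_nodes + 1, hidden),              # +1 for the concatenated time step
            nn.LayerNorm(hidden),
            nn.LeakyReLU(),
            nn.Dropout(dropout),
        )
        self.body = nn.Sequential(                       # Linear layers 2-4
            nn.Linear(hidden, hidden), nn.LayerNorm(hidden), nn.LeakyReLU(),
            nn.Linear(hidden, hidden), nn.LayerNorm(hidden), nn.LeakyReLU(),
            nn.Linear(hidden, hidden), nn.LayerNorm(hidden), nn.LeakyReLU(),
        )
        self.out = nn.Linear(hidden, n_nodes)            # Linear layer 5: noise per node

    def forward(self, x_t, t):
        t_feat = (t.float() / 100.0).unsqueeze(-1)       # normalize time step by T = 100
        h = self.first(torch.cat([x_t, t_feat], dim=-1))
        return self.out(self.body(h))
```

Under these assumptions, a single training step on a batch `x_batch` of shape (batch, 11) for the Sachs data would be `loss = dpm_loss(DenoisingMLP(n_nodes=11), x_batch)` followed by a standard optimizer step.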