Causal Inference Through the Structural Causal Marginal Problem

Authors: Luigi Gresele, Julius Von Kügelgen, Jonas Kübler, Elke Kirschbaum, Bernhard Schölkopf, Dominik Janzing

ICML 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | 4. Experiments. In Fig. 3 (a) we visualise the worked-out example from § 3.3. Recall that there the interventional $Y \to Z$ model uniquely determines $M_Y$, and $\Lambda_0$ is therefore a segment with $\lambda_X \in [0, 0.5]$ and y-coordinate fixed by $\theta = P(Y = 1)$. We take 20 linearly-spaced $\theta \in (0.5, 1)$ and plot both $\Lambda_0$ (thick red segments) and the reduced range $[\mathrm{LB}_X, \mathrm{UB}_X]$ (superimposed, thin blue lines). Decreasing $\theta$ from 1 (top line at $\lambda_Y = 0.5$) to 0.5 (bottom line at $\lambda_Y = 0$) yields an increase in $\mathrm{LB}_X$, thereby restricting the range of allowed $M_X$ models and fully specifying it for $\theta = 0.5$ when $Z := Y$. Extending the analytical treatment of § 3.3 to more general settings is nontrivial. To characterise the entailed constraints in generic settings, we therefore resort to numerical simulations (see Appx. G for all technical details): we generate random instances of consistent $P_{XZ}$ and $P_{YZ}$, compute the space of solutions $\Lambda_C$, and compare it to $\Lambda_0$. A specific instance is shown in Fig. 3 (b); see [GIF1] [GIF2] for additional visualisations, where we fix a conditional $P_{Z \mid XY}$ and plot $\Lambda_C$ and $\Lambda_0$ for different choices of $P_X$, $P_Y$. The parameters used to generate Fig. 3 (b) violate some of the restrictive assumptions of § 3.3 (most notably $X \perp\!\!\!\perp Z$ and $P(Y = 0, Z = 1) = 0$), and show that the schematic visualisation in Fig. 2 captures some aspects of the general case: the structural causal marginal problem can yield constraints for both marginal SCMs $M_X$ and $M_Y$, and $\Lambda_0$ and $\Lambda_C$ are different. Moreover, we see that $(\lambda_X^{\max}, \lambda_Y^{\max}) \in \Lambda_C$, consistent with Prop. 4. In Fig. 3 (c), we plot the cumulative distribution functions (CDFs) of the ratio between the blue and red areas (i.e., $(\mathrm{UB}_X - \mathrm{LB}_X)(\mathrm{UB}_Y - \mathrm{LB}_Y) / \lvert\Lambda_0\rvert$) in blue, and of the ratio between the green and red areas (i.e., $\lvert\Lambda_C\rvert / \lvert\Lambda_0\rvert$) in green. The CDFs are estimated based on 1,000 independent samples of $P(Z = 1 \mid X = i, Y = j) \sim \mathrm{Beta}(\alpha, \beta)$ for $i, j \in \{0, 1\}$ and $P(X = 1), P(Y = 1) \sim U[0, 1]$. We compare two scenarios: $\alpha = \beta = 1$, i.e., a Uniform prior, shown as solid lines; and $\alpha = \beta = 0.5$, leading to more deterministic conditionals, shown as dashed lines. Across both scenarios, a reduction (i.e., a ratio smaller than one) can be observed at least 30% of the time. Although in many cases there is no reduction or only a small one, we also sometimes (with positive probability) observe quite substantial reductions of 50% or more. Moreover, we find that $\alpha = \beta = 0.5$ leads to larger reductions, suggesting that more deterministic (joint) conditionals may impose stronger constraints. Finally, we remark that $(\lambda_X^{\max}, \lambda_Y^{\max}) \in \Lambda_C$ indeed holds across all runs.
Researcher Affiliation | Collaboration | *Equal contribution. (1) Max Planck Institute for Intelligent Systems, Tübingen, Germany; (2) University of Cambridge, Cambridge, United Kingdom; (3) Amazon Research, Tübingen, Germany. Correspondence to: Luigi Gresele <luigi.gresele@tue.mpg.de>.
Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks.
Open Source Code | Yes | Software and Data: Code is available at https://github.com/lgresele/structural-causal-marginal.
Open Datasets | No | To characterise the entailed constraints in generic settings, we therefore resort to numerical simulations (see Appx. G for all technical details): we generate random instances of consistent $P_{XZ}$ and $P_{YZ}$, compute the space of solutions $\Lambda_C$, and compare it to $\Lambda_0$.
Dataset Splits | No | The paper describes numerical simulations and theoretical analysis of causal models, not the training and evaluation of models using standard train/validation/test splits from a dataset.
Hardware Specification | No | The paper does not specify any hardware details (e.g., GPU/CPU models, memory) used for running the experiments.
Software Dependencies | Yes | We use the pypoman package (Caron, 2018) to compute those vertices of the polyhedron. After that we can project each vertex into the $(\lambda_X, \lambda_Y)$-plane. ... which we compute using scipy (Virtanen et al., 2020).
Experiment Setup | Yes | We take 20 linearly-spaced $\theta \in (0.5, 1)$ and plot both $\Lambda_0$ (thick red segments) and the reduced range $[\mathrm{LB}_X, \mathrm{UB}_X]$ (superimposed, thin blue lines). The CDFs are estimated based on 1,000 independent samples of $P(Z = 1 \mid X = i, Y = j) \sim \mathrm{Beta}(\alpha, \beta)$ for $i, j \in \{0, 1\}$ and $P(X = 1), P(Y = 1) \sim U[0, 1]$.
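
The sampling step quoted in the Experiment Setup row can be sketched as follows. This is a minimal illustration, not the authors' code: the function names `sample_instance` and `marginals` are hypothetical, the seed is arbitrary, and the sketch assumes X and Y are independent when forming the joint, which the excerpt does not state explicitly. It draws the conditional P(Z = 1 | X = i, Y = j) from Beta(α, β) and the priors from U[0, 1], then derives the two marginals P(X, Z) and P(Y, Z). Because both come from one joint, they agree on P(Z).

```python
import numpy as np


def sample_instance(alpha, beta, rng):
    """Draw P(Z=1 | X=i, Y=j) ~ Beta(alpha, beta) and P(X=1), P(Y=1) ~ U[0, 1]."""
    p_z1_given_xy = rng.beta(alpha, beta, size=(2, 2))  # indexed [i, j]
    p_x1, p_y1 = rng.uniform(0.0, 1.0, size=2)
    return p_z1_given_xy, p_x1, p_y1


def marginals(p_z1_given_xy, p_x1, p_y1):
    """Return P(X, Z) and P(Y, Z) as 2x2 arrays.

    ASSUMPTION (not stated in the excerpt): X and Y are independent.
    """
    p_x = np.array([1.0 - p_x1, p_x1])
    p_y = np.array([1.0 - p_y1, p_y1])
    p_xy = np.outer(p_x, p_y)  # joint over (x, y)
    # Stack P(Z=0 | x, y) and P(Z=1 | x, y) along a new last axis: [x, y, z].
    p_z_given_xy = np.stack([1.0 - p_z1_given_xy, p_z1_given_xy], axis=-1)
    p_xyz = p_xy[:, :, None] * p_z_given_xy
    # Sum out y for P(X, Z) and sum out x for P(Y, Z).
    return p_xyz.sum(axis=1), p_xyz.sum(axis=0)


rng = np.random.default_rng(0)  # arbitrary seed, for repeatability only
for alpha in (1.0, 0.5):  # the two scenarios: uniform prior, more deterministic
    instances = [sample_instance(alpha, alpha, rng) for _ in range(1000)]
    p_xz, p_yz = marginals(*instances[0])
```

Computing the solution sets $\Lambda_C$ and $\Lambda_0$ from these marginals is the part handled by pypoman and scipy in the paper and is not reproduced here.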