Causal Inference Through the Structural Causal Marginal Problem
Authors: Luigi Gresele, Julius Von Kügelgen, Jonas Kübler, Elke Kirschbaum, Bernhard Schölkopf, Dominik Janzing
ICML 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | 4. Experiments. In Fig. 3 (a) we visualise the worked-out example from §3.3. Recall that there the interventional Y → Z model uniquely determines MY, and Λ0 is therefore a segment with λX ∈ [0, 0.5] and y-coordinate fixed by θ = P(Y = 1). We take 20 linearly-spaced θ ∈ (0.5, 1) and plot both Λ0 (thick red segments) and the reduced range [LBX, UBX] (superimposed, thin blue lines). Decreasing θ from 1 (top line at λY = 0.5) to 0.5 (bottom line at λY = 0) yields an increase in LBX, thereby restricting the range of allowed MX models and fully specifying it for θ = 0.5 when Z := Y. Extending the analytical treatment of §3.3 to more general settings is nontrivial. To characterise the entailed constraints in generic settings, we therefore resort to numerical simulations (see Appx. G for all technical details): we generate random instances of consistent PXZ and PYZ, compute the space of solutions ΛC, and compare it to Λ0. A specific instance is shown in Fig. 3 (b); see [GIF1] [GIF2] for additional visualisations, where we fix a conditional PZ\|XY and plot ΛC and Λ0 for different choices of PX, PY. The parameters used to generate Fig. 3 (b) violate some of the restrictive assumptions of §3.3 (most notably X ⊥ Z and P(Y = 0, Z = 1) = 0), and show that the schematic visualisation in Fig. 2 captures some aspects of the general case: the structural causal marginal problem can yield constraints for both marginal SCMs MX and MY, and Λ0 and ΛC are different. Moreover, we see that (λX^max, λY^max) ∈ ΛC, consistent with Prop. 4. In Fig. 3 (c), we plot the cumulative distribution functions (CDFs) of the ratios between the blue and red areas (i.e., (UBX − LBX)(UBY − LBY)/\|Λ0\|) in blue, and the ratio between the green and red areas (i.e., \|ΛC\|/\|Λ0\|) in green. The CDFs are estimated based on 1,000 independent samples of P(Z = 1\|X = i, Y = j) ∼ Beta(α, β) for i, j ∈ {0, 1} and P(X = 1), P(Y = 1) ∼ U[0, 1]. We compare two scenarios: α = β = 1, i.e., a Uniform prior, shown as solid lines; and α = β = 0.5, leading to more deterministic conditionals, shown as dashed lines. Across both scenarios, a reduction (i.e., ratios smaller than one) can be observed at least 30% of the time. Whereas many times there is no or only a small reduction, we also sometimes (with positive probability) observe quite substantial reductions of 50+%. Moreover, we find that α = β = 0.5 leads to larger reductions, suggesting that more deterministic (joint) conditionals may impose stronger constraints. Finally, we remark that (λX^max, λY^max) ∈ ΛC indeed holds across all runs. |
| Researcher Affiliation | Collaboration | *Equal contribution. ¹Max Planck Institute for Intelligent Systems, Tübingen, Germany; ²University of Cambridge, Cambridge, United Kingdom; ³Amazon Research, Tübingen, Germany. Correspondence to: Luigi Gresele <luigi.gresele@tue.mpg.de>. |
| Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | Software and Data: Code is available at https://github.com/lgresele/structural-causal-marginal. |
| Open Datasets | No | To characterise the entailed constraints in generic settings, we therefore resort to numerical simulations (see Appx. G for all technical details): we generate random instances of consistent PXZ and PYZ, compute the space of solutions ΛC, and compare it to Λ0. |
| Dataset Splits | No | The paper describes numerical simulations and theoretical analysis of causal models, not the training and evaluation of models using standard train/validation/test splits from a dataset. |
| Hardware Specification | No | The paper does not specify any hardware details (e.g., GPU/CPU models, memory) used for running the experiments. |
| Software Dependencies | Yes | We use the pypoman package (Caron, 2018) to compute those vertices of the polyhedron. After that we can project each vertex into the (λX, λY)-plane. ... which we compute using scipy (Virtanen et al., 2020). |
| Experiment Setup | Yes | We take 20 linearly-spaced θ ∈ (0.5, 1) and plot both Λ0 (thick red segments) and the reduced range [LBX, UBX] (superimposed, thin blue lines). The CDFs are estimated based on 1,000 independent samples of P(Z = 1\|X = i, Y = j) ∼ Beta(α, β) for i, j ∈ {0, 1} and P(X = 1), P(Y = 1) ∼ U[0, 1]. |
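
The sampling scheme quoted in the Experiment Setup row can be sketched roughly as follows. This is a minimal illustration and not the authors' code: the function names `sample_instance` and `marginals`, the seed, and the array layout are hypothetical, and the sketch assumes that X and Y are independent in the joint model, which the quoted text does not state explicitly. It only draws random instances and derives the pairwise marginals over (X, Z) and (Y, Z); it does not compute ΛC or Λ0.

```python
import numpy as np


def sample_instance(rng, alpha=1.0, beta=1.0):
    """Draw one random model over binary X, Y, Z (hypothetical helper)."""
    # p_z[i, j] = P(Z=1 | X=i, Y=j), each entry drawn from Beta(alpha, beta)
    p_z = rng.beta(alpha, beta, size=(2, 2))
    # P(X=1) and P(Y=1) are drawn uniformly from [0, 1]
    p_x1 = rng.uniform(0.0, 1.0)
    p_y1 = rng.uniform(0.0, 1.0)
    return p_z, p_x1, p_y1


def marginals(p_z, p_x1, p_y1):
    """Return P(X, Z) and P(Y, Z) as 2x2 arrays.

    ASSUMPTION: X and Y are independent, so P(X, Y) = P(X) P(Y).
    """
    p_x = np.array([1.0 - p_x1, p_x1])
    p_y = np.array([1.0 - p_y1, p_y1])
    # p_z_given[i, j, k] = P(Z=k | X=i, Y=j)
    p_z_given = np.stack([1.0 - p_z, p_z], axis=-1)
    # joint[i, j, k] = P(X=i, Y=j, Z=k)
    joint = p_x[:, None, None] * p_y[None, :, None] * p_z_given
    p_xz = joint.sum(axis=1)  # marginalise out Y
    p_yz = joint.sum(axis=0)  # marginalise out X
    return p_xz, p_yz


rng = np.random.default_rng(0)  # seed chosen arbitrarily for this sketch
# 1,000 samples with alpha = beta = 0.5 (the "more deterministic" scenario)
instances = [sample_instance(rng, alpha=0.5, beta=0.5) for _ in range(1000)]
# 20 linearly spaced theta values; including both endpoints is an assumption
thetas = np.linspace(0.5, 1.0, 20)
```

Because both marginals are derived from the same joint distribution, they agree on P(Z) by construction, which is the sense in which the sampled PXZ and PYZ are consistent.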