On Measuring Causal Contributions via do-interventions

Authors: Yonghan Jung, Shiva Kasiviswanathan, Jin Tian, Dominik Janzing, Patrick Bloebaum, Elias Bareinboim

ICML 2022

Reproducibility variables, each with its result and the supporting LLM response:
Research Type: Experimental
LLM Response: "6. Experiments. In this section, we empirically compare the performance of the proposed do-Shapley estimators from the previous section. ... Experimental Setup. We use synthetic datasets based on Figs. (2a, 2b, 2c), where each figure matches the Thm. 2, Markovian, and Direct-cause cases, respectively. ... We assess the quality of the estimator by computing the L1 error as $L_1(\mathrm{est}, k) := \frac{1}{n}\sum_{i=1}^{n} \big| \hat{\phi}^{\mathrm{est}}_{v_i,k} - \phi_{v_i,k} \big|$. ... Experimental Results. The L1-error plots for all cases are presented in Fig. 3."
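For concreteness, here is a minimal sketch of the reported L1 criterion, assuming the estimated and ground-truth do-Shapley values for a single run k are held in equal-length arrays; the function name l1_error is illustrative, not from the paper:

```python
import numpy as np

def l1_error(phi_est, phi_true):
    """Mean absolute deviation between estimated and true do-Shapley
    values over the n variables, matching L1(est, k) above."""
    phi_est = np.asarray(phi_est, dtype=float)
    phi_true = np.asarray(phi_true, dtype=float)
    return float(np.mean(np.abs(phi_est - phi_true)))
```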
Researcher Affiliation: Collaboration
LLM Response: Yonghan Jung (Purdue University), Shiva Prasad Kasiviswanathan (Amazon), Jin Tian (Iowa State University), Dominik Janzing (Amazon), Patrick Blöbaum (Amazon), Elias Bareinboim (Columbia University).
Pseudocode: Yes
LLM Response: Algorithm 1: do-Shapley(M, T^est(·))
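The paper's Algorithm 1 is not reproduced here. As a rough illustration of the shape of such a routine only, the sketch below computes Shapley-style attributions by Monte Carlo permutation sampling against a supplied value function standing in for T^est(·); the names do_shapley, value_fn, and num_perms are hypothetical, and the actual algorithm's estimation details differ:

```python
import random

def do_shapley(variables, value_fn, num_perms=200, seed=0):
    """Permutation-sampling sketch of Shapley attribution.

    value_fn(S) stands in for an estimate of E[Y | do(S)] over a
    variable subset S (hypothetical interface, not the paper's)."""
    rng = random.Random(seed)
    phi = {v: 0.0 for v in variables}
    for _ in range(num_perms):
        perm = list(variables)
        rng.shuffle(perm)
        prefix = []
        prev = value_fn(frozenset(prefix))
        for v in perm:
            prefix.append(v)
            cur = value_fn(frozenset(prefix))
            phi[v] += cur - prev  # marginal contribution of v in this order
            prev = cur
    return {v: total / num_perms for v, total in phi.items()}
```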
Open Source Code: No
LLM Response: The paper mentions: "We also recommend checking the code data_generator_1.py, data_generator_2.py for the detailed configurations of the data generating processes." However, this refers to code for data generation, not open-source code for the do-Shapley methodology or its estimators described in the paper.
Open Datasets: Yes
LLM Response: "Experimental Setup. We use synthetic datasets based on Figs. (2a, 2b, 2c), where each figure matches the Thm. 2, Markovian, and Direct-cause cases, respectively. ... Details of the data generating process are provided in Appendix E. ... We also recommend checking the code data_generator_1.py, data_generator_2.py for the detailed configurations of the data generating processes."
Dataset Splits: Yes
LLM Response: "1. Split D randomly into two halves: D0 and D1; ... The data-splitting (also known as sample-splitting) technique (Klaassen, 1987; Robins & Ritov, 1997; Robins et al., 2008; Zheng & van der Laan, 2011; Chernozhukov et al., 2018) will be employed in constructing all do-Shapley estimators discussed in this section."
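A minimal sketch of the sample-splitting step quoted above, assuming the dataset is a NumPy array of rows; split_halves is an illustrative name, and the paper's estimators then fit nuisances on one half and evaluate on the other (roles can be swapped for cross-fitting):

```python
import numpy as np

def split_halves(D, seed=0):
    """Randomly split dataset D (array of rows) into two halves D0, D1."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(D))
    half = len(D) // 2
    return D[idx[:half]], D[idx[half:]]
```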
Hardware Specification: No
LLM Response: No specific hardware details (e.g., GPU/CPU models, memory, cloud instances) used for the experiments are mentioned in the paper.
Software Dependencies: No
LLM Response: "Nuisances are estimated using a gradient boosting model (Friedman, 2001). ... For all estimators, nuisances are estimated using the gradient boosting model XGBoost (Chen & Guestrin, 2016)." No specific version numbers for these or other software dependencies are provided.
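As a hedged illustration of the kind of nuisance fit described, one might fit XGBoost for an outcome-regression nuisance as below; the hyperparameters shown are generic defaults, not values reported in the paper, and fit_nuisance is an illustrative name:

```python
from xgboost import XGBRegressor

def fit_nuisance(X_train, y_train):
    """Fit a gradient-boosting regressor as a nuisance model,
    e.g., an outcome regression E[Y | X]. Hyperparameters are
    illustrative, not the paper's settings."""
    model = XGBRegressor(n_estimators=100, max_depth=3, learning_rate=0.1)
    model.fit(X_train, y_train)
    return model
```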
Experiment Setup: Yes
LLM Response: "We fix M = 20. ... We ran the simulation for 50 randomly generated sets of samples, i.e., $k \in \{1, 2, \dots, 50\}$, and with sample size $N := |D| \in \{100, 250, 500, 750, 1000\}$ to observe the convergence behavior of the estimators. ... (2) Noisy, where a converging noise $\epsilon \sim \mathrm{Normal}(N^{-\alpha}, N^{-2\alpha})$, decaying at an $N^{-\alpha}$ rate, for $\alpha = 1/4$, is added to the estimated nuisance to control the convergence rate, following the technique in (Kennedy, 2020)."
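A small sketch of the "Noisy" perturbation described above, assuming the estimated nuisance values are stored in an array; note that a variance of N^(-2α) corresponds to a standard deviation of N^(-α). The helper name add_converging_noise is illustrative:

```python
import numpy as np

def add_converging_noise(nuisance_values, N, alpha=0.25, seed=0):
    """Add eps ~ Normal(N^-alpha, N^-2alpha) to estimated nuisance
    values, so the perturbed nuisance converges at an N^-alpha rate
    (the construction the paper attributes to Kennedy, 2020)."""
    rng = np.random.default_rng(seed)
    eps = rng.normal(loc=N**-alpha,        # mean N^-alpha
                     scale=N**-alpha,      # std N^-alpha, i.e. variance N^-2alpha
                     size=np.shape(nuisance_values))
    return np.asarray(nuisance_values, dtype=float) + eps
```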