On Measuring Causal Contributions via do-interventions
Authors: Yonghan Jung, Shiva Kasiviswanathan, Jin Tian, Dominik Janzing, Patrick Bloebaum, Elias Bareinboim
ICML 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | 6. Experiments: In this section, we empirically compare the performance of the proposed do-Shapley estimators from the previous section. ... Experimental Setup. We use synthetic datasets based on Figs. (2a, 2b, 2c), where each figure matches the Thm. 2, Markovian, and Direct-cause cases. ... We assess the quality of the estimator by computing the L1 error as $L_1(\mathrm{est}, k) := \frac{1}{n} \sum_{i=1}^{n} \lvert \hat{\phi}^{\mathrm{est}}_{v_i,k} - \phi_{v_i,k} \rvert$. ... Experimental Results. The L1-error plots for all cases are presented in Fig. 3. |
| Researcher Affiliation | Collaboration | Yonghan Jung¹, Shiva Prasad Kasiviswanathan², Jin Tian³, Dominik Janzing², Patrick Blöbaum², Elias Bareinboim⁴. ¹Purdue University, ²Amazon, ³Iowa State University, ⁴Columbia University. |
| Pseudocode | Yes | Algorithm 1: do-Shapley($M$, $T^{\mathrm{est}}(\cdot)$). (A generic Monte Carlo Shapley sketch appears after the table.) |
| Open Source Code | No | The paper mentions: "We also recommend checking the code data_generator_1.py, data_generator_2.py for the detailed configurations of the data generating processes." However, this refers only to data-generation scripts, not to an open-source release of the do-Shapley methodology or the estimators described in the paper. |
| Open Datasets | Yes | Experimental Setup. We use synthetic datasets based on Figs. (2a, 2b, 2c), where each figure matches the Thm. 2, Markovian, and Direct-cause cases. ... Details of the data generating process are provided in Appendix E. ... We also recommend checking the code data_generator_1.py, data_generator_2.py for the detailed configurations of the data generating processes. |
| Dataset Splits | Yes | 1. Split $D$ randomly into two halves, $D_0$ and $D_1$; ... The data-splitting (also known as sample-splitting) technique (Klaassen, 1987; Robins & Ritov, 1997; Robins et al., 2008; Zheng & van der Laan, 2011; Chernozhukov et al., 2018) will be employed in constructing all do-Shapley estimators discussed in this section. (A minimal sample-splitting sketch appears after the table.) |
| Hardware Specification | No | No specific hardware details (e.g., GPU/CPU models, memory, cloud instances) used for the experiments are mentioned in the paper. |
| Software Dependencies | No | Nuisances are estimated using a gradient boosting model (Friedman, 2001). ... For all estimators, nuisances are estimated using the gradient boosting model XGBoost (Chen & Guestrin, 2016). No specific version numbers for these or other software dependencies are provided. |
| Experiment Setup | Yes | We fix $M = 20$. ... We ran the simulation for 50 randomly generated sets of samples, i.e., $k \in \{1, 2, \ldots, 50\}$, and with sample sizes $N := \lvert D \rvert \in \{100, 250, 500, 750, 1000\}$ to observe the convergence behavior of the estimators. ... (2) Noisy, where a converging noise $\epsilon$, decaying at an $N^{-\alpha}$ rate (i.e., $\epsilon \sim \mathrm{Normal}(N^{-\alpha}, N^{-2\alpha})$) for $\alpha = 1/4$, is added to the estimated nuisance to control the convergence rate, following the technique in (Kennedy, 2020). (An evaluation-loop sketch appears after the table.) |
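
The Dataset Splits and Software Dependencies rows together describe a sample-splitting scheme with XGBoost nuisance models. Below is a minimal sketch of that pattern, assuming a generic regression nuisance; `split_and_fit_nuisance` and its hyperparameters are hypothetical illustrations, not taken from the paper's (unreleased) estimator code.

```python
import numpy as np
from xgboost import XGBRegressor  # nuisance learner named in the paper (Chen & Guestrin, 2016)

def split_and_fit_nuisance(X, y, seed=0):
    """Split D randomly into halves D0/D1, fit the nuisance on one half and
    predict on the other, then swap roles so every sample gets an
    out-of-fold prediction (the cross-fitting form of sample splitting)."""
    rng = np.random.default_rng(seed)
    perm = rng.permutation(len(y))
    d0, d1 = perm[: len(y) // 2], perm[len(y) // 2 :]
    preds = np.empty(len(y))
    for fit_idx, eval_idx in [(d0, d1), (d1, d0)]:
        model = XGBRegressor(n_estimators=100, max_depth=3)  # hypothetical settings
        model.fit(X[fit_idx], y[fit_idx])
        preds[eval_idx] = model.predict(X[eval_idx])
    return preds

# Toy usage on synthetic regression data
rng = np.random.default_rng(1)
X = rng.normal(size=(500, 4))
y = X @ np.array([1.0, -0.5, 0.2, 0.0]) + rng.normal(size=500)
mu_hat = split_and_fit_nuisance(X, y)
```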
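
The Experiment Setup row fixes $M = 20$, which plausibly indexes Monte Carlo samples in the do-Shapley approximation of Algorithm 1. The sketch below shows the generic permutation-sampling Shapley estimator that such an $M$ would control; `value_fn` is a stand-in for the paper's do-intervention coalition value (e.g., an $E[Y \mid do(x_S)]$-style quantity), which this sketch does not implement.

```python
import numpy as np

def mc_shapley(value_fn, n_players, M=20, seed=0):
    """Monte Carlo Shapley values: average each player's marginal
    contribution over M randomly sampled permutations of the players."""
    rng = np.random.default_rng(seed)
    phi = np.zeros(n_players)
    for _ in range(M):
        perm = rng.permutation(n_players)
        coalition = set()
        prev = value_fn(coalition)
        for player in perm:
            coalition.add(player)
            curr = value_fn(coalition)
            phi[player] += curr - prev  # marginal contribution of `player`
            prev = curr
    return phi / M

# Toy additive game: the Shapley values should recover the weights exactly
weights = np.array([0.5, 0.3, 0.2])
phi = mc_shapley(lambda S: sum(weights[i] for i in S), n_players=3)
```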
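
Finally, the evaluation protocol from the Research Type and Experiment Setup rows (50 replications, growing sample sizes, $L_1$ error against ground truth, and the Noisy regime's injected $\epsilon \sim \mathrm{Normal}(N^{-\alpha}, N^{-2\alpha})$) can be mocked up as follows. The estimator here is a stand-in, not one of the paper's do-Shapley estimators; note that NumPy's `scale` argument is a standard deviation, so a variance of $N^{-2\alpha}$ corresponds to `scale = N ** -alpha`.

```python
import numpy as np

rng = np.random.default_rng(0)

def l1_error(phi_hat, phi_true):
    # L1(est, k) := (1/n) * sum_i |phi_hat_{v_i,k} - phi_{v_i,k}| for replication k
    return np.mean(np.abs(phi_hat - phi_true))

def nuisance_noise(N, alpha=0.25):
    # eps ~ Normal(N^-alpha, N^-2alpha); numpy's scale is a std, hence N ** -alpha
    return rng.normal(loc=N ** -alpha, scale=N ** -alpha)

phi_true = np.array([0.4, 0.3, 0.3])   # stand-in ground-truth do-Shapley values
for N in [100, 250, 500, 750, 1000]:   # sample sizes from the Experiment Setup row
    runs = []
    for k in range(50):                # 50 randomly generated sample sets
        # stand-in estimator: truth + root-N sampling error + injected decaying noise
        phi_hat = phi_true + rng.normal(0.0, N ** -0.5, size=3) + nuisance_noise(N)
        runs.append(l1_error(phi_hat, phi_true))
    print(f"N={N}: mean L1 error = {np.mean(runs):.4f}")
```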