Recursively Learning Causal Structures Using Regression-Based Conditional Independence Test
Authors: Hao Zhang, Shuigeng Zhou, Chuanxu Yan, Jihong Guan, Xin Wang (pp. 3108-3115)
AAAI 2019 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments show that the proposed method can not only substantially reduce redundant CI tests but also effectively distinguish the equivalence classes, thus is superior to the state-of-the-art constraint-based methods in causality inference. Performance Evaluation: We first compare CAPA with one of the latest recursive learning methods, SADA (Cai, Zhang, and Hao 2017), by extensive simulated experiments for evaluating their abilities of finding causal partitioning and learning causal directions. To further illustrate the advantage of CAPA in causal structure learning, we also compare CAPA with four other existing causal learning methods, including LiNGAM (Shimizu et al. 2006), DLiNGAM (Shimizu et al. 2011), Sparse-ICA LiNGAM (Zhang et al. 2009) and SADA-LiNGAM (Cai, Zhang, and Hao 2017), over various real-world causal structures. Note that all these four methods can distinguish Markov equivalence classes, in which SADA-LiNGAM stands for the state of the art in high-dimensional cases. Performance on simulated structures: In this group of experiments, we evaluate our method on datasets generated by simulated causal network structures, under the linear non-Gaussian model. From Fig. 3(a), we can see that CAPA performs better than SADA in terms of precision and F1 on causal skeleton learning for different sample sizes. |
| Researcher Affiliation | Academia | 1) Shanghai Key Lab of Intelligent Information Processing, and School of Computer Science, Fudan University, China; 2) Department of Computer Science & Technology, Tongji University, China; 3) University of Calgary, Canada, and Northwest University, China. Emails: 1) {haoz15, sgzhou, 17110240047}@fudan.edu.cn; 2) jhguan@tongji.edu.cn; 3) xcwang@ucalgary.ca |
| Pseudocode | Yes | Algorithm 1 (CAPA). 1: Input: the original variable set V, algorithm Ag. 2: Output: the causal graph G. 3: Find a causal partitioning {V1, V2, V3} on V. 4: if max{|V1|, |V2|, |V3|} = |V| then 5: return G by running algorithm Ag on V. 6: else 7: G1 = CAPA(V1, Ag, δ), 8: G2 = CAPA(V2, Ag, δ), 9: G3 = CAPA(V3, Ag, δ). 10: return G by merging G1, G2 and G3. 11: end if. Algorithm 2 (Finding Causal Partitioning). 1: Input: V: the input variable set; σ: the threshold limiting the maximum order of the CI table; M: the CI table w.r.t. V, initialized as M = zeros(|V|, |V|). 2: Output: the causal partitioning V = (V1, V2, V3). 3: ∀vi, vj ∈ V, set Mij = 1 in case vi ⊥ vj. 4: Divide V into three non-overlapping parts V = {A, B, C = V\(A ∪ B)} by solving the optimization problem: min |C| s.t. ∀vi ∈ A, ∀vj ∈ B, Mij = 1; |A| > 0, |B| > 0. 5: for vi ∈ V1 ∪ V2 do 6: remove vi from V if vi and ∀vj ∈ C satisfy Mij = 1; 7: end for. 8: Let V1 = A ∪ C, V2 = B ∪ C and V3 = V. 9: if max{|V1|, |V2|, |V3|} = |V| and M's order k ≤ σ then 10: for vi, vj ∈ V (Mij = 0) do 11: set Mij = 1 in case ∃Z ⊆ V\{vi, vj} (|Z| = k + 1) such that vi ⊥ vj | Z. 12: end for. 13: go to line 4. 14: end if. 15: Return V = {V1, V2, V3}. (A hedged Python sketch of the Algorithm 1 recursion appears below the table.) |
| Open Source Code | No | No explicit statement or link to open-source code for the described methodology. |
| Open Datasets | No | The paper mentions datasets such as 'simulated causal network structures' and 'eight real-world causal network structures', and refers to the URL bnlearn.com/bnrepository/, but it does not provide concrete access details (a citation with authors and year, a direct download link, or a clear statement of public availability) for all datasets used. |
| Dataset Splits | No | The paper does not explicitly provide training/validation/test dataset splits with specific percentages or counts. It mentions 'different sample sizes {25, 50, 100, 200, 400}' and '400 samples' but not how these were split for train/validation/test. |
| Hardware Specification | No | No specific hardware details (GPU/CPU models, memory, etc.) are mentioned for running the experiments. |
| Software Dependencies | No | The paper mentions using the 'PC algorithm (Spirtes, Glymour, and Scheines 2000)' as a basic algorithm and refers to implementations of 'LiNGAM, DLiNGAM and SADA-LiNGAM' but does not specify any software names with version numbers. |
| Experiment Setup | Yes | To save time, we terminate the causal partitioning process when the regarding subset is smaller than |V|/10 and limit the maximum size of controlling set at 3. |
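
The recursion of Algorithm 1, together with the quoted experiment setup (stop partitioning when the current subset is smaller than |V|/10 and limit the maximum controlling-set size to 3), suggests a driver of the following shape. This is a minimal sketch, not the authors' implementation: the helpers `base_algorithm` and `find_causal_partitioning` are assumed placeholders (standing in for the chosen basic algorithm, e.g. the PC algorithm, and for Algorithm 2, respectively), graphs are represented as plain edge sets, and the merge step is a simple union, whereas the paper's merge may resolve conflicts between subgraphs differently.

```python
def capa(V, data, base_algorithm, find_causal_partitioning,
         min_subset_ratio=0.1, max_order=3, n_total=None):
    """Recursive driver in the spirit of Algorithm 1 (CAPA).

    V: list of variable indices; data: (n_samples, n_vars) array.
    base_algorithm(V, data) -> set of directed edges (i, j) over V.
    find_causal_partitioning(V, data, max_order) -> (V1, V2, V3),
        a causal partitioning as in Algorithm 2.
    Graphs are plain sets of directed edges; the merge is a union.
    """
    if n_total is None:
        n_total = len(V)

    # Heuristic from the quoted experiment setup: stop partitioning once the
    # current subset is smaller than |V|/10 and run the base algorithm directly.
    if len(V) < min_subset_ratio * n_total:
        return base_algorithm(V, data)

    V1, V2, V3 = find_causal_partitioning(V, data, max_order)

    # Algorithm 1, line 4: if no proper partition was found, fall back to Ag.
    if max(len(V1), len(V2), len(V3)) == len(V):
        return base_algorithm(V, data)

    # Recurse on the three parts and merge the resulting subgraphs (union here).
    edges = set()
    for part in (V1, V2, V3):
        edges |= capa(part, data, base_algorithm, find_causal_partitioning,
                      min_subset_ratio, max_order, n_total)
    return edges
```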
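The CI tests invoked in Algorithm 2 (checks of the form vi ⊥ vj | Z) rely on the paper's regression-based conditional independence test, whose details are not quoted in this report. Below is a minimal sketch of a common residual-based variant, assuming ordinary least-squares regression and a Pearson correlation test on the residuals; the function name, the significance level `alpha`, and the choice of residual-independence test are illustrative assumptions, and the authors' test may apply a different (e.g. non-linear) independence measure to the residuals.

```python
import numpy as np
from scipy import stats

def regression_ci_test(x, y, Z=None, alpha=0.01):
    """Residual-based CI test sketch: regress x and y on Z via OLS,
    then test whether the residuals are uncorrelated.

    x, y: 1-D sample arrays; Z: (n_samples, |Z|) array or None.
    Returns True if x and y are judged independent given Z.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    if Z is None or np.size(Z) == 0:
        rx, ry = x, y  # unconditional case: test x and y directly
    else:
        Zc = np.column_stack([np.ones(len(x)), Z])  # add intercept column
        # OLS residuals of x and y after projecting out the conditioning set Z
        rx = x - Zc @ np.linalg.lstsq(Zc, x, rcond=None)[0]
        ry = y - Zc @ np.linalg.lstsq(Zc, y, rcond=None)[0]
    _, p_value = stats.pearsonr(rx, ry)
    return p_value > alpha  # fail to reject independence
```

In Algorithm 2, such a test would be called with conditioning sets of increasing size |Z| = k + 1 up to the order threshold σ (limited to 3 in the quoted experiment setup).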