Residual Similarity Based Conditional Independence Test and Its Application in Causal Discovery

Authors: Hao Zhang, Shuigeng Zhou, Kun Zhang, Jihong Guan (pp. 5942-5949)

AAAI 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "When applied to causal discovery, the proposed method outperforms the counterparts in terms of both speed and Type II error rate, especially in the case of small sample size, which is validated by our extensive experiments on various datasets."
Researcher Affiliation | Academia | "(1) Shanghai Key Lab of Intelligent Information Processing, and School of Computer Science, Fudan University, China; (2) School of Computer, Guangdong University of Petrochemical Technology, China; (3) Department of Philosophy, Carnegie Mellon University, USA; (4) Machine Learning Department, Mohamed bin Zayed University of Artificial Intelligence, UAE; (5) Department of Computer Science & Technology, Tongji University, China"
Pseudocode | Yes | "Algorithm 1: Similarity based conditional independence test (SCIT)"
Open Source Code | Yes | "The source code of SCIT package is available at https://github.com/Causality-Inference/SCIT."
Open Datasets | Yes | "To obtain the precise ground truth in every cases, the corresponding data-generating process follows the previous works (Cai, Zhang, and Hao 2013, 2017)."
Dataset Splits | No | The paper does not explicitly provide training/validation/test splits; it only mentions sample sizes and testing.
Hardware Specification | Yes | "The experimental platform adopts Matlab R2021b, Intel i7-11700K (3.60 GHz) CPU, Windows 10, and 32G memory."
Software Dependencies | Yes | "The experimental platform adopts Matlab R2021b, Intel i7-11700K (3.60 GHz) CPU, Windows 10, and 32G memory."
Experiment Setup | Yes | "The significance levels are fixed at α = 0.05. Note that for a good testing method, the probability of Type I error should be as close to the significance level as possible, and the probability of Type II error should be as small as possible. We check how the errors change when increasing the dimensionality of Z and the sample size n. For each parameter setting, we randomly repeat the testing 100 times and average their results."
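The quoted protocol (fixed α = 0.05, 100 random repetitions, Type I error close to α, Type II error as small as possible) can be sketched with a generic residual-based conditional independence test. The Fisher-z partial-correlation test below is a classical stand-in, not the paper's SCIT; the function names, the linear-Gaussian data-generating process, and the sample size are illustrative assumptions.

```python
import math
import numpy as np

def fisher_z_test(x, y, z, alpha=0.05):
    """Generic residual-based CI test (Fisher-z partial correlation).

    Regress x and y on the conditioning set z, then test whether the
    residuals are correlated. This is a classical stand-in for SCIT,
    which also operates on regression residuals.
    """
    n = len(x)
    Z = np.column_stack([np.ones(n), z])          # design matrix with intercept
    rx = x - Z @ np.linalg.lstsq(Z, x, rcond=None)[0]
    ry = y - Z @ np.linalg.lstsq(Z, y, rcond=None)[0]
    r = np.corrcoef(rx, ry)[0, 1]
    # Fisher's z-transform; degrees of freedom account for |Z|
    zstat = 0.5 * math.log((1 + r) / (1 - r)) * math.sqrt(n - z.shape[1] - 3)
    p = math.erfc(abs(zstat) / math.sqrt(2.0))    # two-sided normal p-value
    return p < alpha                               # True = reject independence

def error_rates(n=200, reps=100, alpha=0.05, seed=0):
    """Estimate Type I / Type II error rates over `reps` random repetitions,
    mirroring the evaluation protocol described in the setup."""
    rng = np.random.default_rng(seed)
    type1 = type2 = 0
    for _ in range(reps):
        z = rng.normal(size=(n, 1))
        # H0 true: X and Y depend only on Z -> a rejection is a Type I error
        x0 = z[:, 0] + rng.normal(size=n)
        y0 = z[:, 0] + rng.normal(size=n)
        type1 += int(fisher_z_test(x0, y0, z, alpha))
        # H0 false: Y depends on X directly -> a non-rejection is a Type II error
        x1 = z[:, 0] + rng.normal(size=n)
        y1 = z[:, 0] + x1 + rng.normal(size=n)
        type2 += int(not fisher_z_test(x1, y1, z, alpha))
    return type1 / reps, type2 / reps
```

Under this protocol a well-calibrated test should give a Type I rate near α = 0.05 and, with a strong direct effect and n = 200, a Type II rate near zero.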