Causal Structure Recovery with Latent Variables under Milder Distributional and Graphical Assumptions

Authors: Xiu-Chuan Li, Kun Zhang, Tongliang Liu

ICLR 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We apply our proposed algorithms to both synthetic and real-world data to demonstrate their effectiveness. Due to the space limit, we only present experimental results on synthetic data derived from causal models with structures G1 and G2 as shown in Figure 1 in the main text, and provide more details in Appendix A. For each graph, we draw 10 sample sets of size N=500, 1000, 2000 respectively. Each causal strength is sampled from a uniform distribution over [−2.0, −0.5] ∪ [0.5, 2.0]. Noises of causal models with structure G1 are Gaussian variables with mean 0 and standard error drawn from uniform(0.5, 1); those of causal models with structure G2 are the seventh power of uniform(−1, 1) variables, which are then normalized to have a standard error also drawn from uniform(0.5, 1). We compare our proposed methods with BPC (Silva et al., 2006), FOFC (Kummerfeld & Ramsey, 2016), and GIN (Xie et al., 2020). We use Latent Omission (LO), Latent Commission (LC), and Wrong Indicator (WI) as the evaluation metrics.
Researcher Affiliation | Academia | Xiu-Chuan Li (1), Kun Zhang (2,3), Tongliang Liu (1). (1) Sydney AI Centre, University of Sydney; (2) Carnegie Mellon University; (3) Mohamed bin Zayed University of Artificial Intelligence
Pseudocode | Yes | Algorithm 1: Partially identifying latent variables; Algorithm 2: Fully identifying latent variables in Case I; Algorithm 3: Fully identifying latent variables in Case II; Algorithm 4: PC-MIMBuild
Open Source Code | No | The paper does not provide a direct link or explicit statement about the availability of its own source code. It mentions 'its implementation in TETRAD (Ramsey et al., 2018)', but this refers to third-party software, not the authors' own code.
Open Datasets | Yes | We apply our proposed algorithms to both synthetic and real-world data to demonstrate their effectiveness. Due to the space limit, we only present experimental results on synthetic data derived from causal models with structures G1 and G2 as shown in Figure 1 in the main text, and provide more details in Appendix A. For each graph, we draw 10 sample sets of size N=500, 1000, 2000 respectively. Real-world data: HolzingerSwineford1939 (HS1939) (Rosseel, 2012) and Teacher Burnout (TB) (Byrne, 2013).
Dataset Splits | No | The paper mentions drawing '10 sample sets of size N=500, 1000, 2000 respectively' for synthetic data but does not specify how these samples were split into training, validation, and test sets, or whether any cross-validation was used.
Hardware Specification | No | The paper does not specify any hardware details (e.g., GPU/CPU models, memory) used for running the experiments.
Software Dependencies | No | The paper does not provide specific software dependencies with version numbers for reproducing the experiments.
Experiment Setup | No | The paper describes how synthetic data was generated ('Each causal strength is sampled from a uniform distribution over [−2.0, −0.5] ∪ [0.5, 2.0]. Noises of causal models with structure G1 are Gaussian variables with mean 0 and standard error drawn from uniform(0.5, 1); those of causal models with structure G2 are the seventh power of uniform(−1, 1) variables, which are then normalized to have a standard error also drawn from uniform(0.5, 1).') but does not provide specific hyperparameter values or training configurations for their algorithms.
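The sampling scheme quoted above is concrete enough to sketch in code. The following is a minimal NumPy sketch, not the authors' implementation: the function names are illustrative, and the two-variable example at the end stands in for the full graphs G1 and G2, whose structure is given in Figure 1 of the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_strength():
    # Causal strength drawn uniformly from [-2.0, -0.5] U [0.5, 2.0]:
    # sample the magnitude, then flip the sign with probability 1/2.
    s = rng.uniform(0.5, 2.0)
    return s if rng.random() < 0.5 else -s

def g1_noise(n):
    # G1 noise: Gaussian, mean 0, standard deviation drawn from uniform(0.5, 1).
    return rng.normal(0.0, rng.uniform(0.5, 1.0), size=n)

def g2_noise(n):
    # G2 noise: seventh power of uniform(-1, 1) variables (heavy-tailed,
    # non-Gaussian), rescaled so the standard deviation is drawn
    # from uniform(0.5, 1).
    e = rng.uniform(-1.0, 1.0, size=n) ** 7
    e = e / e.std()
    return e * rng.uniform(0.5, 1.0)

# Illustrative two-variable chain L -> X; the paper's experiments use
# the larger latent structures G1/G2 with N = 500, 1000, 2000.
N = 1000
L = g1_noise(N)
X = sample_strength() * L + g1_noise(N)
```

Repeating this generation 10 times per graph and sample size would mirror the '10 sample sets' mentioned in the paper.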