Causal Structure Recovery with Latent Variables under Milder Distributional and Graphical Assumptions
Authors: Xiu-Chuan Li, Kun Zhang, Tongliang Liu
ICLR 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We apply our proposed algorithms to both synthetic and real-world data to demonstrate their effectiveness. Due to the space limit, we only present experimental results on synthetic data derived by causal models with structure G1 and G2 as shown in Figure 1 in the main text and provide more details in Appendix A. For each graph, we draw 10 sample sets of size 𝑁=500, 1000, 2000 respectively. Each causal strength is sampled from a uniform distribution over [−2.0, −0.5] ∪ [0.5, 2.0]. Noises of causal models with structure G1 are Gaussian variables with mean 0 and standard error drawn from uniform(0.5,1); those of causal models with structure G2 are the seventh power of uniform(-1,1) variables, which are then normalized to have a standard error also drawn from uniform(0.5,1). We compare our proposed methods with BPC (Silva et al., 2006), FOFC (Kummerfeld & Ramsey, 2016), and GIN (Xie et al., 2020). We use Latent Omission (LO), Latent Commission (LC), and Wrong Indicator (WI) as the evaluation metrics. |
| Researcher Affiliation | Academia | Xiu-Chuan Li1 Kun Zhang2,3 Tongliang Liu1 1Sydney AI Centre, University of Sydney 2Carnegie Mellon University 3Mohamed bin Zayed University of Artificial Intelligence |
| Pseudocode | Yes | Algorithm 1: Partially identifying latent variables. Algorithm 2: Fully identifying latent variables in Case I. Algorithm 3: Fully identifying latent variables in Case II. Algorithm 4: PC-MIMBuild |
| Open Source Code | No | The paper does not provide a direct link or explicit statement about the availability of its own source code. It mentions 'its implementation in TETRAD (Ramsey et al., 2018)' but this refers to a third-party software, not the authors' own code. |
| Open Datasets | Yes | We apply our proposed algorithms to both synthetic and real-world data to demonstrate their effectiveness. Due to the space limit, we only present experimental results on synthetic data derived by causal models with structure G1 and G2 as shown in Figure 1 in the main text and provide more details in Appendix A. For each graph, we draw 10 sample sets of size 𝑁=500, 1000, 2000 respectively. HolzingerSwineford1939 (HS1939) (Rosseel, 2012) and Teacher Burnout (TB) (Byrne, 2013). |
| Dataset Splits | No | The paper mentions drawing '10 sample sets of size 𝑁=500, 1000, 2000 respectively' for synthetic data but does not specify how these samples were split into training, validation, and test sets, or if any cross-validation was used. |
| Hardware Specification | No | The paper does not specify any hardware details (e.g., GPU/CPU models, memory) used for running the experiments. |
| Software Dependencies | No | The paper does not provide specific software dependencies with version numbers for reproducing the experiments. |
| Experiment Setup | No | The paper describes how synthetic data was generated ('Each causal strength is sampled from a uniform distribution over [−2.0, −0.5] ∪ [0.5, 2.0]. Noises of causal models with structure G1 are Gaussian variables with mean 0 and standard error drawn from uniform(0.5,1); those of causal models with structure G2 are the seventh power of uniform(-1,1) variables, which are then normalized to have a standard error also drawn from uniform(0.5,1).') but does not provide specific hyperparameter values or training configurations for their algorithms. |
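The synthetic-data recipe quoted above (causal strengths from a uniform distribution over [−2.0, −0.5] ∪ [0.5, 2.0]; Gaussian noise for G1 and seventh-power-of-uniform noise for G2, each scaled to a standard deviation drawn from uniform(0.5, 1)) can be sketched as follows. This is a minimal reconstruction from the paper's description, not the authors' released code; the function names and the NumPy RNG plumbing are our own assumptions.

```python
import numpy as np

def sample_causal_strength(rng):
    # Hypothetical helper: uniform over the union [-2.0, -0.5] U [0.5, 2.0],
    # implemented as a uniform magnitude in [0.5, 2.0] with a random sign.
    magnitude = rng.uniform(0.5, 2.0)
    sign = rng.choice([-1.0, 1.0])
    return sign * magnitude

def gaussian_noise(n, rng):
    # G1 noise: Gaussian with mean 0 and a standard deviation
    # drawn from uniform(0.5, 1), as described in the paper.
    std = rng.uniform(0.5, 1.0)
    return rng.normal(0.0, std, size=n)

def seventh_power_noise(n, rng):
    # G2 noise: seventh power of uniform(-1, 1) variables,
    # rescaled so the sample standard deviation matches a
    # target drawn from uniform(0.5, 1).
    raw = rng.uniform(-1.0, 1.0, size=n) ** 7
    target_std = rng.uniform(0.5, 1.0)
    return raw / raw.std() * target_std

rng = np.random.default_rng(0)
for n in (500, 1000, 2000):  # the three sample sizes used in the paper
    e_g1 = gaussian_noise(n, rng)
    e_g2 = seventh_power_noise(n, rng)
```

Drawing 10 independent sample sets per graph and size, as the paper does, would simply repeat the loop above with fresh noise draws and causal strengths for each replication.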