Causal Structure Recovery with Latent Variables under Milder Distributional and Graphical Assumptions

Authors: Xiu-Chuan Li, Kun Zhang, Tongliang Liu

ICLR 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We apply our proposed algorithms to both synthetic and real-world data to demonstrate their effectiveness. Due to the space limit, we only present experimental results on synthetic data derived from causal models with structures G1 and G2 as shown in Figure 1 in the main text, and provide more details in Appendix A. For each graph, we draw 10 sample sets of size N=500, 1000, 2000 respectively. Each causal strength is sampled from a uniform distribution over [−2.0, −0.5] ∪ [0.5, 2.0]. Noises of causal models with structure G1 are Gaussian variables with mean 0 and standard error drawn from uniform(0.5, 1); those of causal models with structure G2 are the seventh power of uniform(−1, 1) variables, which are then normalized to have a standard error also drawn from uniform(0.5, 1). We compare our proposed methods with BPC (Silva et al., 2006), FOFC (Kummerfeld & Ramsey, 2016), and GIN (Xie et al., 2020). We use Latent Omission (LO), Latent Commission (LC), and Wrong Indicator (WI) as the evaluation metrics.
Researcher Affiliation | Academia | Xiu-Chuan Li (1), Kun Zhang (2,3), Tongliang Liu (1). (1) Sydney AI Centre, University of Sydney; (2) Carnegie Mellon University; (3) Mohamed bin Zayed University of Artificial Intelligence
Pseudocode | Yes | Algorithm 1: Partially identifying latent variables; Algorithm 2: Fully identifying latent variables in Case I; Algorithm 3: Fully identifying latent variables in Case II; Algorithm 4: PC-MIMBuild
Open Source Code | No | The paper does not provide a direct link or explicit statement about the availability of its own source code. It mentions 'its implementation in TETRAD (Ramsey et al., 2018)', but this refers to third-party software, not the authors' own code.
Open Datasets | Yes | We apply our proposed algorithms to both synthetic and real-world data to demonstrate their effectiveness. Due to the space limit, we only present experimental results on synthetic data derived from causal models with structures G1 and G2 as shown in Figure 1 in the main text, and provide more details in Appendix A. For each graph, we draw 10 sample sets of size N=500, 1000, 2000 respectively. Real-world data: HolzingerSwineford1939 (HS1939) (Rosseel, 2012) and Teacher Burnout (TB) (Byrne, 2013).
Dataset Splits | No | The paper mentions drawing '10 sample sets of size N=500, 1000, 2000 respectively' for synthetic data but does not specify how these samples were split into training, validation, and test sets, or whether any cross-validation was used.
Hardware Specification | No | The paper does not specify any hardware details (e.g., GPU/CPU models, memory) used for running the experiments.
Software Dependencies | No | The paper does not provide specific software dependencies with version numbers for reproducing the experiments.
Experiment Setup | No | The paper describes how synthetic data was generated ('Each causal strength is sampled from a uniform distribution over [−2.0, −0.5] ∪ [0.5, 2.0]. Noises of causal models with structure G1 are Gaussian variables with mean 0 and standard error drawn from uniform(0.5, 1); those of causal models with structure G2 are the seventh power of uniform(−1, 1) variables, which are then normalized to have a standard error also drawn from uniform(0.5, 1).') but does not provide specific hyperparameter values or training configurations for their algorithms.
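The sampling scheme quoted above is concrete enough to sketch in code. The following is a minimal NumPy sketch, not the authors' implementation: the function names are illustrative, and the two-variable example at the end stands in for the full graphs G1 and G2, whose structure is given in Figure 1 of the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_strength():
    # Causal strength drawn uniformly from [-2.0, -0.5] U [0.5, 2.0]:
    # sample the magnitude, then flip the sign with probability 1/2.
    s = rng.uniform(0.5, 2.0)
    return s if rng.random() < 0.5 else -s

def g1_noise(n):
    # G1 noise: Gaussian, mean 0, standard deviation drawn from uniform(0.5, 1).
    return rng.normal(0.0, rng.uniform(0.5, 1.0), size=n)

def g2_noise(n):
    # G2 noise: seventh power of uniform(-1, 1) variables (heavy-tailed,
    # non-Gaussian), rescaled so the standard deviation is drawn
    # from uniform(0.5, 1).
    e = rng.uniform(-1.0, 1.0, size=n) ** 7
    e = e / e.std()
    return e * rng.uniform(0.5, 1.0)

# Illustrative two-variable chain L -> X; the paper's experiments use
# the larger latent structures G1/G2 with N = 500, 1000, 2000.
N = 1000
L = g1_noise(N)
X = sample_strength() * L + g1_noise(N)
```

Repeating this generation 10 times per graph and sample size would mirror the '10 sample sets' mentioned in the paper.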