Latent Hierarchical Causal Structure Discovery with Rank Constraints

Authors: Biwei Huang, Charles Jia Han Low, Feng Xie, Clark Glymour, Kun Zhang

NeurIPS 2022

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | We propose an estimation procedure that can efficiently locate latent variables, determine their cardinalities, and identify the latent hierarchical structure, by leveraging rank deficiency constraints over the measured variables. We show that the proposed algorithm can find the correct Markov equivalence class of the whole graph asymptotically under proper restrictions on the graph structure. ... In Section 5, we empirically validate the proposed approach on synthetic data. |
| Researcher Affiliation | Academia | 1 Carnegie Mellon University; 2 Mohamed bin Zayed University of Artificial Intelligence; 3 Beijing Technology and Business University, China |
| Pseudocode | Yes | Algorithm 1: Latent Hierarchical Causal Structure Discovery; Algorithm 2: Phase I: find Causal Clusters; Algorithm 3: Phase II: refine Clusters; Algorithm 4: Phase III: refine Edges |
| Open Source Code | No | Code will be made public. |
| Open Datasets | No | The paper uses "synthetic data" generated by the authors, which is not a publicly available dataset with concrete access information provided. |
| Dataset Splits | No | The paper mentions "synthetic data" and "sample sizes" (2k, 5k, 10k) but does not specify explicit training, validation, or test dataset splits (e.g., percentages or exact counts for each split). |
| Hardware Specification | Yes | All experiments were performed on a server equipped with an Intel(R) Xeon(R) Gold 6130 CPU @ 2.10GHz, NVIDIA Quadro RTX 6000 GPU and 256GB RAM. |
| Software Dependencies | No | All our code is implemented in Python and we used standard libraries such as numpy, scipy, and scikit-learn for basic operations. |
| Experiment Setup | Yes | The causal strength was generated uniformly from [-5, -0.5] ∪ [0.5, 5], and the noise term was randomly chosen from a normal distribution with noise variance uniformly sampled from [1, 5]. We generated 100 random IL2H graphs for each setting... We simulated N = {2000, 5000, 10000} samples for each graph. We repeated each experiment 10 times and reported the mean and standard deviation. |
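
The sampling scheme quoted under Experiment Setup, together with the rank deficiency idea mentioned under Research Type, can be illustrated with a minimal sketch. This is not the authors' code, which was not released at the time of this summary. The toy graph (one latent variable with four measured children) and the function names `sample_strengths`, `simulate_one_latent`, and `cross_cov_singular_values` are assumptions made for illustration. Only the parameter ranges come from the quoted text, and treating the latent variable's variance as one more noise variance drawn from [1, 5] is also an assumption.

```python
import numpy as np


def sample_strengths(rng, size):
    """Draw coefficients uniformly from [-5, -0.5] U [0.5, 5].

    Both intervals have equal length, so a uniform magnitude in [0.5, 5]
    combined with a fair random sign is uniform over the union.
    """
    magnitude = rng.uniform(0.5, 5.0, size=size)
    sign = rng.choice([-1.0, 1.0], size=size)
    return sign * magnitude


def simulate_one_latent(n_samples, n_measured=4, seed=0):
    """Simulate a hypothetical toy graph: one latent L -> n_measured children.

    Assumption: the latent variance is drawn from [1, 5], like the noise variances.
    """
    rng = np.random.default_rng(seed)
    loadings = sample_strengths(rng, n_measured)
    latent_var = rng.uniform(1.0, 5.0)
    noise_vars = rng.uniform(1.0, 5.0, size=n_measured)
    latent = rng.normal(0.0, np.sqrt(latent_var), size=n_samples)
    noise = rng.normal(0.0, np.sqrt(noise_vars), size=(n_samples, n_measured))
    # Linear Gaussian model: each measured variable is loading * latent + noise.
    X = latent[:, None] * loadings[None, :] + noise
    return X, loadings


def cross_cov_singular_values(X, idx_a, idx_b):
    """Singular values of the sample cross-covariance block between two variable sets."""
    cov = np.cov(X, rowvar=False)
    block = cov[np.ix_(idx_a, idx_b)]
    return np.linalg.svd(block, compute_uv=False)


if __name__ == "__main__":
    # Sample sizes taken from the quoted experiment setup.
    for n in (2000, 5000, 10000):
        X, _ = simulate_one_latent(n, seed=0)
        sv = cross_cov_singular_values(X, [0, 1], [2, 3])
        print(n, sv)
```

In this toy model the population cross-covariance between {X1, X2} and {X3, X4} has rank 1, because a single latent variable separates the two sets. The second singular value of the sample block should therefore shrink toward zero as the sample size grows. A rank-based method would replace the informal inspection here with a statistical test of rank, which this sketch does not implement.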