Latent Hierarchical Causal Structure Discovery with Rank Constraints
Authors: Biwei Huang, Charles Jia Han Low, Feng Xie, Clark Glymour, Kun Zhang
NeurIPS 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We propose an estimation procedure that can efficiently locate latent variables, determine their cardinalities, and identify the latent hierarchical structure, by leveraging rank deficiency constraints over the measured variables. We show that the proposed algorithm can find the correct Markov equivalence class of the whole graph asymptotically under proper restrictions on the graph structure. ... In Section 5, we empirically validate the proposed approach on synthetic data. |
| Researcher Affiliation | Academia | 1 Carnegie Mellon University 2 Mohamed bin Zayed University of Artificial Intelligence 3 Beijing Technology and Business University, China |
| Pseudocode | Yes | Algorithm 1: Latent Hierarchical Causal Structure Discovery; Algorithm 2: Phase I: Find Causal Clusters; Algorithm 3: Phase II: Refine Clusters; Algorithm 4: Phase III: Refine Edges |
| Open Source Code | No | Code will be made public. |
| Open Datasets | No | The paper uses synthetic data generated by the authors; no publicly available dataset with concrete access information is provided. |
| Dataset Splits | No | The paper mentions "synthetic data" and "sample sizes" (2k, 5k, 10k) but does not specify explicit training, validation, or test dataset splits (e.g., percentages or exact counts for each split). |
| Hardware Specification | Yes | All experiments were performed on a server equipped with an Intel(R) Xeon(R) Gold 6130 CPU @ 2.10GHz, NVIDIA Quadro RTX 6000 GPU and 256GB RAM. |
| Software Dependencies | No | All our code is implemented in Python and we used standard libraries such as numpy, scipy, and scikit-learn for basic operations. |
| Experiment Setup | Yes | The causal strength was generated uniformly from [-5, -0.5] ∪ [0.5, 5], and the noise term was randomly chosen from a normal distribution with noise variance uniformly sampled from [1, 5]. We generated 100 random IL2H graphs for each setting... We simulated N = {2000, 5000, 10000} samples for each graph. We repeated each experiment 10 times and reported the mean and standard deviation. |
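The sampling scheme quoted in the Experiment Setup row can be sketched in a few lines of Python. This is a minimal illustration only, not the authors' released code: the function names are hypothetical, and the single-edge example graph is an assumption for demonstration (the paper generates full IL2H graphs).

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_causal_strength(size):
    # Uniform over [-5, -0.5] ∪ [0.5, 5]:
    # draw a magnitude in [0.5, 5], then flip a fair sign.
    magnitude = rng.uniform(0.5, 5.0, size)
    sign = rng.choice([-1.0, 1.0], size)
    return sign * magnitude

def sample_noise(n):
    # Gaussian noise whose variance is itself drawn uniformly from [1, 5].
    variance = rng.uniform(1.0, 5.0)
    return rng.normal(0.0, np.sqrt(variance), n)

# Hypothetical single linear edge X -> Y at one of the paper's sample sizes.
N = 2000
b = sample_causal_strength(1)[0]
X = sample_noise(N)
Y = b * X + sample_noise(N)
```

Under this scheme every edge weight is bounded away from zero (|b| >= 0.5), which keeps the rank-deficiency constraints the method relies on detectable at finite sample sizes.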