A Simultaneous Discover-Identify Approach to Causal Inference in Linear Models

Authors: Chi Zhang, Bryant Chen, Judea Pearl (pp. 10318-10325)

AAAI 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | The paper reports simulation experiments: "To quantify this improvement, we implemented LCDI and the version of FCI by Zhang (2008). We randomly generate DAGs with number of nodes (n) from 6 to 11 with various average node degrees (d), and an edge being directed and bidirected both have probability 0.5. We then compare the patterns that would be learned on the generated DAG by each method assuming faithfulness. More specifically, we compare the number of invariant arrowheads and extraneous edges learned. Each data entry in Tables 1 and 2 was averaged over 200 random DAGs."
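The generation procedure quoted above is not accompanied by code. Below is a minimal sketch of one possible reading of it, assuming Python; the function name random_mixed_graph, the edge-probability choice, and the "pick a type per sampled edge" interpretation are illustrative assumptions, not the authors' implementation.

import itertools
import random

def random_mixed_graph(n, avg_degree, seed=None):
    # Sketch only: sample a random acyclic mixed graph with directed and
    # bidirected edges. The edge probability is chosen so the expected node
    # degree is roughly avg_degree, and each sampled edge is made directed
    # or bidirected with probability 0.5 each (one reading of the setup).
    rng = random.Random(seed)
    p_edge = min(1.0, avg_degree / (n - 1))
    directed, bidirected = set(), set()
    for u, v in itertools.combinations(range(n), 2):  # u < v keeps directed edges acyclic
        if rng.random() < p_edge:
            if rng.random() < 0.5:
                directed.add((u, v))    # u -> v
            else:
                bidirected.add((u, v))  # u <-> v (latent confounder)
    return directed, bidirected

# Example: one random graph with 8 nodes and average degree about 2.
print(random_mixed_graph(8, 2.0, seed=0))

The reported comparison would then count, per generated graph, the invariant arrowheads and extraneous edges in the pattern each method returns, averaged over 200 such graphs.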
Researcher Affiliation | Collaboration | Chi Zhang (1), Bryant Chen (2), Judea Pearl (1); (1) Department of Computer Science, University of California, Los Angeles, California, USA; (2) Brex, San Francisco, California, USA
Pseudocode | Yes | The paper provides the following pseudocode:
Linear Causal Discovery and Identification (LCDI)
Input: covariance matrix σ_V on the set of observed variables V and a set of identified edges E_id (can be empty)
Output: a pattern P and the updated E_id
Step 0: Run the FCI algorithm (Zhang 2008) on σ_V with Rules R1-R4 only, replacing R4 with the modified R4 given below. The resulting pattern is P.
Step 1: Run the original FCI algorithm on σ_V with Rules R1-R4 and R8-R10 to obtain a PAG P', and merge the arrowheads of P' into P.
Step 2: Repeat the following substeps on P until neither P nor E_id changes:
  Substep 0: Perform causal identification on P without extraneous edges and update E_id.
  Substep 1: Generate AVs using E_id.
  Substep 2: Run Rules 0-3.
  Substep 3: Run FCI Rules R1 and R4+ (given below) repeatedly until P stops changing.
Step 3: Remove from P all extraneous edges marked by Rule 0 in Step 2, Substep 2.
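Read as code, the pseudocode alternates between constraint-based discovery and causal identification until a fixed point is reached. The Python skeleton below is only a sketch of that control flow, not the authors' implementation: every rule and subroutine is a placeholder stub, and none of the names come from the paper or any existing library.

# Placeholder stubs standing in for the FCI rules and the identification step.
def fci_partial(cov): return {"arrowheads": set(), "edges": set(), "extraneous": set()}
def fci_full(cov): return {"arrowheads": set(), "edges": set(), "extraneous": set()}
def identify_edges(pattern, e_id): return False          # Substep 0
def generate_auxiliary_variables(e_id): return []        # Substep 1
def apply_rules_0_to_3(pattern, avs): return False       # Substep 2
def apply_r1_and_r4plus(pattern): return False           # Substep 3

def lcdi(cov, identified_edges=None):
    # Skeleton of the LCDI control flow described in the pseudocode above.
    e_id = set(identified_edges or [])

    pattern = fci_partial(cov)                            # Step 0: FCI with R1-R4 (modified R4)
    pattern["arrowheads"] |= fci_full(cov)["arrowheads"]  # Step 1: merge arrowheads from full FCI

    changed = True
    while changed:                                        # Step 2: repeat until P and E_id stabilize
        changed = False
        changed |= identify_edges(pattern, e_id)          # Substep 0: causal identification
        avs = generate_auxiliary_variables(e_id)          # Substep 1: auxiliary variables (AVs)
        changed |= apply_rules_0_to_3(pattern, avs)       # Substep 2: Rules 0-3 (mark extraneous edges)
        changed |= apply_r1_and_r4plus(pattern)           # Substep 3: R1 and R4+ to a fixed point

    pattern["edges"] -= pattern["extraneous"]             # Step 3: drop edges marked extraneous by Rule 0
    return pattern, e_id

A real implementation would back these stubs with the FCI orientation rules of Zhang (2008) and the paper's Rules 0-3 and R4+.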
Open Source Code | No | The paper does not include any statement or link indicating that the source code for the described methodology is publicly available.
Open Datasets | No | The paper states: "We randomly generate DAGs with number of nodes (n) from 6 to 11 with various average node degrees (d), and an edge being directed and bidirected both have probability 0.5." This indicates that the data was simulated for the experiments rather than obtained from a publicly available dataset.
Dataset Splits | No | The paper describes simulation experiments on randomly generated DAGs to compare the learned patterns. It does not discuss training, validation, or test dataset splits in the typical machine learning sense for model training and evaluation.
Hardware Specification | No | The paper does not provide any specific details about the hardware (e.g., GPU models, CPU types, memory) used to run the experiments.
Software Dependencies | No | The paper mentions implementing "LCDI and the version of FCI by Zhang (2008)", but it does not specify any software names with version numbers (e.g., programming languages, libraries, frameworks, or specific tool versions).
Experiment Setup | No | The paper describes how the DAGs were randomly generated for the simulations ("number of nodes (n) from 6 to 11 with various average node degrees (d), and an edge being directed and bidirected both have probability 0.5"), but it does not provide specific experimental setup details such as hyperparameters, optimizer settings, or other system-level training configurations for a learning algorithm.