Sound and Complete Causal Identification with Latent Variables Given Local Background Knowledge
Authors: Tian-Zuo Wang, Tian Qin, Zhi-Hua Zhou
NeurIPS 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We compare the maximum entropy criterion with a baseline random criterion, where we randomly select one variable with circles to intervene on in each round. We show the results in Tab. 1; # int. denotes the number of interventions needed to achieve MAG identification. The effectiveness of the maximum entropy criterion is verified by noting that it requires fewer interventions than the random criterion. Further, we evaluate the three stages respectively. In Stage 1, we obtain a PAG by running the FCI algorithm with a significance level of 0.05. In Stage 2, we adopt the two criteria to select intervention variables. In Stage 3, we learn the marks with the corresponding interventional data and orientation rules. We evaluate the performance of Stage 1 by # correct PAG/# wrong PAG, the number of edges correctly/wrongly identified by FCI: an edge is correctly/wrongly identified by FCI if the edge learned by FCI is identical/not identical to the true PAG. The performance of Stage 2 is evaluated by # int., and that of Stage 3 by # correct int./# wrong int., the number of edges whose directions are correctly/wrongly identified by interventions: an edge is correctly/wrongly identified by interventions if its existence is correctly identified in P but its direction is uncertain, and after interventions we learn its direction correctly/wrongly. We evaluate the performance of the whole process by Norm. SHD and F1. Norm. SHD denotes the normalized structural Hamming distance (SHD), calculated by dividing SHD by d(d-1)/2. The F1 score is calculated from the confusion matrix to indicate whether the edge between any two vertices is correctly learned. According to the SHD and F1 score, the active framework can learn the MAG accurately when p is not large. |
| Researcher Affiliation | Academia | Tian-Zuo Wang, Tian Qin, Zhi-Hua Zhou National Key Laboratory for Novel Software Technology, Nanjing University, Nanjing, 210023, China. |
| Pseudocode | Yes | Algorithm 1: Update a PMG with local background knowledge; Algorithm 2: Intervention variable selection based on maximum entropy criterion with MH alg. |
| Open Source Code | No | Did you include the code, data, and instructions needed to reproduce the main experimental results (either in the supplemental material or as a URL)? [No] It will be publicly available later. |
| Open Datasets | No | We generate 100 Erdős–Rényi random DAGs for each setting, where the number of variables d = 10 and the probability of including each edge p ∈ {0.1, 0.15, 0.2, 0.25, 0.3}. The weight of each edge is drawn from U[1, 2]. We generate 10000 samples from the linear structural equations, and take three variables as latent variables and the others as observed ones. |
| Dataset Splits | No | The paper mentions generating 10000 samples but does not specify explicit training, validation, or test dataset splits, nor does it refer to standard, pre-defined splits. |
| Hardware Specification | No | The paper states: "Did you include the total amount of compute and the type of resources used (e.g., type of GPUs, internal cluster, or cloud provider)? [No]". No specific hardware details are provided in the main text. |
| Software Dependencies | No | The paper mentions algorithms like FCI and Metropolis-Hastings but does not list specific software libraries, frameworks, or solvers with their version numbers. |
| Experiment Setup | Yes | In the implementation of the MH algorithm in Alg. 2, we discard the first 500 sampled MAGs and collect the following 1000 MAGs. For each intervention variable X, we collect 10000 samples under do(X = 2), and learn the circles at X by a two-sample test with a significance level of 0.05. In Stage 1, we obtain a PAG by running the FCI algorithm with a significance level of 0.05. |
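The synthetic setup quoted in the Open Datasets row (Erdős–Rényi random DAGs, edge weights drawn from U[1, 2], 10000 samples from linear structural equations, three variables held out as latent) could be sketched as follows. This is a minimal illustration, not the authors' code: the noise distribution is not stated in the excerpt, so standard Gaussian noise and the function name `sample_linear_sem` are assumptions here.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_linear_sem(d=10, p=0.2, n=10000, n_latent=3):
    """Sketch of the described setup: an Erdos-Renyi DAG over d variables
    (each edge kept with probability p), edge weights from U[1, 2],
    linear structural equations with (assumed) Gaussian noise, and
    n_latent randomly chosen variables dropped as latent."""
    # A strictly upper-triangular weight matrix guarantees acyclicity;
    # variable order 0..d-1 is then a valid topological order.
    W = np.triu(rng.random((d, d)) < p, k=1) * rng.uniform(1.0, 2.0, (d, d))
    X = np.zeros((n, d))
    for j in range(d):
        # Each variable is a weighted sum of its parents plus noise.
        X[:, j] = X @ W[:, j] + rng.standard_normal(n)
    latent = rng.choice(d, size=n_latent, replace=False)
    observed = np.setdiff1d(np.arange(d), latent)
    return X[:, observed]  # only the observed columns are visible to FCI
```

With the default arguments this yields a 10000 × 7 observed-data matrix, matching the d = 10, three-latent-variable setting described above.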
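The Research Type row defines Norm. SHD as the structural Hamming distance divided by d(d-1)/2. A minimal sketch of that normalization is below; it compares plain adjacency matrices pairwise, whereas the paper scores MAGs with edge marks, so treat this only as an illustration of the d(d-1)/2 scaling, with `normalized_shd` being a hypothetical helper name.

```python
import numpy as np

def normalized_shd(true_adj, learned_adj):
    """SHD counted over unordered vertex pairs, divided by d(d-1)/2.

    A pair contributes 1 when its edge differs between the two graphs
    in presence or orientation (adj[i, j] = 1 encodes an edge i -> j).
    """
    d = true_adj.shape[0]
    diff = 0
    for i in range(d):
        for j in range(i + 1, d):
            if (true_adj[i, j], true_adj[j, i]) != (learned_adj[i, j], learned_adj[j, i]):
                diff += 1
    return diff / (d * (d - 1) / 2)
```

For example, with d = 3 a single missed edge gives Norm. SHD = 1/3, since there are three vertex pairs in total.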