Local Causal Discovery of Direct Causes and Effects

Authors: Tian Gao, Qiang Ji

NeurIPS 2015

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We show both theoretically and experimentally that the proposed local causal discovery algorithm can obtain the comparable identification accuracy as global methods but significantly improve their efficiency, often by more than one order of magnitude." "We use benchmark causal learning datasets to evaluate the accuracy and efficiency of CMB with four other causal discovery algorithms discussed: P-C, GS, MMHC, CS, and the local causal discovery algorithm LCD2 [7]."
Researcher Affiliation | Academia | Tian Gao, Qiang Ji, Department of ECSE, Rensselaer Polytechnic Institute, Troy, NY 12180, {gaot, jiq}@rpi.edu
Pseudocode | Yes | "Algorithm 1 Causal Markov Blanket Discovery Algorithm"; "Algorithm 2 Causal Search Subroutine"
Open Source Code | No | The paper mentions implementing the algorithms in MATLAB but makes no explicit statement about releasing source code and provides no link to a code repository for the described methodology.
Open Datasets | Yes | "We use benchmark causal learning datasets to evaluate the accuracy and efficiency of CMB with four other causal discovery algorithms discussed: P-C, GS, MMHC, CS, and the local causal discovery algorithm LCD2 [7]. Due to page limit, we show the results of the causal algorithms on four medium-to-large datasets: ALARM, ALARM3, CHILD3, and INSUR3."
Dataset Splits | No | The paper states "We use 1000 data samples for all datasets" but does not specify any training, validation, or test splits (e.g., percentages or absolute counts) required for reproducibility.
Hardware Specification | Yes | "We implement GS, CS, and the proposed CMB algorithms in MATLAB on a machine with 2.66GHz CPU and 24GB memory."
Software Dependencies | No | The paper mentions MATLAB and the HITON-MB discovery algorithm but provides no version numbers for any software component, which would be needed for a reproducible description of dependencies.
Experiment Setup | Yes | "We use 1000 data samples for all datasets. We also use mutual-information-based conditional independence tests with a standard significance level of 0.02 for all the datasets without worrying about parameter tuning."
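The paper only states that it uses mutual-information-based conditional independence tests at significance level 0.02; it does not give an implementation. A minimal sketch of such a test is the G-test, whose statistic is 2N times the empirical conditional mutual information and is asymptotically chi-square distributed. All names below (`g_statistic`, the toy data) are illustrative, not from the paper:

```python
import math
from collections import Counter

def g_statistic(xs, ys, zs):
    """G-test statistic for the hypothesis X independent of Y given Z.

    G = 2 * N * MI(X; Y | Z), asymptotically chi-square with
    (|X|-1) * (|Y|-1) * |Z| degrees of freedom.  In practice G is
    compared against the chi-square critical value at alpha = 0.02
    (the significance level the paper reports).
    """
    n = len(xs)
    nz = Counter(zs)                  # counts of z
    nxz = Counter(zip(xs, zs))        # joint counts of (x, z)
    nyz = Counter(zip(ys, zs))        # joint counts of (y, z)
    nxyz = Counter(zip(xs, ys, zs))   # joint counts of (x, y, z)
    mi = 0.0
    for (x, y, z), c in nxyz.items():
        # p(x,y,z) * log( p(x,y|z) / (p(x|z) * p(y|z)) ),
        # rewritten in raw counts: log( n(x,y,z) * n(z) / (n(x,z) * n(y,z)) )
        mi += (c / n) * math.log(c * nz[z] / (nxz[(x, z)] * nyz[(y, z)]))
    return 2 * n * mi

# Toy data in which X and Y are exactly independent given Z,
# so the statistic is 0.
xs = [0, 0, 1, 1, 0, 1, 0, 1]
ys = [0, 1, 0, 1, 1, 0, 0, 1]
zs = [0, 0, 0, 0, 1, 1, 1, 1]
g = g_statistic(xs, ys, zs)  # -> 0.0 for this perfectly independent sample
```

A full test would reject independence when `g` exceeds the chi-square quantile at 1 - 0.02 for the appropriate degrees of freedom (e.g., via `scipy.stats.chi2.ppf`); with only 1000 samples per dataset, as the paper uses, such asymptotic tests are the standard choice.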