Local Causal Structure Learning in the Presence of Latent Variables

Authors: Feng Xie, Zheng Li, Peng Wu, Yan Zeng, Chunchen Liu, Zhi Geng

ICML 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Experimental results on both synthetic and real-world data validate the effectiveness and efficiency of our approach. To demonstrate the accuracy and efficiency of our algorithm, we compared the proposed MMB-by-MMB algorithm with global learning methods, such as PC-stable (Colombo & Maathuis, 2014), FCI (Spirtes et al., 2000), and RFCI (Colombo et al., 2012). We also compared with local learning methods, such as the MB-by-MB algorithm (Wang et al., 2014), the Causal Markov Blanket (CMB) algorithm (Gao & Ji, 2015), and the Gradient-based LCS (GraN-LCS) algorithm (Liang et al., 2023)."
Researcher Affiliation | Collaboration | Feng Xie (1), Zheng Li (1), Peng Wu (1), Yan Zeng (1), Chunchen Liu (2), Zhi Geng (1). (1) Department of Applied Statistics, Beijing Technology and Business University, Beijing, China; (2) Ling Yang Co., Ltd, Alibaba Group, Hangzhou, China.
Pseudocode | Yes | Algorithm 1: MMB-by-MMB; Algorithm 2: MMBalg (Pellet & Elisseeff, 2008)
Open Source Code | Yes | "Our source code is available from https://github.com/fengxie009/MMB-by-MMB."
Open Datasets | Yes | "We select four networks ranging from low to high dimensionality: MILDEW, ALARM, WIN95PTS, and ANDES, containing 35, 37, 76, and 223 nodes, respectively. The details of those networks can be found at https://www.bnlearn.com/bnrepository/." Also gene expression data from Wille et al. (2004).
Dataset Splits | No | The paper states that "each experiment was repeated 100 times with randomly generated data" and reports sample sizes, but it does not provide specific train/validation/test splits, their percentages, or cross-validation details needed for reproducibility.
Hardware Specification | Yes | "All experiments were performed with Intel 2.90GHz and 2.89GHz CPUs and 128 GB of memory."
Software Dependencies | No | The paper mentions software packages such as causallearn and pcalg, and uses the Total Conditioning (TC) algorithm, but it does not specify exact version numbers for these dependencies or other key components.
Experiment Setup | No | The paper describes the data-generation process and the selection of latent and target variables, but it does not report concrete experimental settings such as hyperparameter values or the thresholds and parameters used for the conditional independence tests in the algorithm.
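The Experiment Setup row flags a concrete reproducibility gap: constraint-based methods like MMB-by-MMB depend on a conditional independence test and its significance level, neither of which is pinned down. As an illustration of the kind of detail that would need to be fixed, here is a minimal sketch of the standard Fisher-Z conditional independence test for Gaussian data. The function name and the `alpha=0.05` default are illustrative assumptions, not values taken from the paper.

```python
import numpy as np
from scipy import stats

def fisher_z_test(data, x, y, s, alpha=0.05):
    """Fisher-Z conditional independence test for columns x, y given the set s.

    data: (n_samples, n_vars) array; x, y: column indices; s: list of indices.
    Returns (p_value, independent), where independent is True when the test
    fails to reject X independent of Y given S at level alpha (alpha=0.05 is
    an assumed default, not a value reported in the paper).
    """
    n = data.shape[0]
    cols = [x, y] + list(s)
    corr = np.corrcoef(data[:, cols], rowvar=False)
    prec = np.linalg.pinv(corr)
    # Partial correlation of x and y given s, read off the precision matrix.
    r = -prec[0, 1] / np.sqrt(prec[0, 0] * prec[1, 1])
    r = np.clip(r, -0.9999999, 0.9999999)
    # Fisher's z-transform and its approximate null distribution.
    z = 0.5 * np.log((1.0 + r) / (1.0 - r))
    stat = np.sqrt(n - len(s) - 3) * abs(z)
    p_value = 2.0 * (1.0 - stats.norm.cdf(stat))
    return p_value, p_value > alpha
```

On a linear chain X -> Y -> Z, such a test should report X and Z as dependent marginally but independent given Y; the significance level alpha directly controls how aggressively edges are removed, which is why its omission matters for reproducing the reported graphs.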