Specific and Shared Causal Relation Modeling and Mechanism-Based Clustering
Authors: Biwei Huang, Kun Zhang, Pengtao Xie, Mingming Gong, Eric P. Xing, Clark Glymour
NeurIPS 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental results on synthetic and real-world data demonstrate the efficacy of the proposed method. |
| Researcher Affiliation | Collaboration | Biwei Huang¹, Kun Zhang¹, Pengtao Xie², Mingming Gong³, Eric Xing²,⁴, Clark Glymour¹. ¹Department of Philosophy, Carnegie Mellon University, Pittsburgh, PA, USA. ²Petuum Inc., USA. ³School of Mathematics and Statistics, University of Melbourne, Melbourne, Australia. ⁴Department of Machine Learning, Carnegie Mellon University, Pittsburgh, PA, USA. |
| Pseudocode | No | The paper describes the SAEM algorithm and Gibbs sampling in text, but does not include structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide any specific statements about releasing source code or links to a code repository for the described methodology. |
| Open Datasets | Yes | We applied our methods to the fMRI hippocampus data [27], which contains signals from six separate brain regions... [27] Poldrack and Laumann. https://openfmri.org/dataset/ds000031/, 2015. We applied the proposed method to multivariate flow cytometry data... [31] K. Sachs, O. Perez, D. Pe'er, D. A. Lauffenburger, and G. P. Nolan. Causal protein signaling networks derived from multiparameter single-cell data. In Science, volume 308, pages 523–529, 2005. |
| Dataset Splits | No | The paper describes synthetic data generation parameters (e.g., number of groups, samples per individual, number of individuals) and characteristics of real-world datasets, but it does not specify explicit training, validation, or test dataset splits (e.g., percentages or sample counts) for reproducibility. |
| Hardware Specification | No | The paper mentions performance scalability: 'currently we can handle 10 variables with 200 subjects within 1 hour,' but does not provide specific hardware details such as GPU/CPU models, processors, or memory specifications used for experiments. |
| Software Dependencies | No | The paper mentions various algorithms and methods (e.g., SAEM algorithm, Gibbs sampling, K-means, LiNGAM, MC, IB, Wilcoxon signed rank test) but does not provide specific software names with version numbers for reproducibility. |
| Experiment Setup | Yes | The parameters were set in the following way: In the 2-group case (q = 2), when the group index k = 1... $\mu_{k,ij} \sim U(0.8, 1)$... when k = 2... $\mu_{k,ij} \sim U(0, 0.2)$... $\sigma^2_{k,ij} \sim U(0.01, 0.1)$, $\omega^2_{k,ij,p} \sim U(0.01, 0.1)$, each entry of $\mu^{E}_{k,k} \sim U(-0.6, -0.4) \cup U(0.4, 0.6)$, each entry of $\Sigma^{E}_{k,k} \sim U(0.2, 0.5)$, $\pi_k \sim U(0.3, 0.6)$, $\pi^{E}_{k,k} \sim U(0.3, 0.6)$... In our method, we initialized the parameters in the following way: we first estimated the correlation matrix for each individual and clustered the estimated correlation matrices with K-means clustering, and then we used the estimated centroids of each group as the initial value of $\mu_{k,ij}$. Other parameters were initialized randomly. M = 30. $\hat{G}^s_{ij} = 1$ if $\lvert \hat{b}^s_{ij} \rvert > 0.1$, and $\hat{G}^s_{ij} = 0$ otherwise. |
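
The initialization and thresholding steps quoted in the Experiment Setup row can be sketched roughly as follows. This is a minimal illustration, not the authors' code (no code was released): the function names `correlation_features`, `kmeans`, `init_mu`, and `threshold_graph` are hypothetical, the K-means routine is a plain NumPy implementation of Lloyd's algorithm standing in for whatever K-means the authors used, and the input is assumed to be one `(n_samples, n_vars)` array per individual.

```python
import numpy as np

def correlation_features(data):
    """Flatten each individual's correlation matrix into one feature row.

    data: list of arrays, each of shape (n_samples, n_vars) (assumed layout).
    """
    return np.stack([np.corrcoef(x, rowvar=False).ravel() for x in data])

def assign(features, centroids):
    """Index of the nearest centroid (squared Euclidean distance) per row."""
    dists = ((features[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return dists.argmin(axis=1)

def kmeans(features, q, n_iter=100, seed=0):
    """Plain Lloyd's algorithm; a stand-in for any K-means implementation."""
    rng = np.random.default_rng(seed)
    # Start from q distinct individuals chosen at random.
    centroids = features[rng.choice(len(features), size=q, replace=False)]
    for _ in range(n_iter):
        labels = assign(features, centroids)
        # Keep the old centroid if a cluster happens to be empty.
        updated = np.stack([
            features[labels == k].mean(axis=0) if np.any(labels == k)
            else centroids[k]
            for k in range(q)
        ])
        if np.allclose(updated, centroids):
            break
        centroids = updated
    return assign(features, centroids), centroids

def init_mu(data, q, seed=0):
    """Cluster correlation matrices; return labels and per-group centroids.

    The centroids, reshaped to (q, n_vars, n_vars), play the role of the
    initial values of mu_{k,ij} described in the setup.
    """
    n_vars = data[0].shape[1]
    labels, centroids = kmeans(correlation_features(data), q, seed=seed)
    return labels, centroids.reshape(q, n_vars, n_vars)

def threshold_graph(b_hat, tol=0.1):
    """G_ij = 1 if |b_ij| > tol, else 0 (tol = 0.1 in the quoted setup)."""
    return (np.abs(b_hat) > tol).astype(int)
```

A call such as `labels, mu0 = init_mu(data, q=2)` would give a group label per individual and a `(2, n_vars, n_vars)` array of starting values; `threshold_graph` is then applied to each individual's estimated coefficient matrix. The remaining parameters, which the setup says were initialized randomly, are left out of this sketch.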