Learning Modular Structures from Network Data and Node Variables
Authors: Elham Azizi, Edoardo Airoldi, James Galagan
ICML 2014
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We demonstrate the method accuracy in predicting modular structures from synthetic data and capability to learn regulatory modules in the Mycobacterium tuberculosis gene regulatory network. |
| Researcher Affiliation | Academia | Elham Azizi ELHAM@BU.EDU Bioinformatics Program, Boston University, Boston, MA 02215 USA; Edoardo M. Airoldi AIROLDI@FAS.HARVARD.EDU Department of Statistics, Harvard University, Cambridge, MA 02138 USA; James E. Galagan JGALAG@BU.EDU Departments of Biomedical Engineering and Microbiology, Boston University, Boston, MA 02215 USA |
| Pseudocode | Yes | Algorithm 1 RJMCMC for sampling parameters |
| Open Source Code | No | The paper does not provide an explicit statement or link indicating that the source code for the described methodology is publicly available. |
| Open Datasets | Yes | We used interaction data identified with ChIP-Seq of 50 MTB transcription factors and expression data for different induction levels of the same factors in 87 experiments, from a recent study by (Galagan et al., 2013). |
| Dataset Splits | No | The paper discusses synthetic data generation and application to real-world data, but it does not explicitly provide details on training, validation, and test dataset splits (e.g., percentages or counts) needed for reproduction. |
| Hardware Specification | Yes | It takes an average of 36 ± 8 seconds to generate 100 samples for N = 200, C = 50, R = 10 on an i5 3.30GHz Intel(R). |
| Software Dependencies | No | We used Matlab-MPI for this implementation. The software is named, but no specific version numbers are provided. |
| Experiment Setup | Yes | The inference procedure was run for 20,000 samples. Exponential prior distributions were used for number of parents assigned to each module, to avoid over-fitting. [...] module assignments were initialized by k-means clustering of variables. [...] We performed 100,000 iterations on the combination of the two datasets. [...] We set the maximum number of modules to 10 and constrained the candidate pool of regulators to the 50 ChIPped regulators only. |
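
The experiment setup mentions that module assignments were initialized by k-means clustering of variables, with at most 10 modules. Since no source code is released, the following is only a minimal sketch of what such an initialization could look like, using plain Lloyd's k-means in NumPy. The function name `kmeans_init_modules`, its parameters, the distance metric, and the toy data matrix are all assumptions for illustration and are not taken from the paper.

```python
import numpy as np

def kmeans_init_modules(X, max_modules=10, n_iter=50, seed=0):
    """Assign each variable (row of X) to one of at most `max_modules` modules.

    Hypothetical helper: plain Lloyd's k-means with squared Euclidean
    distance, not the authors' implementation.
    """
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    k = min(max_modules, n)
    # Start from k distinct variables chosen at random as centroids.
    centroids = X[rng.choice(n, size=k, replace=False)].astype(float)
    labels = np.zeros(n, dtype=int)
    for it in range(n_iter):
        # Squared Euclidean distance from every variable to every centroid.
        dists = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        new_labels = dists.argmin(axis=1)
        # Stop once the assignments no longer change.
        if it > 0 and np.array_equal(new_labels, labels):
            break
        labels = new_labels
        for j in range(k):
            members = X[labels == j]
            if len(members) > 0:  # keep the old centroid if a cluster empties
                centroids[j] = members.mean(axis=0)
    return labels

# Toy data (assumed shape): 200 variables measured in 50 conditions.
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 50))
labels = kmeans_init_modules(X, max_modules=10)
```

The returned `labels` array would then serve as the starting module assignment for a sampler; how the paper's RJMCMC procedure consumes that initialization is not specified in the summary above.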