reproducibilityindex.ai

Learning Modular Structures from Network Data and Node Variables

Authors: Elham Azizi, Edoardo Airoldi, James Galagan

ICML 2014

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	We demonstrate the method accuracy in predicting modular structures from synthetic data and capability to learn regulatory modules in the Mycobacterium tuberculosis gene regulatory network.
Researcher Affiliation	Academia	Elham Azizi ELHAM@BU.EDU Bioinformatics Program, Boston University, Boston, MA 02215 USA; Edoardo M. Airoldi AIROLDI@FAS.HARVARD.EDU Department of Statistics, Harvard University, Cambridge, MA 02138 USA; James E. Galagan JGALAG@BU.EDU Departments of Biomedical Engineering and Microbiology, Boston University, Boston, MA 02215 USA
Pseudocode	Yes	Algorithm 1 RJMCMC for sampling parameters
Open Source Code	No	The paper does not provide an explicit statement or link indicating that the source code for the described methodology is publicly available.
Open Datasets	Yes	We used interaction data identified with ChIP-Seq of 50 MTB transcription factors and expression data for different induction levels of the same factors in 87 experiments, from a recent study by (Galagan et al., 2013).
Dataset Splits	No	The paper discusses synthetic data generation and application to real-world data, but it does not explicitly provide details on training, validation, and test dataset splits (e.g., percentages or counts) needed for reproduction.
Hardware Specification	Yes	It takes an average of 36 ± 8 seconds to generate 100 samples for N = 200, C = 50, R = 10 on an i5 3.30GHz Intel(R).
Software Dependencies	No	We used Matlab-MPI for this implementation. The software is named, but no specific version numbers are provided.
Experiment Setup	Yes	The inference procedure was run for 20,000 samples. Exponential prior distributions were used for number of parents assigned to each module, to avoid over-fitting. [...] module assignments were initialized by k-means clustering of variables. [...] We performed 100,000 iterations on the combination of the two datasets. [...] We set the maximum number of modules to 10 and constrained the candidate pool of regulators to the 50 ChIPped regulators only.
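
The Experiment Setup row mentions two ingredients that can be illustrated compactly: an exponential prior on the number of parents per module, and k-means initialization of module assignments. The following is a minimal Python sketch of both, not the authors' Matlab-MPI implementation. The function names, the `rate` value, the unnormalized form of the prior, and the toy data are all assumptions made for illustration; the paper's actual parameterization is not given in this record.

```python
import random


def log_exponential_prior(num_parents, rate=1.0):
    """Unnormalized log prior, proportional to exp(-rate * num_parents).

    Larger parent sets receive lower prior mass, which discourages
    over-fitting. The rate of 1.0 is an assumed placeholder value.
    """
    if num_parents < 0:
        raise ValueError("num_parents must be non-negative")
    return -rate * num_parents


def kmeans_init(profiles, k, iters=20, seed=0):
    """Return an initial module label in range(k) for each row of `profiles`.

    Plain Lloyd's k-means on lists of floats; hypothetical stand-in for
    the k-means initialization described in the setup.
    """
    rng = random.Random(seed)
    centers = [list(p) for p in rng.sample(profiles, k)]
    labels = [0] * len(profiles)
    for _ in range(iters):
        # Assignment step: nearest center by squared Euclidean distance.
        for i, p in enumerate(profiles):
            labels[i] = min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])),
            )
        # Update step: move each center to the mean of its members.
        for c in range(k):
            members = [profiles[i] for i in range(len(profiles)) if labels[i] == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Toy usage: four variables with two-dimensional profiles, two modules.
toy_profiles = [[0.0, 0.0], [0.1, 0.0], [10.0, 10.0], [10.1, 10.0]]
initial_modules = kmeans_init(toy_profiles, k=2)
```

In a sampler, a term like `log_exponential_prior` would be added to the log-likelihood when scoring a proposed parent set, and the labels from `kmeans_init` would serve only as the starting state, with `k` capped at the maximum number of modules (10 in the reported setup).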