On Sparse Gaussian Chain Graph Models

Authors: Calvin McCarter, Seyoung Kim

NeurIPS 2014

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We demonstrate our approach on simulated and genomic datasets. In the simulation study, we considered two scenarios for the true models: CGGM-based and linear-regression-based Gaussian chain graph models. We evaluated performance in terms of graph-structure recovery and prediction accuracy in both supervised and semi-supervised settings. We applied the two types of three-layer chain graph models to single-nucleotide-polymorphism (SNP), gene-expression, and phenotype data from the pancreatic islets study for diabetic mice [18].
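As a rough illustration of the CGGM-based scenario described above, one layer of a Gaussian chain graph conditions a set of variables y on the previous layer x via p(y|x) ∝ exp(-½ yᵀΛy − xᵀΘy), which implies y|x ~ N(−Λ⁻¹Θᵀx, Λ⁻¹). The sketch below samples from such a layer; the toy sizes, dense parameter matrices, and random seed are assumptions for brevity (the paper uses sparse parameters and much larger dimensions).

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_cggm_layer(X, Theta, Lam, rng):
    """Sample Y | X from one CGGM layer:
    p(y|x) ∝ exp(-0.5 y^T Lam y - x^T Theta y),
    i.e. y | x ~ N(-Lam^{-1} Theta^T x, Lam^{-1})."""
    Lam_inv = np.linalg.inv(Lam)
    mean = -X @ Theta @ Lam_inv  # (n, K); uses symmetry of Lam_inv
    noise = rng.multivariate_normal(
        np.zeros(Lam.shape[0]), Lam_inv, size=X.shape[0])
    return mean + noise

# Toy sizes; the paper uses J=500, K=100, L=50 with sparse Theta/Lam.
J, K, n = 6, 4, 600
X = rng.normal(size=(n, J))
Theta = rng.normal(scale=0.1, size=(J, K))  # dense stand-in for a sparse matrix
Lam = 2.0 * np.eye(K)                       # must be positive definite
Y = sample_cggm_layer(X, Theta, Lam, rng)
```

Stacking another such layer on top of Y (to produce z) gives the three-layer chain graph used in the paper's experiments.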
Researcher Affiliation | Academia | Calvin McCarter, Machine Learning Department, Carnegie Mellon University (calvinm@cmu.edu); Seyoung Kim, Lane Center for Computational Biology, Carnegie Mellon University (sssykim@cs.cmu.edu)
Pseudocode | No | No pseudocode or algorithm blocks explicitly labeled as such were found in the paper.
Open Source Code | No | The paper does not provide any statement or link indicating that source code for its methodology is publicly available.
Open Datasets | Yes | We applied the two types of three-layer chain graph models to single-nucleotide-polymorphism (SNP), gene-expression, and phenotype data from the pancreatic islets study for diabetic mice [18].
Dataset Splits | Yes | Of the total 506 samples, we used 406 as the training set, of which 100 were held out as a validation set to select regularization parameters, and used the remaining 100 samples as the test set to evaluate prediction accuracy.
Hardware Specification | No | The paper does not provide any specific details about the hardware used to run the experiments.
Software Dependencies | No | The paper mentions using 'the optimization methods in [20] for CGGM-based models and the MRCE procedure [16] for linear-regression-based models', but does not name specific software packages with version numbers.
Experiment Setup | Yes | To simulate data, we assumed problem sizes of J=500, K=100, and L=50 for x, y, and z, respectively, and generated samples from known true models. Each dataset consisted of 600 samples, of which 400 and 200 were used as training and test sets. To select the regularization parameters, we estimated a model on 300 training samples, evaluated prediction error on the other 100 samples in the training set, and selected the values with the lowest prediction error.
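The split-and-select protocol quoted above (600 samples → 400 train / 200 test, with 300/100 of the training set used for fitting and validation) can be sketched as follows. This is a minimal illustration only: a small ridge regression and a hypothetical regularization grid stand in for the paper's sparse chain-graph estimators, which are not reproduced here.

```python
import numpy as np

rng = np.random.default_rng(42)

# Toy data standing in for one simulated dataset (the paper uses
# J=500 predictors and multivariate outputs).
n, p = 600, 10
X = rng.normal(size=(n, p))
w_true = rng.normal(size=p)
y = X @ w_true + rng.normal(scale=0.5, size=n)

# 400 training / 200 test; 300 of the training samples for fitting,
# 100 held out to pick the regularization parameter.
idx = rng.permutation(n)
train, test = idx[:400], idx[400:]
fit_rows, val_rows = train[:300], train[300:]

def ridge_fit(X, y, lam):
    """Closed-form ridge estimate; stand-in for the sparse estimators."""
    return np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ y)

def mse(w, X, y):
    return float(np.mean((X @ w - y) ** 2))

grid = [0.01, 0.1, 1.0, 10.0]  # hypothetical grid; the paper does not list one
best_lam = min(
    grid,
    key=lambda lam: mse(ridge_fit(X[fit_rows], y[fit_rows], lam),
                        X[val_rows], y[val_rows]))
test_err = mse(ridge_fit(X[train], y[train], best_lam), X[test], y[test])
```

The same hold-out pattern applies per dataset; only the estimator and the error metric change between the CGGM-based and linear-regression-based models.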