On Sparse Gaussian Chain Graph Models
Authors: Calvin McCarter, Seyoung Kim
NeurIPS 2014
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We demonstrate our approach on simulated and genomic datasets. In simulation study, we considered two scenarios for true models, CGGM-based and linear-regression-based Gaussian chain graph models. We evaluated the performance in terms of graph structure recovery and prediction accuracy in both supervised and semi-supervised settings. We applied the two types of three-layer chain graph models to single-nucleotide-polymorphism (SNP), gene-expression, and phenotype data from the pancreatic islets study for diabetic mice [18]. |
| Researcher Affiliation | Academia | Calvin McCarter, Machine Learning Department, Carnegie Mellon University, calvinm@cmu.edu; Seyoung Kim, Lane Center for Computational Biology, Carnegie Mellon University, sssykim@cs.cmu.edu |
| Pseudocode | No | No pseudocode or algorithm blocks explicitly labeled as such were found in the paper. |
| Open Source Code | No | The paper does not provide any statement or link indicating that the source code for their methodology is publicly available. |
| Open Datasets | Yes | We applied the two types of three-layer chain graph models to single-nucleotide-polymorphism (SNP), gene-expression, and phenotype data from the pancreatic islets study for diabetic mice [18]. |
| Dataset Splits | Yes | Of the total 506 samples, we used 406 as training set, of which 100 were held out as a validation set to select regularization parameters, and used the remaining 100 samples as test set to evaluate prediction accuracies. |
| Hardware Specification | No | The paper does not provide any specific details about the hardware used for running the experiments. |
| Software Dependencies | No | The paper mentions using 'the optimization methods in [20] for CGGM-based models and the MRCE procedure [16] for linear-regression-based models', but does not provide specific software names with version numbers. |
| Experiment Setup | Yes | In order to simulate data, we assumed the problem size of J=500, K=100, and L=50 for x, y, and z, respectively, and generated samples from known true models. Each dataset consisted of 600 samples, of which 400 and 200 samples were used as training and test sets. To select the regularization parameters, we estimated a model using 300 samples, evaluated prediction errors on the other 100 samples in the training set, and selected the values with the lowest prediction errors. |
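
The split sizes quoted in the Dataset Splits and Experiment Setup rows can be written out as a minimal Python sketch. This is illustrative only, not the authors' code: the helper names `split_indices` and `select_by_validation`, the random shuffling, and the seed are assumptions that do not appear in the paper. The fit-set size of 306 for the mouse data is derived from the quoted numbers (406 training samples minus the 100 held out for validation).

```python
import random


def split_indices(n_samples, sizes, seed=0):
    """Shuffle range(n_samples) and cut it into consecutive parts of the given sizes.

    Hypothetical helper: the paper does not state how samples were assigned
    to each subset, so random shuffling is an assumption.
    """
    if sum(sizes) != n_samples:
        raise ValueError("sizes must sum to n_samples")
    order = list(range(n_samples))
    random.Random(seed).shuffle(order)
    parts, start = [], 0
    for size in sizes:
        parts.append(order[start:start + size])
        start += size
    return parts


def select_by_validation(candidates, validation_error):
    """Return the candidate regularization value with the lowest validation error.

    `validation_error` is a caller-supplied function (hypothetical here) that
    fits a model with the given value and returns its held-out prediction error.
    """
    return min(candidates, key=validation_error)


# Simulation study: 600 samples -> 300 for fitting, 100 for validation, 200 for testing.
sim_fit, sim_val, sim_test = split_indices(600, [300, 100, 200])

# Mouse data: 506 samples -> 306 for fitting, 100 for validation, 100 for testing.
mouse_fit, mouse_val, mouse_test = split_indices(506, [306, 100, 100])
```

In both settings the validation samples are carved out of the training set, so the test samples play no part in choosing the regularization parameters.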