Biclustering Gene Expressions Using Factor Graphs and the Max-Sum Algorithm
Authors: Matteo Denitto, Alessandro Farinelli, Manuele Bicego
IJCAI 2015
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | The proposed approach has been evaluated using two sets of synthetic datasets and two real gene expression matrices. The empirical evaluation showed that our approach significantly outperforms previous state-of-the-art methods on both synthetic datasets and real data, demonstrating its practical significance. |
| Researcher Affiliation | Academia | Matteo Denitto, Alessandro Farinelli, Manuele Bicego Computer Science Department, University of Verona Verona, Italy |
| Pseudocode | No | The paper describes the mathematical formulations for message updates within the Max-Sum algorithm but does not present any structured pseudocode or an algorithm block labeled as such. |
| Open Source Code | No | The paper states: 'The code can be downloaded from http://sist.shanghaitech.edu.cn/faculty/tukw/sdm11code.zip' but this refers to the code for the 'EB' method, a baseline compared in the paper, not the proposed biclustering approach. |
| Open Datasets | Yes | The algorithm has been tested on two real gene expression datasets: the Yeast dataset [Gasch et al., 2000] and the Breast tumor data [Oghabian et al., 2014]. |
| Dataset Splits | No | The paper describes generating synthetic data and processing real gene expression matrices by applying the algorithm to submatrices, but it does not specify explicit train/validation/test dataset splits (e.g., 70% training, 15% validation, 15% test) for reproduction purposes. |
| Hardware Specification | No | The paper discusses computational capabilities and scaling issues of various methods but does not provide any specific hardware details such as CPU, GPU models, or memory specifications used for conducting the experiments. |
| Software Dependencies | No | The paper does not provide specific software dependencies (e.g., programming languages, libraries, or solvers) with version numbers for the implementation of the proposed method. |
| Experiment Setup | Yes | Different convergence criteria can be used; here we stop the procedure when the variables configuration does not change for 100 consecutive iterations. 5 different noise levels (i.e. percentages) were used, ranging from 0 (no noise) to 0.2 (high noise). The proposed approach has been applied several times on randomly extracted submatrices (involving 10 rows and all the columns, with no overlap). For the Yeast dataset, only the first 100 biggest biclusters (with a maximum overlap degree of 25 percent) have been considered as part of the solution. For the Breast tumor dataset, only the first 40 largest biclusters have been considered as part of the solution. |
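
The stopping rule quoted under Experiment Setup (halt once the variable configuration has not changed for 100 consecutive iterations) can be sketched as below. This is only an illustration under stated assumptions, not the authors' code: the names `run_until_stable`, `step`, `patience`, and `max_iters` are hypothetical, `step` stands in for one round of Max-Sum message updates followed by decoding the current assignment, and the `max_iters` safety cap is an addition that the paper does not mention.

```python
from typing import Callable, Hashable, Optional, Tuple


def run_until_stable(
    step: Callable[[], Hashable],
    patience: int = 100,
    max_iters: int = 100_000,
) -> Tuple[Optional[Hashable], int]:
    """Call `step` until its returned configuration stays the same
    for `patience` consecutive iterations.

    `step` is a hypothetical callback: it performs one iteration of the
    underlying procedure and returns the current variable configuration
    as a comparable value (for example, a tuple of assignments).
    `max_iters` is an assumed safety cap, not part of the quoted rule.

    Returns the last configuration seen and the number of iterations run.
    """
    previous: Optional[Hashable] = None
    unchanged = 0  # consecutive iterations with an identical configuration

    for iteration in range(1, max_iters + 1):
        current = step()
        if current == previous:
            unchanged += 1
        else:
            unchanged = 0
            previous = current
        if unchanged >= patience:
            return current, iteration

    # Cap reached without observing `patience` unchanged iterations.
    return previous, max_iters
```

Keeping the criterion in a small wrapper separates it from the message-update logic, so a different rule (for instance, a threshold on message change) could be swapped in without touching the update step.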