Biclustering Gene Expressions Using Factor Graphs and the Max-Sum Algorithm

Authors: Matteo Denitto, Alessandro Farinelli, Manuele Bicego

IJCAI 2015

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | The proposed approach has been evaluated using two sets of synthetic datasets and two real gene expression matrices. The empirical evaluation showed that our approach significantly outperforms previous state-of-the-art methods on both synthetic datasets and real data, demonstrating its practical significance.
Researcher Affiliation | Academia | Matteo Denitto, Alessandro Farinelli, Manuele Bicego; Computer Science Department, University of Verona, Verona, Italy
Pseudocode | No | The paper describes the mathematical formulations for message updates within the Max-Sum algorithm but does not present any structured pseudocode or an algorithm block labeled as such.
Open Source Code | No | The paper states: 'The code can be downloaded from http://sist.shanghaitech.edu.cn/faculty/tukw/sdm11code.zip', but this refers to the code for the 'EB' method, a baseline compared in the paper, not to the proposed biclustering approach.
Open Datasets | Yes | The algorithm has been tested on two real gene expression datasets: the Yeast dataset [Gasch et al., 2000] and the Breast tumor data [Oghabian et al., 2014].
Dataset Splits | No | The paper describes generating synthetic data and processing real gene expression matrices by applying the algorithm to submatrices, but it does not specify explicit train/validation/test dataset splits (e.g., 70% training, 15% validation, 15% test) for reproduction purposes.
Hardware Specification | No | The paper discusses computational capabilities and scaling issues of various methods but does not provide any specific hardware details, such as CPU or GPU models or memory specifications, used for conducting the experiments.
Software Dependencies | No | The paper does not provide specific software dependencies (e.g., programming languages, libraries, or solvers) with version numbers for the implementation of the proposed method.
Experiment Setup | Yes | Different convergence criteria can be used; here we stop the procedure when the variables configuration does not change for 100 consecutive iterations. 5 different noise levels (i.e. percentages) were used, ranging from 0 (no noise) to 0.2 (high noise). The proposed approach has been applied several times on randomly extracted submatrices (involving 10 rows and all the columns, with no overlap). For the Yeast dataset, only the first 100 biggest biclusters (with a maximum overlap degree of 25 percent) have been considered as part of the solution. For the Breast tumor dataset, only the first 40 largest biclusters have been considered as part of the solution.
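
The experiment setup above can be read as three small procedures: a stopping rule, a row partition, and a bicluster filter. The Python sketch below is one possible reading, not the authors' implementation. All function names, the `max_iters` safety cap, the handling of leftover rows, and in particular the definition of "overlap degree" (here: the fraction of a candidate's cells shared with an already kept bicluster) are assumptions made for illustration; the paper summary does not specify them. The `step` callable stands in for one Max-Sum iteration and is hypothetical.

```python
import random


def run_until_stable(step, config, patience=100, max_iters=100_000):
    """Iterate `step` until the configuration is unchanged for `patience`
    consecutive iterations (the paper uses 100). `max_iters` is an assumed
    safety cap, not something stated in the source."""
    unchanged = 0
    for iteration in range(1, max_iters + 1):
        new_config = step(config)
        unchanged = unchanged + 1 if new_config == config else 0
        config = new_config
        if unchanged >= patience:
            return config, iteration
    return config, max_iters


def disjoint_row_blocks(n_rows, block_size=10, seed=0):
    """Shuffle row indices and cut them into non-overlapping blocks.
    Each block is meant to be paired with all columns. Assumption: leftover
    rows form a final, shorter block."""
    rows = list(range(n_rows))
    random.Random(seed).shuffle(rows)
    return [rows[i:i + block_size] for i in range(0, n_rows, block_size)]


def select_largest(biclusters, k, max_overlap=0.25):
    """Greedily keep up to k of the largest biclusters.
    Each bicluster is a (row_set, col_set) pair. Assumed overlap measure:
    shared cells divided by the candidate's own cell count."""
    def cells(b):
        return len(b[0]) * len(b[1])

    def overlap(cand, other):
        shared = len(cand[0] & other[0]) * len(cand[1] & other[1])
        return shared / cells(cand)

    kept = []
    for cand in sorted(biclusters, key=cells, reverse=True):
        if len(kept) >= k:
            break
        if cells(cand) == 0:
            continue  # skip empty biclusters to avoid division by zero
        if all(overlap(cand, other) <= max_overlap for other in kept):
            kept.append(cand)
    return kept
```

Under these assumptions, the Yeast setting would correspond to `select_largest(found, k=100, max_overlap=0.25)` and the Breast tumor setting to `k=40`; the summary gives no overlap threshold for the latter, so any value passed there would be a further assumption.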