Biclustering Using Message Passing

Authors: Luke O'Connor, Soheil Feizi

NeurIPS 2014

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | In simulations, we find that our method outperforms two of the best existing biclustering algorithms, ISA and LAS, when the planted clusters overlap. Applied to three gene expression datasets, our method finds coregulated gene clusters that have high quality in terms of cluster size and density.
Researcher Affiliation | Academia | Luke O'Connor, Bioinformatics and Integrative Genomics, Harvard University, Cambridge, MA 02138, loconnor@g.harvard.edu; Soheil Feizi, Electrical Engineering and Computer Science, Massachusetts Institute of Technology, Cambridge, MA 02139, sfeizi@mit.edu
Pseudocode | Yes | Pseudocode for BCMP is presented in Supplementary Note 10.
Open Source Code | No | The paper does not provide a direct link to the source code or an explicit statement of its release for the described methodology.
Open Datasets | No | The paper mentions 'three gene expression datasets' and 'simulated bipartite graphs' but does not provide concrete access information (e.g., names, citations, links, or specific file names) for any public dataset used.
Dataset Splits | No | The paper does not provide specific details regarding training, validation, or test dataset splits (e.g., percentages, sample counts, or references to predefined splits).
Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory amounts, or cloud instance types) used for running its experiments.
Software Dependencies | No | The paper does not provide specific software dependencies with version numbers (e.g., library names with versions) needed to replicate the experiment.
Experiment Setup | Yes | Here are the parameters that we used to run each method: BCMP method with underlying parameters given: We computed the input matrix of shifted log-likelihood ratios following the discussion in Section 2.2. The number of biclusters K was given. We initialized the cluster-shape parameters r_k at 1 and updated them as discussed in Supplementary Note 3.1. In the case of Bernoulli noise, following Proposition 2 and Remark 1, we set ℓ_ij = e_ij and δ2 = 1/4. In the case of Gaussian noise, we chose a threshold δ to maximize the unthresholded likelihood (see Supplementary Note 3.2). BCMP EM method: Instead of taking the underlying model parameters as given, we estimated them using the procedure described in Section 2.4 and Supplementary Note 6. ISA method: We used the same threshold ranges for both rows and columns, attempting to find best-performing threshold values for each noise level. These values were mostly around 1.5 for both noise types and for all three dataset types. We found positive biclusters, and used 20 reinitializations. Out of these 20 runs, we selected the best-performing run. LAS method: There were no parameters to set. Since K was given, we selected the first K biclusters discovered by LAS, which marginally increased its performance.
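
The experiment setup above can be illustrated with a minimal sketch. It is not the authors' code, and the paper's simulation details are not given on this page. It assumes that a simulated bipartite graph is a 0/1 matrix with overlapping planted biclusters, that Bernoulli noise means independent edge probabilities inside and outside the planted blocks, and that the "20 reinitializations, keep the best run" protocol is a plain best-of-n loop. All function names, the probabilities `p_in` and `p_out`, and the shift value `delta` are hypothetical choices for illustration.

```python
import random


def planted_bernoulli_matrix(n_rows, n_cols, biclusters, p_in, p_out, seed=0):
    """Return a 0/1 matrix e with e[i][j] ~ Bernoulli(p_in) when (i, j) lies in
    at least one planted bicluster and Bernoulli(p_out) otherwise.

    `biclusters` is a list of (row_indices, col_indices) pairs; pairs may overlap.
    """
    rng = random.Random(seed)
    inside = set()
    for rows, cols in biclusters:
        inside.update((i, j) for i in rows for j in cols)
    return [
        [1 if rng.random() < (p_in if (i, j) in inside else p_out) else 0
         for j in range(n_cols)]
        for i in range(n_rows)
    ]


def shift_matrix(e, delta):
    """Subtract a constant from every entry (an assumed form of the input shift)."""
    return [[x - delta for x in row] for row in e]


def best_of_restarts(run_once, score, n_restarts=20, seed=0):
    """Call run_once(seed + k) for k in range(n_restarts); keep the top-scoring result."""
    best, best_score = None, float("-inf")
    for k in range(n_restarts):
        result = run_once(seed + k)
        s = score(result)
        if s > best_score:
            best, best_score = result, s
    return best, best_score


# Two planted biclusters that share rows 3-4 and columns 3-4.
planted = [(range(0, 5), range(0, 5)), (range(3, 8), range(3, 8))]

# Noise-free case (p_in=1, p_out=0) so the planted structure is visible directly.
e = planted_bernoulli_matrix(10, 10, planted, p_in=1.0, p_out=0.0)

# delta=0.5 is an illustrative value, not a setting taken from the paper.
shifted = shift_matrix(e, delta=0.5)
```

A biclustering routine can be plugged in as `run_once`, with any agreement measure between recovered and planted clusters as `score`; neither is specified here because the page does not state which performance measure the paper used.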