Efficient and Effective Optimal Transport-Based Biclustering
Authors: Chakib Fettal, Lazhar Labiod, Mohamed Nadif
NeurIPS 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We ran experiments using term-document matrices. The benefit of using biclustering on this kind of data is that the resulting biclusters contain both documents and the words that characterize them, which is helpful in interpreting the clustering of the documents. Additional experiments over synthetic and gene expression data are available in the appendix. |
| Researcher Affiliation | Academia | Chakib Fettal Centre Borelli UMR 9010 Université Paris Cité Informatique Caisse des Dépôts et Consignations chakib.fettal@etu.u-paris.fr Lazhar Labiod Centre Borelli UMR 9010 Université Paris Cité lazhar.labiod@u-paris.fr Mohamed Nadif Centre Borelli UMR 9010 Université Paris Cité mohamed.nadif@u-paris.fr |
| Pseudocode | Yes | Algorithm 1: BCOT. Input: B bi-adjacency matrix; w and v row and column weights; r and c row and column exemplar distributions. Output: π_r, π_c row and column partitions. W ← W_init; while not converged do Z ← arg OT(L(B) W, w, r); W ← arg OT(L(B)^⊤ Z, v, c); end; generate π_r, π_c from Z and W. |
| Open Source Code | Yes | For reproducibility, we publicly release our code at https://github.com/chakib401/BCOT. |
| Open Datasets | Yes | We evaluate BCOT in relation to six benchmark document-term datasets: ACM, DBLP, PubMed, Wiki, Ohscal, and 20 Newsgroups. Their characteristics are shown in Table 2. ACM [13], DBLP [13], PubMed [32] and Wiki [37] are attributed networks from which we use only the node-level features that correspond to term-document matrices. We also selected the Ohscal collection [22] and 20 Newsgroups [26] as large-scale document-term matrices to serve as computational efficiency benchmarks. |
| Dataset Splits | No | The paper uses benchmark datasets but does not explicitly state the training, validation, or test split percentages or sample counts used for reproducing the experiments. |
| Hardware Specification | Yes | All the experiments were performed on the same machine with an Intel(R) Xeon(R) CPU and 12GB RAM. |
| Software Dependencies | No | For OT solvers we made use of the POT package [15]. |
| Experiment Setup | Yes | In our experiments we define the loss function as L(B) = c B, where c is selected from {1, k, d, n}. For BCOTλ, the regularization parameter λ is selected from {10^-4, 10^-3, 10^-2, 10^-1, 1, 10}. The best hyper-parameters are those that minimize the number of empty clusters. In the case of ties, we select according to the value of the Davies-Bouldin index of the partition [7]. Random restarts are not used for any of the algorithms, including k-means. |
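
The alternating scheme quoted in the Pseudocode row can be illustrated with a minimal NumPy sketch. It is not the authors' implementation (the paper reports using the POT package for its OT solvers). Several things here are assumptions made only for illustration: the hand-written entropic solver `sinkhorn_plan` stands in for the "arg OT" step, the loss is passed in as `-B` so that minimizing transport cost rewards large entries, and the names `bcot_sketch`, `reg`, `max_iter`, and `tol` are hypothetical.

```python
import numpy as np


def sinkhorn_plan(cost, a, b, reg=0.05, n_iter=200):
    """Entropic OT plan between histograms a and b (stand-in for 'arg OT')."""
    # Shifting the cost by a constant leaves the optimal plan unchanged
    # and keeps every kernel entry in (0, 1].
    K = np.exp(-(cost - cost.min()) / reg)
    u = np.ones_like(a)
    for _ in range(n_iter):
        v = b / (K.T @ u)
        u = a / (K @ v)
    return u[:, None] * K * v[None, :]


def bcot_sketch(L, w, v, r, c, W_init, reg=0.05, max_iter=50, tol=1e-9):
    """Alternate the two OT problems, then read partitions off Z and W."""
    W, Z = W_init, None
    for _ in range(max_iter):
        Z_new = sinkhorn_plan(L @ W, w, r, reg)    # rows -> row exemplars
        W = sinkhorn_plan(L.T @ Z_new, v, c, reg)  # columns -> column exemplars
        converged = Z is not None and np.abs(Z_new - Z).max() < tol
        Z = Z_new
        if converged:
            break
    # Each row/column is assigned to the exemplar receiving most of its mass.
    return Z.argmax(axis=1), W.argmax(axis=1)


# Toy bi-adjacency matrix with two planted biclusters (illustrative data).
B = np.zeros((6, 8))
B[:3, :4] = 1.0
B[3:, 4:] = 1.0
n, d = B.shape
k = 2

rng = np.random.default_rng(0)
W0 = rng.random((d, k))
W0 /= W0.sum()  # random starting coupling; assumed initialization

row_labels, col_labels = bcot_sketch(
    -B,                                  # assumed sign convention for the loss
    np.full(n, 1 / n), np.full(d, 1 / d),
    np.full(k, 1 / k), np.full(k, 1 / k),
    W0,
)
print(row_labels, col_labels)
```

Uniform weights and exemplar distributions are used only to keep the example small; which cluster index each planted block receives depends on the random initialization.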
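
The selection rule described in the Experiment Setup row (fewest empty clusters, ties broken by the Davies-Bouldin index) can be sketched as a lexicographic minimum. This is a hedged illustration only: the `Candidate` container, the helper names, and every number in the example are hypothetical, and the sketch assumes the usual convention that a lower Davies-Bouldin index is better. Computing the index itself is left out; precomputed values are passed in.

```python
from typing import NamedTuple


class Candidate(NamedTuple):
    params: dict      # hyper-parameters that produced this partition
    labels: list      # cluster label per item, assumed to lie in range(k)
    db_index: float   # precomputed Davies-Bouldin index of the partition


def count_empty_clusters(labels, k):
    """Number of the k clusters that received no item."""
    return k - len(set(labels) & set(range(k)))


def select_best(candidates, k):
    """Fewest empty clusters first; ties go to the lower Davies-Bouldin index."""
    return min(
        candidates,
        key=lambda cand: (count_empty_clusters(cand.labels, k), cand.db_index),
    )


# Made-up candidates for k = 3; none of these values come from the paper.
candidates = [
    Candidate({"reg": 1e-2}, [0, 0, 1, 1, 1], 0.4),  # cluster 2 is empty
    Candidate({"reg": 1e-1}, [0, 1, 2, 2, 1], 0.7),  # no empty cluster
    Candidate({"reg": 1e-3}, [0, 1, 2, 0, 1], 0.5),  # no empty cluster, lower index
]
best = select_best(candidates, k=3)
print(best.params)
```

The first candidate has the lowest index but loses because it leaves a cluster empty, which is the priority order the quoted setup describes.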