CoDiNMF: Co-Clustering of Directed Graphs via NMF
Authors: Woosang Lim, Rundong Du, Haesun Park
AAAI 2018
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We run experiments on the US patents and Blog Catalog data sets whose ground truth is known, and show that CoDiNMF improves clustering results compared to other co-clustering methods in terms of average F1 score, Rand index, and adjusted Rand index (ARI). |
| Researcher Affiliation | Academia | Woosang Lim, School of Computational Science and Engineering, Georgia Institute of Technology, Atlanta, GA 30332, USA (woosang.lim@cc.gatech.edu); Rundong Du, School of Computational Science and Engineering, Georgia Institute of Technology, Atlanta, GA 30332, USA (rdu@gatech.edu); Haesun Park, School of Computational Science and Engineering, Georgia Institute of Technology, Atlanta, GA 30332, USA (hpark@cc.gatech.edu) |
| Pseudocode | No | The paper describes the objective function and how it can be solved by NMF with relaxation, but no structured pseudocode or algorithm blocks are provided. |
| Open Source Code | No | No explicit statement or link providing access to the open-source code for the described methodology was found. |
| Open Datasets | Yes | For the US patent data set, we use the cooperative patent classification (CPC) info (as illustrated in Fig 3) to generate the ground truth clusters. The subset of the US patent data set which we used is displayed in Table 1; it contains citation information among patents and connections between patents and words, but no word-word relationship information. The Blog Catalog data set also provides ground truth information, and it contains entity-entity, entity-feature, and feature-feature relations. To compare co-clusters with ground truth clusters, we treat each co-cluster as a cluster of patents by ignoring the terms in the co-cluster. |
| Dataset Splits | No | No specific details on training, validation, or test dataset splits (e.g., percentages, sample counts, cross-validation setup) are provided. |
| Hardware Specification | No | No specific hardware details (e.g., GPU/CPU models, memory) used for running the experiments are provided in the paper. |
| Software Dependencies | No | No specific software dependencies with version numbers (e.g., Python 3.8, PyTorch 1.9) are provided in the paper. |
| Experiment Setup | Yes | For CoDiNMF, we set the c_i parameters for the balance parameters α and β with c_i ∈ {1, 1.3, 1.7, 2, 3}. For CoNMF, we set S and Z as zero matrices, since it is a degenerate version of CoDiNMF. The accuracies of the four methods are compared in terms of F1 scores, Rand index, and adjusted Rand index in Table 2, Table 3, and Table 4, respectively. |
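Since the paper reports clustering accuracy via the adjusted Rand index but releases no code, a minimal sketch of how ARI can be computed from scratch may be useful for reproduction attempts. The function name and the toy label lists below are illustrative, not from the paper; in practice one would pass the ground-truth CPC cluster labels and the predicted co-cluster assignments (with terms ignored, as the paper describes).

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_true, labels_pred):
    """Adjusted Rand index via pair counting on the contingency table.

    ARI = (Index - ExpectedIndex) / (MaxIndex - ExpectedIndex),
    where Index sums C(n_ij, 2) over contingency cells, and the
    row/column marginals give the expected and maximum values.
    """
    n = len(labels_true)
    # Contingency cell counts n_ij and the row/column marginals a_i, b_j.
    contingency = Counter(zip(labels_true, labels_pred))
    a = Counter(labels_true)
    b = Counter(labels_pred)

    sum_ij = sum(comb(c, 2) for c in contingency.values())
    sum_a = sum(comb(c, 2) for c in a.values())
    sum_b = sum(comb(c, 2) for c in b.values())

    expected = sum_a * sum_b / comb(n, 2)
    max_index = (sum_a + sum_b) / 2
    return (sum_ij - expected) / (max_index - expected)


# Illustrative labels only: a perfect (relabeled) clustering scores 1.0.
print(adjusted_rand_index([0, 0, 1, 1], [1, 1, 0, 0]))  # 1.0
```

This matches `sklearn.metrics.adjusted_rand_score`, which could be used directly instead; the from-scratch version just makes the pair-counting definition explicit.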