Learning A Structured Optimal Bipartite Graph for Co-Clustering
Authors: Feiping Nie, Xiaoqian Wang, Cheng Deng, Heng Huang
NeurIPS 2017
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive empirical results are presented to verify the effectiveness and robustness of our model. We conduct several experiments to evaluate the effectiveness and robustness of our model. On both synthetic and benchmark datasets we gain equivalent or even better clustering results than other related methods. (Section 5, Experimental Results) |
| Researcher Affiliation | Academia | 1 School of Computer Science, Center for OPTIMAL, Northwestern Polytechnical University, China; 2 Department of Electrical and Computer Engineering, University of Pittsburgh, USA; 3 School of Electronic Engineering, Xidian University, China. feipingnie@gmail.com, xqwang1991@gmail.com, chdeng@mail.xidian.edu.cn, heng.huang@pitt.edu |
| Pseudocode | Yes | Algorithm 1: Algorithm to solve the problem (15). Algorithm 2: Algorithm to solve the problem (23). |
| Open Source Code | No | The paper does not provide any links or explicit statements about the availability of open-source code for the described methodology. |
| Open Datasets | Yes | The Reuters21578 dataset is processed and downloaded from http://www.cad.zju.edu.cn/home/dengcai/Data/TextData.html. LUNG dataset [1]. Prostate-MS dataset [15]. Prostate Cancer PSA410 dataset [10]. |
| Dataset Splits | No | The paper does not provide specific details about training, validation, or test dataset splits. It describes how synthetic data was generated and the properties of benchmark datasets, but not how they were partitioned for model training and evaluation. |
| Hardware Specification | No | The paper does not provide any specific details about the hardware (e.g., CPU, GPU models, memory) used to run the experiments. |
| Software Dependencies | No | The paper mentions using specific methods (e.g., K-means, NCut, NMF, BSGP, ONMTF) but does not provide specific software names with version numbers for libraries or environments used for implementation. |
| Experiment Setup | Yes | For methods requiring a similarity graph as input, i.e., NCut and NMF, we adopted the self-tuning Gaussian method [19] to construct the graph, where the number of neighbors was set to 5 and the σ value was self-tuned. When running K-means we used 100 random initializations for all four of these methods and recorded both the average performance over the 100 runs and the best run with respect to the K-means objective value. In our method, to accelerate the algorithmic procedure, we determined the parameter λ in a heuristic way: we first set λ to an initial guess; in each iteration we then counted the number of zero eigenvalues of the Laplacian L_S; if this count was larger than k we divided λ by 2, if it was smaller we multiplied λ by 2, and otherwise we stopped the iteration (a sketch of this heuristic appears after the table). The number of clusters was set to the ground-truth number. Before clustering, each dataset was preprocessed so that every feature lies in the range [0, 1], and the ℓ2-norm of each feature was then normalized to 1. |
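
For illustration, the snippet below is a minimal Python sketch of the preprocessing and the heuristic λ update described in the Experiment Setup row, assuming the symmetric Laplacian L_S of the learned graph is available at each iteration. The function names (`preprocess`, `update_lambda`) and the zero-eigenvalue tolerance are placeholders chosen for this sketch, not the authors' code.

```python
import numpy as np

def preprocess(X):
    """Scale each feature (column) of X to [0, 1], then normalize each feature to unit l2-norm,
    as described in the experiment setup."""
    X = np.asarray(X, dtype=float)
    col_min = X.min(axis=0)
    col_range = X.max(axis=0) - col_min
    col_range[col_range == 0] = 1.0          # guard against constant features
    X = (X - col_min) / col_range             # min-max scaling to [0, 1]
    norms = np.linalg.norm(X, axis=0)
    norms[norms == 0] = 1.0
    return X / norms                           # each column now has unit l2-norm

def update_lambda(lam, laplacian_LS, k, tol=1e-10):
    """One step of the heuristic: count the (near-)zero eigenvalues of L_S.
    More than k zero eigenvalues -> halve lambda; fewer than k -> double it;
    exactly k -> signal that the iteration can stop.
    The tolerance `tol` is an assumption of this sketch."""
    eigvals = np.linalg.eigvalsh(laplacian_LS)
    num_zero = int(np.sum(eigvals < tol))
    if num_zero > k:
        return lam / 2.0, False
    if num_zero < k:
        return lam * 2.0, False
    return lam, True                           # exactly k zero eigenvalues: converged

# Example usage (toy Laplacian with two connected components, k = 2):
# lam, done = update_lambda(lam=1.0, laplacian_LS=L, k=2)
```

The check on the number of zero eigenvalues follows the standard fact that a graph Laplacian has exactly k zero eigenvalues when the graph has k connected components, which is why the heuristic can use this count to decide whether λ is too large or too small.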