CGD: Multi-View Clustering via Cross-View Graph Diffusion

Authors: Chang Tang, Xinwang Liu, Xinzhong Zhu, En Zhu, Zhigang Luo, Lizhe Wang, Wen Gao
Pages: 5924-5931

AAAI 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments on several benchmark datasets are conducted to demonstrate the effectiveness of the proposed method in terms of seven clustering evaluation metrics.
Researcher Affiliation | Academia | (1) School of Computer Science, China University of Geosciences, Wuhan 430074, China; (2) School of Computer Science, National University of Defense Technology, Changsha 410073, China; (3) College of Mathematics, Physics and Information Engineering, Zhejiang Normal University, Jinhua 321004, China; (4) School of Electronics Engineering and Computer Science, Peking University, Beijing 100871, China
Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks.
Open Source Code | Yes | https://github.com/ChangTang/CGD
Open Datasets | Yes | Six real-world benchmark datasets are used to evaluate the performance of our CGD. They are as follows: BBCSport consists of documents...; MSRCV1 consists of 210 images...; 100leaves consists of 1600 samples...; 3sources consists of 169 news...; Scene-15 consists of 15 scene categories...; Reuters consists of 18758 samples... Footnotes: [1] https://archive.ics.uci.edu/ml/datasets/Onehundred+plant+species+leaves+data+set [2] http://mlg.ucd.ie/datasets/3sources.html
Dataset Splits | No | The paper does not specify exact split percentages, absolute sample counts for each split, or reference predefined splits with citations for training, validation, and test sets. It only mentions using benchmark datasets for evaluation.
Hardware Specification | Yes | We gratefully acknowledge the support of NVIDIA Corporation with the donation of the Titan V GPU used for computation acceleration for this research.
Software Dependencies | No | The paper does not list specific software components with their version numbers required for replication.
Experiment Setup | Yes | Without loss of generality, we use the commonly used Gaussian kernel function with Euclidean distance to generate initial view-specific graphs. σ in the Gaussian kernel function is set to 0.5. Seven widely used metrics are used to evaluate the performance: clustering accuracy (ACC), normalized mutual information (NMI), purity, precision, recall, F-score, and adjusted rand index (ARI). For these metrics, the larger value indicates the better clustering performance. We run each algorithm 10 times with the optimal parameters and report the means and standard deviations of the performance measures.
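
The Experiment Setup entry names three steps that are easy to misread without a concrete form: building a Gaussian-kernel graph from Euclidean distances, scoring a clustering (purity is one of the seven metrics), and reporting the mean and standard deviation over repeated runs. The following is a minimal sketch of those steps, not the authors' code. Several details are assumptions that the source does not state: the kernel is taken as exp(-d^2 / (2σ^2)), the graph is dense with no neighbor pruning, and the reported deviation is taken to be the sample standard deviation. The function names are hypothetical.

```python
import math
import statistics
from collections import Counter


def gaussian_kernel_graph(points, sigma=0.5):
    """Dense affinity matrix for one view.

    Assumed form: W[i][j] = exp(-||x_i - x_j||^2 / (2 * sigma^2)).
    The source only states that sigma is set to 0.5; the exact
    scaling inside the exponent is an assumption.
    """
    n = len(points)
    graph = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            # Squared Euclidean distance between samples i and j.
            sq_dist = sum((a - b) ** 2 for a, b in zip(points[i], points[j]))
            graph[i][j] = math.exp(-sq_dist / (2.0 * sigma ** 2))
    return graph


def purity(true_labels, pred_labels):
    """Fraction of samples that belong to the majority class of their cluster."""
    clusters = {}
    for truth, cluster in zip(true_labels, pred_labels):
        clusters.setdefault(cluster, Counter())[truth] += 1
    majority_total = sum(max(counts.values()) for counts in clusters.values())
    return majority_total / len(true_labels)


def summarize_runs(scores):
    """Mean and sample standard deviation of a metric over repeated runs.

    Whether the paper uses the sample or population deviation is not stated.
    """
    return statistics.mean(scores), statistics.stdev(scores)


if __name__ == "__main__":
    view = [[0.0, 0.0], [0.5, 0.0], [3.0, 3.0]]  # toy data, not from the paper
    graph = gaussian_kernel_graph(view, sigma=0.5)
    print(graph[0][1], graph[0][2])
    print(purity([0, 0, 1, 1], [1, 1, 0, 1]))
    print(summarize_runs([0.80, 0.90, 0.85]))
```

In a multi-view setting, one such graph would be built per view before any cross-view processing. With σ = 0.5 the affinities decay quickly with distance, so features would likely need to be normalized first; the source does not say which normalization, if any, is used.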