Spectral Clustering in Heterogeneous Information Networks

Authors: Xiang Li, Ben Kao, Zhaochun Ren, Dawei Yin (pp. 4221-4228)

AAAI 2019

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We conduct extensive experiments comparing SClump with other state-of-the-art clustering algorithms on HINs. Our results show that SClump outperforms the competitors over a range of datasets w.r.t. different clustering quality measures.
Researcher Affiliation | Collaboration | Xiang Li¹, Ben Kao², Zhaochun Ren¹, Dawei Yin¹; ¹Data Science Lab, JD.com, China; ²Department of Computer Science, The University of Hong Kong, Hong Kong; {lixiang81, renzhaochun}@jd.com, kao@cs.hku.hk, yindawei@acm.org
Pseudocode | Yes | Algorithm 1 SClump
Open Source Code | No | The paper does not provide a link to its source code or state that it is open-source.
Open Datasets | Yes | We use three datasets Freebase, DBLP and Yelp in the experiments. Freebase is a knowledge base that models entities and their relationships as a graph. DBLP is a bibliographic network of scientific publications. Yelp is a business referral service, whose data includes various information of businesses such as customer reviews.
Dataset Splits | No | The paper describes how clustering quality is evaluated (NMI, purity, RI) but does not specify training, validation, or test splits. Such splits are typical for supervised learning but less common for unsupervised clustering evaluated on the full dataset.
Hardware Specification | No | The paper does not provide any details regarding the hardware specifications (e.g., CPU, GPU, memory) used for conducting the experiments.
Software Dependencies | No | The paper mentions 'k-means' as a post-processing step and refers to other methods, but it does not specify any software dependencies or their version numbers required to reproduce the work (e.g., specific programming languages, libraries, or frameworks with versions).
Experiment Setup | Yes | For SClump, we set α = 0.1, β = 10 for Yelp-R and α = 0.5, β = 10 for other clustering tasks. Moreover, γ is set according to (Nie et al. 2016).
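
The Dataset Splits entry names three clustering quality measures: NMI, purity, and RI (Rand index). The following is a minimal sketch of how these measures are commonly computed from ground-truth labels and predicted cluster assignments, using only the Python standard library. The function names are hypothetical, and the NMI normalization (geometric mean of the two entropies) is an assumption; this summary does not state which normalization the paper uses.

```python
from collections import Counter
from itertools import combinations
from math import log, sqrt


def purity(labels_true, labels_pred):
    """Fraction of objects that belong to the majority true class of their cluster."""
    pair_counts = Counter(zip(labels_pred, labels_true))
    best = {}
    for (cluster, _), count in pair_counts.items():
        best[cluster] = max(best.get(cluster, 0), count)
    return sum(best.values()) / len(labels_true)


def rand_index(labels_true, labels_pred):
    """Fraction of object pairs on which the two labelings agree (O(n^2), small inputs only)."""
    pairs = list(combinations(range(len(labels_true)), 2))
    if not pairs:
        return 1.0  # fewer than two objects: nothing to disagree on
    agree = sum(
        (labels_true[i] == labels_true[j]) == (labels_pred[i] == labels_pred[j])
        for i, j in pairs
    )
    return agree / len(pairs)


def entropy(labels):
    """Shannon entropy (natural log) of a labeling."""
    n = len(labels)
    return -sum((c / n) * log(c / n) for c in Counter(labels).values())


def nmi(labels_true, labels_pred):
    """Mutual information divided by the geometric mean of the entropies (assumed normalization)."""
    n = len(labels_true)
    count_true = Counter(labels_true)
    count_pred = Counter(labels_pred)
    joint = Counter(zip(labels_true, labels_pred))
    mi = sum(
        (c / n) * log(c * n / (count_true[t] * count_pred[p]))
        for (t, p), c in joint.items()
    )
    h_true, h_pred = entropy(labels_true), entropy(labels_pred)
    if h_true == 0 and h_pred == 0:
        return 1.0  # both labelings put everything in one group
    if h_true == 0 or h_pred == 0:
        return 0.0  # one labeling carries no information
    return mi / sqrt(h_true * h_pred)


# Toy example with six objects and two ground-truth classes.
truth = [0, 0, 0, 1, 1, 1]
predicted = [1, 1, 0, 0, 0, 0]
print(purity(truth, predicted), rand_index(truth, predicted), nmi(truth, predicted))
```

All three measures are computed over every object at once, which is why no train/validation/test split is needed for this kind of evaluation. Purity and NMI are invariant to how cluster ids are numbered, so the predicted ids need not match the ground-truth class ids.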