Text Classification with Heterogeneous Information Network Kernels

Authors: Chenguang Wang, Yangqiu Song, Haoran Li, Ming Zhang, Jiawei Han

AAAI 2016

| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Using Freebase, a well-known world knowledge base, to construct HIN for texts, our experiments on two benchmark datasets show that the indefinite HIN-kernel based on weighted meta-paths outperforms the state-of-the-art methods and other HIN-kernels. Also, from the Experiments section: In this section, we show empirically how to incorporate external knowledge into the HIN-kernels. |
| Researcher Affiliation | Academia | Chenguang Wang (a), Yangqiu Song (b), Haoran Li (a), Ming Zhang (a), Jiawei Han (c). (a) School of EECS, Peking University; (b) Lane Department of Computer Science and Electrical Engineering, West Virginia University; (c) Department of Computer Science, University of Illinois at Urbana-Champaign. |
| Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide any statement or link indicating that the source code for the described methodology is publicly available. |
| Open Datasets | Yes | Datasets: We derive four classification problems from the two benchmark datasets as follows. 20Newsgroups (20NG): In the spirit of (Basu, Bilenko, and Mooney 2004), two datasets are created by selecting three categories from 20NG. RCV1: We derive two subsets of RCV1 (Lewis et al. 2004) from the top category GCAT (Government/Social). |
| Dataset Splits | Yes | Each data split has three binary classification tasks. For each task, the corresponding data is randomly divided into 80% training and 20% testing data. We apply 5-fold cross validation on the training set to determine the optimal hyperparameter C for SVM and SVMHIN. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory) used for running its experiments. |
| Software Dependencies | No | The paper mentions Word2Vec, Naive Bayes, and SVM, but does not specify version numbers for any software libraries or dependencies used for implementation. |
| Experiment Setup | Yes | The parameters C and ρ for indefinite SVM are tuned based on the 5-fold cross validation, and Nesterov's efficient smooth optimization method (Nesterov 2005) is terminated if the value of the objective function changes less than 10^-6, following (Ying, Campbell, and Girolami 2009). |
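
The split protocol reported under Dataset Splits (a random 80%/20% train/test division, then 5-fold cross validation on the training portion to choose C) can be sketched as follows. This is a minimal illustration in plain Python, not the authors' code, which the paper does not release. The function names, the fixed seed, and the `score_fn` callback (which stands in for training a classifier with a given C and scoring it on held-out data) are all assumptions made for the example.

```python
import random


def train_test_split_indices(n, test_fraction=0.2, seed=0):
    """Shuffle indices 0..n-1 and split them into (train, test) lists."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)  # seed is an assumption, for repeatability
    n_test = int(round(n * test_fraction))
    return indices[n_test:], indices[:n_test]


def k_fold_indices(train_indices, k=5):
    """Yield (fit, validation) index lists for k roughly equal folds."""
    folds = [train_indices[i::k] for i in range(k)]
    for i in range(k):
        validation = folds[i]
        fit = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield fit, validation


def select_c(train_indices, candidate_cs, score_fn, k=5):
    """Return the candidate C with the highest mean validation score.

    score_fn(c, fit, validation) is a hypothetical callback: it should train
    a classifier with hyperparameter c on `fit` and return its score on
    `validation`.
    """
    best_c, best_score = None, float("-inf")
    for c in candidate_cs:
        scores = [score_fn(c, fit, val)
                  for fit, val in k_fold_indices(train_indices, k)]
        mean_score = sum(scores) / len(scores)
        if mean_score > best_score:
            best_c, best_score = c, mean_score
    return best_c
```

The test indices are never passed to `select_c`, so the held-out 20% plays no part in choosing C.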
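
The stopping rule reported under Experiment Setup (terminate once the objective value changes by less than 10^-6 between iterations) is illustrated below. Plain gradient descent on a toy quadratic is used as a stand-in optimizer; it is not Nesterov's smooth optimization method, and the objective, step size, and function names are assumptions chosen only to show the termination check.

```python
def minimize_until_stable(objective, gradient, x0, step=0.1, tol=1e-6,
                          max_iter=10000):
    """Gradient descent that stops when the objective changes by less than tol.

    Returns (x, objective_value, iterations_used).
    """
    x = list(x0)
    prev = objective(x)
    for iteration in range(1, max_iter + 1):
        g = gradient(x)
        x = [xi - step * gi for xi, gi in zip(x, g)]
        current = objective(x)
        # Termination check: change in objective value below the tolerance.
        if abs(prev - current) < tol:
            return x, current, iteration
        prev = current
    return x, prev, max_iter


# Toy objective (an assumption for illustration): minimum at x = (1, -3).
def toy_objective(x):
    return (x[0] - 1.0) ** 2 + 2.0 * (x[1] + 3.0) ** 2


def toy_gradient(x):
    return [2.0 * (x[0] - 1.0), 4.0 * (x[1] + 3.0)]


solution, value, iterations = minimize_until_stable(
    toy_objective, toy_gradient, [0.0, 0.0])
```

A tolerance on the change in objective value bounds how much further progress each step makes, not the distance to the optimum, so the returned point is only approximately optimal.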