Text Classification with Heterogeneous Information Network Kernels
Authors: Chenguang Wang, Yangqiu Song, Haoran Li, Ming Zhang, Jiawei Han
AAAI 2016
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Using Freebase, a well-known world knowledge base, to construct HIN for texts, our experiments on two benchmark datasets show that the indefinite HIN-kernel based on weighted meta-paths outperforms the state-of-the-art methods and other HIN-kernels. [...] Experiments: In this section, we show empirically how to incorporate external knowledge into the HIN-kernels. |
| Researcher Affiliation | Academia | Chenguang Wang (a), Yangqiu Song (b), Haoran Li (a), Ming Zhang (a), Jiawei Han (c). (a) School of EECS, Peking University; (b) Lane Department of Computer Science and Electrical Engineering, West Virginia University; (c) Department of Computer Science, University of Illinois at Urbana-Champaign |
| Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide any statement or link indicating that the source code for the described methodology is publicly available. |
| Open Datasets | Yes | Datasets: We derive four classification problems from the two benchmark datasets as follows. 20Newsgroups (20NG): In the spirit of (Basu, Bilenko, and Mooney 2004), two datasets are created by selecting three categories from 20NG. RCV1: We derive two subsets of RCV1 (Lewis et al. 2004) from the top category GCAT (Government/Social). |
| Dataset Splits | Yes | Each data split has three binary classification tasks. For each task, the corresponding data is randomly divided into 80% training and 20% testing data. We apply 5-fold cross validation on the training set to determine the optimal hyperparameter C for SVM and SVMHIN. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory) used for running its experiments. |
| Software Dependencies | No | The paper mentions Word2Vec, Naive Bayes, and SVM, but does not specify version numbers for any software libraries or dependencies used for implementation. |
| Experiment Setup | Yes | The parameters C and ρ for indefinite SVM are tuned based on the 5-fold cross validation and the Nesterov's efficient smooth optimization method (Nesterov 2005) is terminated if the value of the object function changes less than $10^{-6}$ following (Ying, Campbell, and Girolami 2009). |
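
The split protocol quoted in the Dataset Splits and Experiment Setup rows (a random 80%/20% train/test division, then 5-fold cross validation on the training portion to choose C) can be sketched as below. This is a minimal illustration, not the authors' code: the function names, the fixed seed, the candidate grid, and the `score_fn` callback (assumed to train a classifier with a given C on the fit indices and return a validation score) are all hypothetical, since the paper releases no implementation.

```python
import random


def train_test_split_indices(n, test_fraction=0.2, seed=0):
    """Randomly split range(n) into (train, test) index lists."""
    rng = random.Random(seed)  # fixed seed is an assumption, for repeatability
    indices = list(range(n))
    rng.shuffle(indices)
    n_test = int(round(n * test_fraction))
    return indices[n_test:], indices[:n_test]


def k_fold_indices(train_indices, k=5):
    """Yield (fit, validation) index lists for k-fold cross validation."""
    folds = [train_indices[i::k] for i in range(k)]
    for i in range(k):
        validation = folds[i]
        fit = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield fit, validation


def select_c(train_indices, candidates, score_fn, k=5):
    """Return the candidate C with the highest mean validation score.

    score_fn(c, fit, validation) is a hypothetical callback that trains a
    model with hyperparameter c on `fit` and scores it on `validation`.
    """
    def mean_score(c):
        scores = [score_fn(c, fit, val)
                  for fit, val in k_fold_indices(train_indices, k)]
        return sum(scores) / len(scores)

    return max(candidates, key=mean_score)
```

In use, one would call `train_test_split_indices` once per binary task, pass the training indices and a grid of C values to `select_c`, and only then evaluate the chosen model on the held-out 20%. The same loop would extend to a joint grid over C and ρ for the indefinite SVM.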