Feature Hashing for Network Representation Learning

Authors: Qixiang Wang, Shanfeng Wang, Maoguo Gong, Yue Wu

IJCAI 2018

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Compared with existing state-of-the-art network representation learning approaches, node2hash shows competitive performance on multi-class node classification and link prediction tasks on three real-world networks from various domains.
Researcher Affiliation | Academia | Qixiang Wang (1), Shanfeng Wang (2), Maoguo Gong (1), Yue Wu (3). (1) Key Laboratory of Intelligent Perception and Image Understanding, Xidian University, Xi'an 710071, China; (2) School of Cyber Engineering, Xidian University, Xi'an 710071, China; (3) School of Computer Science and Technology, Xidian University, Xi'an 710071, China. omegawangqx@gmail.com, sfwang@xidian.edu.cn, gong@ieee.org, ywu@xidian.edu.cn
Pseudocode | Yes | The detailed pseudocode for building the proximity matrix is given in Algorithm 1. ... Algorithm 2: the pseudocode of Extract Proximity. ... The whole framework of the proposed node2hash algorithm is given in Algorithm 3.
Open Source Code | No | No explicit statement or link regarding the release of source code for the proposed method is found in the paper.
Open Datasets | Yes | Three networks are considered in the experiments. Citeseer [McCallum et al., 2000] is a citation network of 3,312 scientific publications classified into 6 classes, with 4,732 links among them. Cora [McCallum et al., 2000] is also a citation network, composed of 2,708 scientific publications from 7 classes and 5,429 links. Wiki [Sen et al., 2008] contains 2,405 web pages from 19 categories and 17,981 links between them.
Dataset Splits | No | The paper describes training data and testing data (hidden edges) for link prediction, and varying training ratios for node classification, but does not explicitly mention a validation set or a general validation strategy for training the proposed model. The 10-fold cross-validation it mentions is for hyperparameter tuning of the baseline node2vec, not for validating node2hash.
Hardware Specification | No | The paper does not specify any hardware details (e.g., CPU or GPU models, memory, or cloud instances) used for running the experiments.
Software Dependencies | No | The paper does not list specific software dependencies with version numbers (e.g., Python 3.x, PyTorch x.x, scikit-learn x.x).
Experiment Setup | Yes | The number of walks n is set to 10, the walk length l to 80, and the window size w to 10. ... We used the same settings as their paper (n = 10, l = 80, w = 10) and employed a grid search over the return and in-out hyperparameters p, q ∈ {0.25, 0.5, 1, 1.5, 2} with 10-fold cross-validation. ... Our proposed node2hash uses the same parameter setting as the naive feature (n = 10, l = 200, w = 50, T = 2). ... Note that the dimensionality d is set to 256 for the four approaches mentioned above, except the naive feature (d = |V|). ... We pay most attention to the impact of the hash set size T. Therefore, we fix the number of walks, walk length, and window size (n = 10, l = 200, w = 50) and test the effect of node2hash with different T.
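The last row names the paper's key hyperparameters: an embedding dimensionality d = 256 and a hash set of size T = 2. Since no source code is linked, the following is only a hypothetical sketch of the general feature-hashing idea the title refers to (the choice of hash functions, the salting scheme, and the sign trick are all assumptions, not the authors' exact method): each node's sampled context is mapped through T salted hash functions into a fixed d-dimensional vector.

```python
import hashlib

def hash_embed(context_nodes, d=256, T=2):
    """Illustrative hashing trick (hypothetical, not the paper's exact scheme):
    map a node's context set into a d-dimensional vector using T salted
    hash functions, so the dimensionality stays fixed regardless of |V|."""
    vec = [0.0] * d
    for node in context_nodes:
        for t in range(T):
            # Salt with t to simulate T independent hash functions.
            h = int(hashlib.md5(f"{t}:{node}".encode()).hexdigest(), 16)
            idx = h % d
            # A sign hash, as in standard feature hashing, keeps hash
            # collisions unbiased in expectation.
            sign = 1.0 if (h // d) % 2 == 0 else -1.0
            vec[idx] += sign
    return vec

emb = hash_embed(["n1", "n7", "n42"], d=256, T=2)
print(len(emb))  # 256
```

The appeal of this construction is that the output dimension d is decoupled from the vocabulary size, which is why the naive feature needs d = |V| while the hashed variant can use d = 256.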