Efficient Attributed Network Embedding via Recursive Randomized Hashing
Authors: Wei Wu, Bin Li, Ling Chen, Chengqi Zhang
IJCAI 2018
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our extensive experimental results show that the proposed algorithm, which does not need learning, runs significantly faster than the state-of-the-art learning-based network embedding methods while achieving competitive or even better performance in accuracy. |
| Researcher Affiliation | Academia | ¹ Centre for Artificial Intelligence, University of Technology Sydney, Australia; ² School of Computer Science, Fudan University, China. william.third.wu@gmail.com, libin@fudan.edu.cn, {ling.chen, chengqi.zhang}@uts.edu.au |
| Pseudocode | Yes | Algorithm 1 The NetHash Algorithm. Input: G = (V, E, f); number of embedding dimensions K; entropy of degrees of network S; depth of tree D ≥ 1; hash functions at the l-th level {π_k^(l)}, l = 0, ..., D, k = 1, ..., k_l. Output: G's embedding h. 1: for r = 1, ..., |V| do 2: Build a parent-pointer tree T for node r; 3: Initialize an empty auxiliary queue Q; 4: for v ∈ T do 5: l ← level of v in T; 6: merger ← f(v); // initial merger from attributes on v 7: while Q is not empty and v is the parent node of Q[0] in T do 8: merger ← merge(Q.pop().digest, merger); 9: end while 10: digest ← MinHash(merger, {π_k^(l)}_{k=1,...,k_l}); 11: Q.push({digest, v}); 12: end for 13: h(r) ← Q.pop().digest; 14: end for (A runnable sketch of this recursion follows the table.) |
| Open Source Code | No | No explicit statement about releasing the source code of the proposed NetHash algorithm, nor a link to a code repository, was found in the paper. |
| Open Datasets | Yes | Data sets: (1) Cora [Yang et al., 2015]: A citation network of machine learning papers. (2) Wikipedia [Yang et al., 2015]: A citation network of articles in Wikipedia. (3) Flickr [Li et al., 2015]: The network consists of users as nodes, following relationships as edges and interest tags of users as attributes. (4) BlogCatalog [Li et al., 2015]: The network consists of bloggers as nodes, following relationships as edges and keywords in blogs as attributes. (5) ACM [Tang et al., 2008]: The original data contains 2,381,688 ACM papers and 10,476,564 citation relationships. |
| Dataset Splits | Yes | We vary the training ratio (i.e., percentage of nodes as the training set) in {50%, 60%, 70%, 80%, 90%}, for each ratio of which we repeat the experiment 10 times and average the results. |
| Hardware Specification | Yes | All experiments are conducted on a node of a Linux cluster with 8 × 3.4 GHz Intel Xeon CPUs (64-bit) and 32 GB RAM. |
| Software Dependencies | No | The paper mentions using LIBSVM and LIBLINEAR, but does not provide specific version numbers for these software dependencies. |
| Experiment Setup | Yes | For all methods, we set the embedding dimension K = 200, as in TADW and CANE. ... NetHash has two exclusive parameters, tree depth D and decay rate λ. ... We set D = 1 for Wikipedia, Flickr and BlogCatalog, and D = 2 for Cora and ACM. ... Hence, we adopt the entropy of node degrees S as the decay rate. (A sketch of this entropy follows below.) |
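
Since no source code was released (see the Open Source Code row), the quoted pseudocode is the only specification. Below is a minimal Python sketch of the recursion in Algorithm 1: expand a tree of depth D rooted at each node, then sketch it bottom-up, MinHashing each node's attributes merged with its children's digests. The helper names, the tuple-seeded hash functions, and the geometric per-level width schedule k_l = max(1, round(K·λ^l)) are illustrative assumptions, not the authors' exact construction.

```python
def minhash(items, seeds):
    """One MinHash value per seed: the item minimizing a seeded hash.
    (Sketch only: Python's salted str hash is not stable across runs.)"""
    return [min(items, key=lambda x: hash((s, x))) for s in seeds]

def nethash_node(adj, attrs, root, depth, level_seeds):
    """Embed one node in the spirit of Algorithm 1: build the rooted tree
    level by level, then hash bottom-up, folding child digests into parents."""
    levels = [[(root, None)]]                     # (node, parent index in level above)
    for _ in range(depth):                        # expand neighbors one level at a time
        levels.append([(u, i) for i, (v, _) in enumerate(levels[-1])
                       for u in adj[v]])
    # Leaves: MinHash their own attributes with the deepest level's seeds.
    digests = [minhash(list(attrs[v]), level_seeds[depth]) for v, _ in levels[-1]]
    for l in range(depth - 1, -1, -1):            # fold digests upward
        mergers = [list(attrs[v]) for v, _ in levels[l]]
        for d, (_, parent) in zip(digests, levels[l + 1]):
            mergers[parent].extend(d)             # merge(child.digest, merger)
        digests = [minhash(m, level_seeds[l]) for m in mergers]
    return digests[0]                             # root digest = h(root)
```

A toy run on hypothetical data:

```python
# Hypothetical 4-node attributed graph.
adj   = {0: [1, 2], 1: [0, 3], 2: [0], 3: [1]}
attrs = {0: {"ml"}, 1: {"graph"}, 2: {"hash"}, 3: {"embed"}}
K, D, lam = 8, 2, 0.5                             # lam stands in for S in the paper
level_seeds = [[(l, k) for k in range(max(1, round(K * lam ** l)))]
               for l in range(D + 1)]
embedding = {v: nethash_node(adj, attrs, v, D, level_seeds) for v in adj}
```

Each call returns a K-item digest h(r) for the root; stacking these over all |V| nodes yields the network embedding. Note that this sketch lets the expansion walk back through a node's parent, a simplification the paper may handle differently.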
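
The Experiment Setup row sets the decay rate λ to the entropy of node degrees, S. Under the standard Shannon-entropy reading of that phrase (the helper name, the natural-log base, and the dict-of-adjacency input are my assumptions, not the paper's specification), S can be computed as:

```python
import math
from collections import Counter

def degree_entropy(adj):
    """Shannon entropy S of the empirical node-degree distribution,
    used as the decay rate lambda in the quoted setup."""
    counts = Counter(len(neighbors) for neighbors in adj.values())
    n = sum(counts.values())
    return -sum((c / n) * math.log(c / n) for c in counts.values())
```

For the toy graph above, degree_entropy(adj) ≈ 0.69: the degree distribution has two equally likely values (1 and 2), so S = ln 2.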