Efficient Attributed Network Embedding via Recursive Randomized Hashing

Authors: Wei Wu, Bin Li, Ling Chen, Chengqi Zhang

IJCAI 2018

Reproducibility assessment. Each variable below is listed with its result and the supporting LLM response.

Research Type: Experimental
LLM Response: Our extensive experimental results show that the proposed algorithm, which does not need learning, runs significantly faster than the state-of-the-art learning-based network embedding methods while achieving competitive or even better performance in accuracy.

Researcher Affiliation: Academia
LLM Response: (1) Centre for Artificial Intelligence, University of Technology Sydney, Australia; (2) School of Computer Science, Fudan University, China. Emails: william.third.wu@gmail.com, libin@fudan.edu.cn, {ling.chen, chengqi.zhang}@uts.edu.au

Pseudocode: Yes
LLM Response:
Algorithm 1: The NetHash Algorithm
Input: G = (V, E, f); number of embedding dimensions K; entropy of degrees of the network S; depth of tree D ≥ 1; hash functions at the l-th level {π_k^(l)} for l = 0, ..., D and k = 1, ..., K_l
Output: G's embedding h
 1: for r = 1, ..., |V| do
 2:     Build a parent-pointer tree T for node r;
 3:     Initialize an empty auxiliary queue Q;
 4:     for v ∈ T do
 5:         l ← level of v in T;
 6:         merger ← f(v);  // initial merger from the attributes on v
 7:         while Q is not empty and v is the parent node of Q[0] in T do
 8:             merger ← merge(Q.pop().digest, merger);
 9:         end while
10:         digest ← MinHash(merger, {π_k^(l)} for k = 1, ..., K_l);
11:         Q.push({digest, v});
12:     end for
13:     h(r) ← Q.pop().digest;
14: end for

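To make the recursion concrete, below is a minimal, self-contained Python sketch of the recursive min-hashing idea, not the authors' implementation: each node's rooted tree is sketched bottom-up, child digests are merged into the parent's attribute set, and the merger is min-hashed at each level. The helper names (minhash, make_perms, nethash_node) are hypothetical, explicit permutation tables stand in for the paper's hash functions {π_k^(l)}, and the per-level digest sizes K_l (which the paper ties to the degree entropy S) are fixed here for brevity.

import random

def minhash(attr_set, perms):
    # Min-wise sampling: each random permutation picks the attribute with
    # the smallest permuted rank, so a digest is itself a list of attributes
    # and can be merged back into a parent's attribute set.
    return [min(attr_set, key=lambda a: p[a]) for p in perms]

def make_perms(universe, k, seed=0):
    # k explicit random permutations over the attribute universe, standing
    # in for the paper's hash functions at one tree level.
    rng = random.Random(seed)
    perms = []
    for _ in range(k):
        ranks = list(range(len(universe)))
        rng.shuffle(ranks)
        perms.append(dict(zip(universe, ranks)))
    return perms

def nethash_node(graph, attrs, root, depth, perms_per_level):
    # Sketch the rooted tree of `root` bottom-up: children are sketched
    # first, their digests are merged with the parent's own attributes,
    # and the merger is min-hashed with that level's permutations.
    def sketch(node, level, parent):
        merger = set(attrs[node])                  # initial merger: f(v)
        if level < depth:
            for nb in graph[node]:
                if nb != parent:                   # parent-pointer tree
                    merger |= set(sketch(nb, level + 1, node))
        return minhash(merger, perms_per_level[level])
    return sketch(root, 0, None)

# Toy usage: a tiny star graph with set-valued node attributes.
graph = {0: [1, 2], 1: [0], 2: [0]}
attrs = {0: {"a", "b"}, 1: {"b", "c"}, 2: {"c", "d"}}
universe = sorted({a for s in attrs.values() for a in s})
perms_per_level = [make_perms(universe, 4, seed=l) for l in range(3)]
print(nethash_node(graph, attrs, root=0, depth=2, perms_per_level=perms_per_level))

The recursive formulation replaces the paper's explicit parent-pointer tree and auxiliary queue, but it should perform the same bottom-up merge-then-hash pass over the tree.
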
Open Source Code: No
LLM Response: No explicit statement about the release of the source code for the proposed NetHash algorithm, nor a link to a code repository, was found in the paper.

Open Datasets: Yes
LLM Response: Data sets: (1) Cora [Yang et al., 2015]: a citation network of machine learning papers. (2) Wikipedia [Yang et al., 2015]: a citation network of Wikipedia articles. (3) Flickr [Li et al., 2015]: users as nodes, following relationships as edges, and users' interest tags as attributes. (4) BlogCatalog [Li et al., 2015]: bloggers as nodes, following relationships as edges, and blog keywords as attributes. (5) ACM [Tang et al., 2008]: the original data contain 2,381,688 ACM papers and 10,476,564 citation relationships.

Dataset Splits: Yes
LLM Response: We vary the training ratio (i.e., the percentage of nodes used as the training set) over {50%, 60%, 70%, 80%, 90%}; for each ratio we repeat the experiment 10 times and average the results.

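As a rough, hypothetical illustration of this protocol (not the paper's code), the sketch below varies the training ratio over the same grid, repeats each split 10 times, and averages test accuracy; scikit-learn's LogisticRegression and synthetic features are assumed stand-ins for the classifiers (LIBLINEAR/LIBSVM) and embeddings the paper actually uses.

import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

def evaluate(X, y, ratios=(0.5, 0.6, 0.7, 0.8, 0.9), repeats=10, seed=0):
    # For each training ratio, draw `repeats` random splits and average
    # the resulting test accuracies, as described above.
    results = {}
    for ratio in ratios:
        scores = []
        for i in range(repeats):
            X_tr, X_te, y_tr, y_te = train_test_split(
                X, y, train_size=ratio, random_state=seed + i)
            clf = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
            scores.append(accuracy_score(y_te, clf.predict(X_te)))
        results[ratio] = float(np.mean(scores))
    return results

X, y = make_classification(n_samples=500, n_features=200, random_state=0)  # stand-in embeddings
print(evaluate(X, y))
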
Hardware Specification: Yes
LLM Response: All experiments are conducted on a Linux cluster node with 8 × 3.4 GHz Intel Xeon CPUs (64-bit) and 32 GB RAM.

Software Dependencies: No
LLM Response: The paper mentions using LIBSVM and LIBLINEAR, but does not provide version numbers for these software dependencies.

Experiment Setup: Yes
LLM Response: For all methods, we set the embedding dimension K = 200, as in TADW and CANE. ... NetHash has two exclusive parameters, the tree depth D and the decay rate λ. ... We set D = 1 for Wikipedia, Flickr and BlogCatalog, and D = 2 for Cora and ACM. ... Hence, we adopt the entropy of node degrees S as the decay rate.

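On the last point, the paper adopts the entropy of node degrees S as the decay rate λ. Below is a small sketch of one plausible reading, assuming S is the Shannon entropy of the empirical degree distribution (in nats); degree_entropy is a hypothetical helper name, and the paper's exact definition should be checked against its text.

import math
from collections import Counter

def degree_entropy(graph):
    # Shannon entropy of the empirical degree distribution: with p(d) the
    # fraction of nodes of degree d, S = -sum_d p(d) * log p(d).
    degrees = [len(neighbours) for neighbours in graph.values()]
    counts = Counter(degrees)
    n = len(degrees)
    return -sum((c / n) * math.log(c / n) for c in counts.values())

# Example: a star on four nodes has degrees [3, 1, 1, 1],
# giving S = -(1/4)log(1/4) - (3/4)log(3/4) ≈ 0.5623 nats.
star = {0: [1, 2, 3], 1: [0], 2: [0], 3: [0]}
print(round(degree_entropy(star), 4))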