Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
Efficient Attributed Network Embedding via Recursive Randomized Hashing
Authors: Wei Wu, Bin Li, Ling Chen, Chengqi Zhang
IJCAI 2018
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our extensive experimental results show that the proposed algorithm, which does not need learning, runs significantly faster than the state-of-the-art learning-based network embedding methods while achieving competitive or even better performance in accuracy. |
| Researcher Affiliation | Academia | (1) Centre for Artificial Intelligence, University of Technology Sydney, Australia; (2) School of Computer Science, Fudan University, China |
| Pseudocode | Yes | Algorithm 1: The NetHash Algorithm. Input: G = (V, E, f); number of embedding dimensions K; entropy of degrees of network S; depth of tree D ≥ 1; hash functions at the l-th level $\{\pi_k^{(l)}\}_{l=0,k=1}^{D,k_l}$. Output: G's embedding h. 1: for r = 1, ..., \|V\| do 2: Build a parent pointer tree T for node r; 3: Initialize an empty auxiliary queue Q; 4: for v ∈ T do 5: l ← level of v in T; 6: merger ← f(v); // initial merger from attributes on v 7: while Q is not empty and v is the parent node of Q[0] in T do 8: merger ← merge(Q.pop().digest, merger); 9: end while 10: digest ← MinHash(merger, $\{\pi_k^{(l)}\}_{k=1}^{k_l}$); 11: Q.push({digest, v}); 12: end for 13: h(r) ← Q.pop().digest; 14: end for |
| Open Source Code | No | No explicit statement about the release of the source code for the proposed NetHash algorithm, nor a link to a code repository, was found in the paper. |
| Open Datasets | Yes | Data sets: (1) Cora [Yang et al., 2015]: A citation network of machine learning papers. (2) Wikipedia [Yang et al., 2015]: A citation network of articles in Wikipedia. (3) Flickr [Li et al., 2015]: The network consists of users as nodes, following relationship as edges and interest tags of users as attributes. (4) BlogCatalog [Li et al., 2015]: The network consists of bloggers as nodes, following relationship as edges and keywords in blog as attributes. (5) ACM [Tang et al., 2008]: The original data contains 2,381,688 ACM papers and 10,476,564 citation relationship. |
| Dataset Splits | Yes | We vary the training ratio (i.e., percentage of nodes as the training set) in {50%, 60%, 70%, 80%, 90%}, for each ratio of which we repeat the experiment 10 times and average the results. |
| Hardware Specification | Yes | All experiments are conducted on a node of Linux Cluster with 8 × 3.4 GHz Intel Xeon CPU (64 bit) and 32GB RAM. |
| Software Dependencies | No | The paper mentions using LIBSVM and LIBLINEAR, but does not provide specific version numbers for these software dependencies. |
| Experiment Setup | Yes | For all methods, we set the embedding dimension K = 200, as in TADW and CANE. ... NetHash has two exclusive parameters, tree depth D and decay rate λ. ... We set D = 1 for Wikipedia, Flickr and BlogCatalog, and D = 2 for Cora and ACM. ... Hence, we adopt the entropy of node degrees S as the decay rate. |
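
Since the paper releases no source code, the idea behind the quoted Algorithm 1 can be illustrated with a minimal Python sketch. It is not the authors' implementation, and it rests on several assumptions: the tree is traversed recursively instead of with the auxiliary queue, every neighbour of a node becomes its child at the next level, `merge` is taken to be a set union, the per-level hash counts are passed in directly as the hypothetical `dims` list (the paper derives them from the decay rate), and a keyed BLAKE2b hash stands in for the random permutations π. All function names are hypothetical.

```python
import hashlib


def _hash(item, level, k):
    """Stand-in for the k-th hash function at a given level (assumption: BLAKE2b)."""
    data = repr((level, k, item)).encode("utf-8")
    return int.from_bytes(hashlib.blake2b(data, digest_size=8).digest(), "big")


def minhash(items, level, num_hashes):
    """For each hash function, keep the item with the smallest hash value."""
    if not items:
        return []
    return [
        # repr(x) only breaks (very unlikely) hash ties deterministically
        min(items, key=lambda x: (_hash(x, level, k), repr(x)))
        for k in range(num_hashes)
    ]


def node_digest(adj, attrs, node, level, depth, dims):
    """Digest of the subtree rooted at `node`, computed bottom-up."""
    merged = set(attrs.get(node, ()))  # initial merger from the node's attributes
    if level < depth:
        # Assumption: every neighbour is expanded as a child at level + 1.
        for child in adj.get(node, ()):
            child_digest = node_digest(adj, attrs, child, level + 1, depth, dims)
            merged.update(child_digest)  # assumption: merge = set union
    return minhash(merged, level, dims[level])


def nethash_sketch(adj, attrs, depth, dims):
    """Return {node: embedding}, one embedding of length dims[0] per node."""
    if len(dims) != depth + 1:
        raise ValueError("dims needs one entry per tree level (0..depth)")
    return {root: node_digest(adj, attrs, root, 0, depth, dims) for root in adj}


if __name__ == "__main__":
    # Toy attributed network: node -> neighbours, node -> attribute set.
    adj = {"a": ["b", "c"], "b": ["a"], "c": ["a"]}
    attrs = {"a": {"x"}, "b": {"y"}, "c": {"z"}}
    for node, emb in nethash_sketch(adj, attrs, depth=1, dims=[8, 2]).items():
        print(node, emb)
```

Each embedding is a list of attribute identifiers, not real numbers, so two nodes would typically be compared by the fraction of positions on which their embeddings agree. No training step is involved, which matches the "does not need learning" claim quoted in the Research Type row.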