ProNE: Fast and Scalable Network Representation Learning

Authors: Jie Zhang, Yuxiao Dong, Yan Wang, Jie Tang, Ming Ding

IJCAI 2019

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We evaluate the efficiency and effectiveness of the ProNE method on multi-label node classification, a commonly used task for network embedding evaluation [Perozzi et al., 2014; Tang et al., 2015; Grover and Leskovec, 2016]. We conduct experiments on five real networks and a set of random graphs. Extensive demonstrations show that the one-thread ProNE model is about 10–400× faster than popular network embedding benchmarks with 20 threads, including DeepWalk, LINE, and node2vec (see Figure 1).
Researcher Affiliation | Collaboration | Jie Zhang [1], Yuxiao Dong [2], Yan Wang [1], Jie Tang [1], and Ming Ding [1]. [1] Department of Computer Science and Technology, Tsinghua University; [2] Microsoft Research, Redmond.
Pseudocode | No | The paper describes the model and its steps using text and mathematical equations, but does not include any structured pseudocode or algorithm blocks.
Open Source Code | Yes | Code is available at https://github.com/THUDM/ProNE.
Open Datasets | Yes | Table 1 reports the statistics of the five datasets:

Dataset      #nodes     #edges     #labels
BlogCatalog  10,312     333,983    39
Wiki         4,777      184,812    40
PPI          3,890      76,584     50
DBLP         51,264     127,968    60
Youtube      1,138,499  2,990,443  47

BlogCatalog [Zafarani and Liu, 2009] is a social blogger network in which bloggers' interests are used as labels. Wiki (http://www.mattmahoney.net/dc/text.html) is a co-occurrence network of words in the first million bytes of the Wikipedia dump; node labels are the part-of-speech tags. PPI [Breitkreutz et al., 2008] is a subgraph of the PPI network for Homo sapiens. DBLP [Tang et al., 2008] is an academic citation network... Youtube [Zafarani and Liu, 2009] is a social network...
Dataset Splits | No | We randomly sample different percentages of labeled nodes for training a liblinear classifier and use the remaining for testing. (No explicit validation set or specific train/test/validation split percentages are mentioned, only "different percentages".)
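A minimal sketch of this evaluation protocol, assuming scikit-learn's LogisticRegression with the liblinear solver stands in for the liblinear classifier and that micro/macro-F1 is the reported metric; the split ratios below are illustrative, since the text only says "different percentages":

    import numpy as np
    from sklearn.linear_model import LogisticRegression
    from sklearn.metrics import f1_score
    from sklearn.model_selection import train_test_split
    from sklearn.multiclass import OneVsRestClassifier

    def evaluate(embeddings, labels, train_ratio, seed=0):
        """Train on a random fraction of labeled nodes, test on the rest."""
        X_tr, X_te, y_tr, y_te = train_test_split(
            embeddings, labels, train_size=train_ratio, random_state=seed)
        clf = OneVsRestClassifier(LogisticRegression(solver="liblinear"))
        clf.fit(X_tr, y_tr)
        y_pred = clf.predict(X_te)
        return (f1_score(y_te, y_pred, average="micro"),
                f1_score(y_te, y_pred, average="macro"))

    # Illustrative usage on random data; a real run would load ProNE embeddings.
    X = np.random.rand(1000, 128)                 # one 128-d embedding per node
    Y = np.random.randint(0, 2, size=(1000, 39))  # multi-label indicator matrix
    for ratio in (0.1, 0.5, 0.9):                 # the unspecified "percentages"
        micro, macro = evaluate(X, Y, ratio)
        print(f"train ratio {ratio:.0%}: micro-F1 {micro:.3f}, macro-F1 {macro:.3f}")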
Hardware Specification | Yes | The experiments were conducted on a Red Hat server with an Intel Xeon(R) CPU E5-4650 (2.70 GHz) and 1 TB of RAM.
Software Dependencies | No | ProNE is implemented in Python 3.6.1. (The paper also mentions the SciPy package, but without a version number, and a language version alone is not considered sufficient without other versioned libraries per the schema guidelines.)
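The missing version information is easy to capture at run time; a minimal snippet, assuming NumPy, SciPy, and scikit-learn are the libraries in play (only SciPy is named in the paper):

    import sys
    import numpy, scipy, sklearn

    # Record the exact environment so results can be tied to library versions.
    print("Python      :", sys.version.split()[0])
    print("NumPy       :", numpy.__version__)
    print("SciPy       :", scipy.__version__)
    print("scikit-learn:", sklearn.__version__)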
Experiment Setup | Yes | For a fair comparison, we set the embedding dimension d = 128 for all methods. For the other parameters, we follow the original authors' preferred choices. For DeepWalk and node2vec, window size m = 10, #walks per node r = 80, walk length t = 40. p and q in node2vec are searched over {0.25, 0.50, 1, 2, 4}. For LINE, #negative samples k = 5 and total sampling budget T = r · t · |V|. For GraRep, the dimension of the concatenated embedding is d = 128 for fairness. For HOPE, β is calculated in the authors' code and searched over (0, 1) for the best performance. For ProNE, the term number of the Chebyshev expansion k is set to 10, μ = 0.2, and θ = 0.5, which are the default settings.
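To make the ProNE-specific parameters concrete: k, μ, and θ control the Chebyshev-approximated spectral filter that propagates the factorization-based embeddings (here n × 128, per the setup above) over the graph. Below is a simplified sketch of such a filtered propagation, not the authors' released implementation: it approximates an exponential filter exp(-θ(L - μI)) via the standard Chebyshev/Bessel expansion, whereas the paper derives a Gaussian band-pass kernel.

    import numpy as np
    import scipy.sparse as sp
    from scipy.special import iv  # modified Bessel functions of the first kind

    def spectral_propagate(adj, emb, k=10, mu=0.2, theta=0.5):
        """Propagate embeddings `emb` (n x d) through a k-term Chebyshev
        filter of the normalized Laplacian of `adj` (n x n sparse matrix)."""
        n = adj.shape[0]
        deg = np.asarray(adj.sum(axis=1)).ravel()
        d_inv = sp.diags(1.0 / np.maximum(deg, 1e-12))
        L = sp.eye(n) - d_inv @ adj        # random-walk normalized Laplacian
        M = L - mu * sp.eye(n)             # shift the spectrum by mu
        # exp(z*x) = I_0(z) + 2 * sum_i I_i(z) * T_i(x), so exp(-theta*x)
        # alternates the signs of the Bessel coefficients.
        Tx0, Tx1 = emb, M @ emb
        filtered = iv(0, theta) * Tx0 - 2 * iv(1, theta) * Tx1
        for i in range(2, k):
            Tx0, Tx1 = Tx1, 2 * (M @ Tx1) - Tx0   # Chebyshev recurrence
            filtered += ((-1) ** i) * 2 * iv(i, theta) * Tx1
        out = adj @ (emb - filtered)       # one more propagation over edges
        return out / (np.linalg.norm(out, axis=1, keepdims=True) + 1e-12)

With the defaults above (k=10, mu=0.2, theta=0.5), this mirrors how the reported hyperparameters enter the propagation step, but the exact kernel and normalization should be taken from the released code at https://github.com/THUDM/ProNE.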