ProNE: Fast and Scalable Network Representation Learning
Authors: Jie Zhang, Yuxiao Dong, Yan Wang, Jie Tang, Ming Ding
IJCAI 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate the efficiency and effectiveness of the ProNE method on multi-label node classification, a commonly used task for network embedding evaluation [Perozzi et al., 2014; Tang et al., 2015; Grover and Leskovec, 2016]. We conduct experiments on five real networks and a set of random graphs. Extensive demonstrations show that the one-thread ProNE model is about 10-400× faster than popular network embedding benchmarks with 20 threads, including DeepWalk, LINE, and node2vec (see Figure 1). |
| Researcher Affiliation | Collaboration | Jie Zhang¹, Yuxiao Dong², Yan Wang¹, Jie Tang¹ and Ming Ding¹ — ¹Department of Computer Science and Technology, Tsinghua University; ²Microsoft Research, Redmond |
| Pseudocode | No | The paper describes the model and its steps using text and mathematical equations, but does not include any structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | Code is available at https://github.com/THUDM/ProNE |
| Open Datasets | Yes | Table 1 (statistics of datasets): BlogCatalog — 10,312 nodes, 333,983 edges, 39 labels; Wiki — 4,777 nodes, 184,812 edges, 40 labels; PPI — 3,890 nodes, 76,584 edges, 50 labels; DBLP — 51,264 nodes, 127,968 edges, 60 labels; YouTube — 1,138,499 nodes, 2,990,443 edges, 47 labels. BlogCatalog [Zafarani and Liu, 2009] is a social blogger network, in which bloggers' interests are used as labels. Wiki (http://www.mattmahoney.net/dc/text.html) is a co-occurrence network of words in the first million bytes of the Wikipedia dump; node labels are the part-of-speech tags. PPI [Breitkreutz et al., 2008] is a subgraph of the PPI network for Homo Sapiens. DBLP [Tang et al., 2008] is an academic citation network... YouTube [Zafarani and Liu, 2009] is a social network... |
| Dataset Splits | No | We randomly sample different percentages of labeled nodes for training a liblinear classifier and use the remaining for testing. (No explicit mention of a validation set or specific split percentages for training/testing/validation, only 'different percentages'.) |
| Hardware Specification | Yes | The experiments were conducted on a Red Hat server with Intel Xeon(R) CPU E5-4650 (2.70GHz) and 1T RAM. |
| Software Dependencies | No | ProNE is implemented in Python 3.6.1. (The paper also mentions the SciPy package, but without a version number, and a single language version is not considered sufficient without other versioned libraries per the schema guidelines.) |
| Experiment Setup | Yes | For a fair comparison, we set the embedding dimension d = 128 for all methods. For the other parameters, we follow the original authors' preferred choices. For DeepWalk and node2vec, window size m = 10, #walks per node r = 80, walk length t = 40. p, q in node2vec are searched over {0.25, 0.50, 1, 2, 4}. For LINE, #negative-samples k = 5 and total sampling budget T = r·t·|V|. For GraRep, the dimension of the concatenated embedding is d = 128 for fairness. For HOPE, β is calculated in the authors' code and searched over (0, 1) for the best performance. For ProNE, the term number of the Chebyshev expansion k is set to 10, µ = 0.2, and θ = 0.5, which are the default settings. |
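The evaluation protocol quoted above (randomly sample a percentage of labeled nodes, train a liblinear classifier, test on the rest) can be sketched with scikit-learn. This is a hedged illustration, not the authors' script: the embeddings and multi-label matrix below are random stand-ins, and the split/classifier choices (one-vs-rest logistic regression with the `liblinear` solver, Micro-F1) follow the common practice for this task rather than code from the paper.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.multiclass import OneVsRestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import f1_score

rng = np.random.default_rng(0)

# Toy stand-ins: a real run would load ProNE node embeddings (d = 128)
# and the dataset's multi-label indicator matrix instead.
n_nodes, dim, n_labels = 200, 16, 5
X = rng.normal(size=(n_nodes, dim))
Y = (rng.random((n_nodes, n_labels)) < 0.4).astype(int)
Y[Y.sum(axis=1) == 0, 0] = 1  # give every node at least one label

def evaluate(train_ratio):
    """Train a one-vs-rest liblinear classifier on a random labeled
    subset of nodes and report Micro-F1 on the remaining nodes."""
    X_tr, X_te, Y_tr, Y_te = train_test_split(
        X, Y, train_size=train_ratio, random_state=0)
    clf = OneVsRestClassifier(LogisticRegression(solver="liblinear"))
    clf.fit(X_tr, Y_tr)
    return f1_score(Y_te, clf.predict(X_te), average="micro")

for ratio in (0.1, 0.5, 0.9):
    print(f"train ratio {ratio:.0%}: Micro-F1 = {evaluate(ratio):.3f}")
```

Varying `train_ratio` reproduces the "different percentages of labeled nodes" axis of the evaluation; with real embeddings, Macro-F1 (`average="macro"`) is typically reported alongside Micro-F1.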