Learning Network Embedding with Community Structural Information

Authors: Yu Li, Ying Wang, Tingting Zhang, Jiawei Zhang, Yi Chang

IJCAI 2019

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Extensive experimental results on several benchmark network datasets demonstrate the effectiveness of the proposed framework in various network analysis tasks including network reconstruction, link prediction and vertex classification."
Researcher Affiliation | Academia | Yu Li (1,5), Ying Wang (1,5), Tingting Zhang (2), Jiawei Zhang (3), Yi Chang (4). Affiliations: (1) College of Computer Science and Technology, Jilin University, Changchun, China; (2) School of Statistics, Jilin University of Finance and Economics, Changchun, China; (3) IFM Lab, Department of Computer Science, Florida State University, Tallahassee, FL, USA; (4) School of Artificial Intelligence, Jilin University, Changchun, China; (5) Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University, Changchun, China. Emails: liyu18@mails.jlu.edu.cn, wangying2010@jlu.edu.cn, 103069@jlufe.edu.cn, jiawei@ifmlab.org, yichang@acm.org
Pseudocode | No | The paper describes the alternating optimization algorithm mathematically but does not provide a formal pseudocode block or algorithm listing.
Open Source Code | No | The paper does not provide a link or an explicit statement about releasing the source code for the methodology described.
Open Datasets | Yes | "We conduct experiments on the following eleven network datasets, including WebKB (Cornell, Texas, Washington, Wisconsin) [Wang et al., 2017], Citeseer [McCallum et al., 2000], Cora [McCallum et al., 2000], Polbooks, Football, Polblogs [Adamic and Glance, 2005], Wiki and Email. All statistics of the datasets are summarized in Table 1."
Dataset Splits | No | The paper mentions splitting data into training and test sets (e.g., "randomly remove 10% edges as test set for evaluation and utilize the remaining edges to learn the vertex representations" or "randomly sample 80% vertices as training set and the remaining vertices as test set"), but it does not specify a distinct validation set or its split percentage for hyperparameter tuning.
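The split protocols quoted above can be sketched as follows. This is a minimal illustration, not the authors' code: the function names and the fixed seed are hypothetical, and the paper does not state how edges or vertices were sampled beyond "randomly".

```python
import random

def split_edges(edges, test_ratio=0.1, seed=42):
    """Hold out a random fraction of edges as the test set,
    mirroring the link-prediction protocol quoted above."""
    rng = random.Random(seed)
    edges = list(edges)
    rng.shuffle(edges)
    n_test = int(len(edges) * test_ratio)
    return edges[n_test:], edges[:n_test]  # (train, test)

def split_vertices(vertices, train_ratio=0.8, seed=42):
    """Sample a random fraction of vertices for training,
    mirroring the vertex-classification protocol quoted above."""
    rng = random.Random(seed)
    vertices = list(vertices)
    rng.shuffle(vertices)
    n_train = int(len(vertices) * train_ratio)
    return vertices[:n_train], vertices[n_train:]  # (train, test)
```

Note that neither function carves out a validation set, which is exactly the gap the review flags.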
Hardware Specification | No | The paper does not provide any specific details about the hardware used to run the experiments, such as GPU/CPU models, memory, or cloud instance types.
Software Dependencies | No | The paper mentions using LIBLINEAR for logistic regression and lists various baseline methods (DeepWalk, LINE, Node2Vec, etc.) but does not provide specific version numbers for any key software dependencies or libraries required for replication (e.g., Python, PyTorch/TensorFlow, scikit-learn versions).
Experiment Setup | Yes | "For DeepWalk, we set the window size as 5, the number of walks as 10 and the walk length as 40. For LINE, we set the number of negative samples as 5, the total number of training samples as 10 million and the starting learning rate as 0.025. For Node2Vec, we set the number of walks as 10, the walk length as 80, p as 1, q as 1 and the window size as 5. For GraRep, we set the maximum matrix transition as 4. For SDNE, we set α as 100, β as 10 and batch size as 16. For M-NMF, we tune α and β from {0.1, 0.5, 1, 5, 10} to get the best performance. In NECS, we vary the order l from {1, 2, 3} and α, β from {0.1, 0.2, 0.5, 1, 2, 5, 10}. To shrink the search space for the hyper-parameters of high-order proximity, we simply fix the weights as w_i = µ^(i−1) and vary µ from {0.1, 0.2, 0.3}."
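The NECS search space described above can be enumerated explicitly. This is a hypothetical sketch for illustration (the function name and dict layout are not from the paper); it follows the paper's notation, with the high-order proximity weights fixed as w_i = µ^(i−1).

```python
from itertools import product

# Hyper-parameter ranges quoted from the paper's NECS setup.
ORDER_L = [1, 2, 3]
ALPHA_BETA = [0.1, 0.2, 0.5, 1, 2, 5, 10]
MU = [0.1, 0.2, 0.3]

def necs_grid():
    """Yield every NECS hyper-parameter setting in the stated grid,
    with proximity weights w_i = mu**(i-1) for i = 1..l."""
    for l, alpha, beta, mu in product(ORDER_L, ALPHA_BETA, ALPHA_BETA, MU):
        weights = [mu ** (i - 1) for i in range(1, l + 1)]
        yield {"l": l, "alpha": alpha, "beta": beta,
               "mu": mu, "weights": weights}
```

Enumerated this way, the grid contains 3 × 7 × 7 × 3 = 441 configurations, which shows why fixing the weight schedule to a single decay parameter µ keeps the search tractable.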