Conditional Network Embeddings

Authors: Bo Kang, Jefrey Lijffijt, Tijl De Bie

ICLR 2019 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Section 3 reports on extensive experiments, comparing with state-of-the-art baselines on link prediction and multi-label classification, on commonly used benchmark networks. These experiments show that CNE's link prediction accuracy is consistently superior. For multi-label classification, CNE is consistently best on the Macro-F1 score and best or second best on the Micro-F1 score. These results are achieved with considerably lower-dimensional embeddings than the baselines.
Researcher Affiliation | Academia | Bo Kang, Jefrey Lijffijt & Tijl De Bie; Department of Electronics and Information Systems (ELIS), IDLab, Ghent University, Ghent, Belgium; {firstname.lastname}@ugent.be
Pseudocode | No | The paper describes the proposed block stochastic gradient descent algorithm in prose and mathematical equations in Section 2.2, but it does not provide a formal pseudocode block or algorithm listing. (A minimal illustrative sketch of such a block update is given below the table.)
Open Source Code | Yes | All code, including code for repeating the experiments, and links to the datasets are available at: https://bitbucket.org/ghentdatascience/cne.
Open Datasets | Yes | Table 1 lists the networks used in the experiments:
Network | Data Type | #Nodes | #Links | #Labels
Facebook (Leskovec & Krevl, 2015) | Friendship | 4,039 | 88,234 | –
arXiv ASTRO-PH (Leskovec & Krevl, 2015) | Co-authorship | 18,722 | 198,110 | –
Gowalla (Cho et al., 2011) | Friendship | 196,591 | 950,327 | –
Student DB (Goethals et al., 2010) | Relational/k-partite | 403 | 3,429 | –
BlogCatalog (Zafarani & Liu, 2009) | Bloggers | 10,312 | 333,983 | 39
Protein-Protein Int. (Breitkreutz et al., 2007) | Biological | 3,890 | 76,584 | 50
Wikipedia (Mahoney, 2011) | Word co-occurrence | 4,777 | 184,812 | 40
Dataset Splits | Yes | For node2vec, the hyperparameters p and q are tuned over a grid p, q ∈ {0.25, 0.05, 1, 2, 4} using 10-fold cross validation. We repeat our experiments 10 times with different random seeds. The final scores are averaged over the 10 repetitions. (An illustrative version of this grid search is sketched below the table.)
Hardware Specification | Yes | This experiment is performed with a single process/thread on a desktop with a 2.7 GHz Intel Core i5 CPU and 16 GB of 1600 MHz DDR3 RAM.
Software Dependencies | No | The paper mentions using sklearn (Pedregosa et al., 2011) for the logistic regression classifier, but does not specify a version number for scikit-learn or any other software dependency. (An illustrative scikit-learn classification setup is sketched below the table.)
Experiment Setup | Yes | For all methods we used the default parameter settings reported in the original papers, with d = 128. For node2vec, the hyperparameters p and q are tuned over a grid p, q ∈ {0.25, 0.05, 1, 2, 4} using 10-fold cross validation. [...] In this experiment we set d = 8 and k = 50. Only for the two largest networks (arXiv and Gowalla), we increase the dimensionality to d = 16 to reduce underfitting. [...] In our quantitative experiments we always set σ2 = 2. [...] For CNE, we set d = 8 (for arXiv d = 16 to reduce underfitting) and k = 50. We set the stopping criterion of CNE to ||∇X|| < 10⁻² or maxIter = 250 (whichever is met first). (These settings are used in the illustrative sketches below the table.)
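
The block stochastic gradient descent referred to in the Pseudocode row can be illustrated with a short sketch. The code below is a minimal NumPy illustration, not the authors' implementation (that is available at the Bitbucket link above). It assumes half-normal conditional distance distributions with spreads s1 = 1 and s2 = 2 (the σ2 value quoted in the Experiment Setup row), a given matrix P_prior of prior link probabilities (in the paper the prior can encode structural information such as node degrees), and an illustrative learning rate; the gradient expression is the one implied by those conditionals.

    import numpy as np

    def cne_link_posterior(dist, prior, s1=1.0, s2=2.0):
        # Posterior P(a_ij = 1 | X) for a pair at embedding distance `dist`,
        # given its prior link probability `prior`. Half-normal conditionals
        # with spread s1 for linked pairs and s2 > s1 for non-linked pairs.
        num = prior * np.exp(-dist ** 2 / (2 * s1 ** 2)) / s1
        den = num + (1 - prior) * np.exp(-dist ** 2 / (2 * s2 ** 2)) / s2
        return num / den

    def block_gradient_step(X, A, P_prior, i, lr=0.05, s1=1.0, s2=2.0):
        # One block update: a gradient-ascent step on the embedding of node i
        # while all other rows of X stay fixed. The learning rate and the
        # dense prior matrix P_prior are illustrative assumptions.
        diff = X - X[i]                      # x_j - x_i for every node j
        dist = np.linalg.norm(diff, axis=1)
        post = cne_link_posterior(dist, P_prior[i], s1, s2)
        gamma = 1.0 / s1 ** 2 - 1.0 / s2 ** 2
        grad = gamma * ((A[i] - post)[:, None] * diff).sum(axis=0)
        X[i] = X[i] + lr * grad              # j = i contributes nothing (diff = 0)
        return X

Sweeping i over all nodes until the gradient norm drops below the 10⁻² threshold quoted above, or 250 iterations are reached, would mirror the stopping criterion in the Experiment Setup row.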
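
Given such a posterior, link prediction can be scored directly on held-out pairs. The report only states that CNE's link prediction accuracy is consistently superior; the metric and the evaluation loop below (ROC AUC over a list of test pairs) are assumptions added here for illustration.

    import numpy as np
    from sklearn.metrics import roc_auc_score

    def link_prediction_auc(X, P_prior, test_pairs, test_labels, s1=1.0, s2=2.0):
        # Score each held-out pair (i, j) by its posterior link probability
        # under the embedding X, then summarize with ROC AUC.
        scores = []
        for i, j in test_pairs:
            d = np.linalg.norm(X[i] - X[j])
            prior = P_prior[i, j]
            num = prior * np.exp(-d ** 2 / (2 * s1 ** 2)) / s1
            den = num + (1 - prior) * np.exp(-d ** 2 / (2 * s2 ** 2)) / s2
            scores.append(num / den)
        return roc_auc_score(test_labels, scores)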
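
The Dataset Splits row quotes a grid search over node2vec's p and q with 10-fold cross validation. A generic version of that loop is sketched below; train_node2vec and score_embedding are placeholders for whichever node2vec implementation and cross-validated scoring function were actually used, and are not taken from the paper.

    import itertools
    import numpy as np

    def tune_node2vec_pq(train_node2vec, score_embedding, graph, d=128,
                         grid=(0.25, 0.05, 1, 2, 4)):
        # Exhaustive search over the (p, q) grid quoted above. The caller
        # supplies the embedding trainer and a scoring function (e.g. the
        # mean 10-fold cross-validated downstream score).
        best_pq, best_score = None, -np.inf
        for p, q in itertools.product(grid, repeat=2):
            X = train_node2vec(graph, p=p, q=q, dimensions=d)
            score = score_embedding(X)
            if score > best_score:
                best_pq, best_score = (p, q), score
        return best_pq, best_score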
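
For the multi-label classification scores (Micro- and Macro-F1), a plausible reconstruction uses the scikit-learn logistic regression mentioned in the Software Dependencies row in a one-vs-rest setup. The train/test split ratio, solver settings, and seed handling below are assumptions, not details confirmed by the paper.

    from sklearn.linear_model import LogisticRegression
    from sklearn.metrics import f1_score
    from sklearn.model_selection import train_test_split
    from sklearn.multiclass import OneVsRestClassifier

    def multilabel_f1(X, Y, train_ratio=0.5, seed=0):
        # Fit a one-vs-rest logistic regression on the embeddings X (n x d)
        # against binary label indicators Y (n x #labels) and return
        # (Micro-F1, Macro-F1) on the held-out nodes.
        X_tr, X_te, Y_tr, Y_te = train_test_split(
            X, Y, train_size=train_ratio, random_state=seed)
        clf = OneVsRestClassifier(LogisticRegression(max_iter=1000))
        clf.fit(X_tr, Y_tr)
        Y_hat = clf.predict(X_te)
        return (f1_score(Y_te, Y_hat, average="micro"),
                f1_score(Y_te, Y_hat, average="macro"))

    # Averaging over 10 random seeds mirrors the repetition protocol quoted
    # in the Dataset Splits row:
    # scores = [multilabel_f1(X, Y, seed=s) for s in range(10)]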