Conditional Network Embeddings
Authors: Bo Kang, Jefrey Lijffijt, Tijl De Bie
ICLR 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Section 3 reports on extensive experiments, comparing with state-of-the-art baselines on link prediction and multi-label classification, on commonly used benchmark networks. These experiments show that CNE's link prediction accuracy is consistently superior. For multi-label classification CNE is consistently best on the Macro-F1 score and best or second best on the Micro-F1 score. These results are achieved with considerably lower-dimensional embeddings than the baselines. |
| Researcher Affiliation | Academia | Bo Kang, Jefrey Lijffijt & Tijl De Bie Department of Electronics and Information Systems (ELIS), IDLab Ghent University Ghent, Belgium {firstname.lastname}@ugent.be |
| Pseudocode | No | The paper describes the proposed block stochastic gradient descent algorithm in prose and mathematical equations in Section 2.2, but it does not provide a formal pseudocode block or algorithm listing. |
| Open Source Code | Yes | All code, including code for repeating the experiments, and links to the datasets are available at: https://bitbucket.org/ghentdatascience/cne. |
| Open Datasets | Yes | Table 1 lists the networks used in the experiments (data, type, #nodes, #links, and #labels where given): Facebook (Leskovec & Krevl, 2015), Friendship, 4,039 nodes, 88,234 links; arXiv ASTRO-PH (Leskovec & Krevl, 2015), Co-authorship, 18,722 nodes, 198,110 links; Gowalla (Cho et al., 2011), Friendship, 196,591 nodes, 950,327 links; StudentDB (Goethals et al., 2010), Relational/k-partite, 403 nodes, 3,429 links; BlogCatalog (Zafarani & Liu, 2009), Bloggers, 10,312 nodes, 333,983 links, 39 labels; Protein-Protein Int. (Breitkreutz et al., 2007), Biological, 3,890 nodes, 76,584 links, 50 labels; Wikipedia (Mahoney, 2011), Word co-occurrence, 4,777 nodes, 184,812 links, 40 labels. |
| Dataset Splits | Yes | For node2vec, the hyperparameters p and q are tuned over a grid p, q ∈ {0.25, 0.05, 1, 2, 4} using 10-fold cross validation. We repeat our experiments for 10 times with different random seeds. The final scores are averaged over the 10 repetitions. |
| Hardware Specification | Yes | This experiment is performed with single process/thread on a desktop with CPU 2,7 GHz Intel Core i5 and RAM 16 GB 1600 MHz DDR3. |
| Software Dependencies | No | The paper mentions using '(sklearn, Pedregosa et al., 2011)' for the logistic regression classifier but does not specify a version number for scikit-learn or any other software dependencies with their versions. |
| Experiment Setup | Yes | For all methods we used their default parameter settings reported in the original papers and with d = 128. For node2vec, the hyperparameters p and q are tuned over a grid p, q ∈ {0.25, 0.05, 1, 2, 4} using 10-fold cross validation. [...] In this experiment we set d = 8 and k = 50. Only for the two largest networks (arXiv and Gowalla), we increase the dimensionality to d = 16 to reduce underfitting. [...] In our quantitative experiments we always set σ₂ = 2. [...] For CNE, we set d = 8 (For arXiv k = 16 to reduce underfitting) and k = 50. We set stopping criterion of CNE ||∇X|| < 10⁻² or maxIter < 250 (whichever is met first). |
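
The settings quoted in the Dataset Splits and Experiment Setup rows can be collected into a small configuration sketch. This is not taken from the authors' repository: the function names are hypothetical, and the seed values 0 to 9 are assumed. Reading the stopping criterion as "gradient norm below 10⁻², or an iteration cap of 250" is likewise an interpretation of the quoted text. The grid values are copied as quoted.

```python
import itertools
import statistics

# Values quoted in the table above.
NODE2VEC_GRID = [0.25, 0.05, 1, 2, 4]  # grid for p and q, as quoted
N_REPETITIONS = 10                      # repetitions with different random seeds
GRAD_TOL = 1e-2                         # assumed meaning: gradient-norm tolerance
MAX_ITER = 250                          # assumed meaning: iteration cap


def node2vec_grid():
    """Return every (p, q) pair in the quoted grid."""
    return list(itertools.product(NODE2VEC_GRID, repeat=2))


def should_stop(grad_norm, iteration):
    """Stop on a small gradient norm or at the iteration cap, whichever comes first."""
    return grad_norm < GRAD_TOL or iteration >= MAX_ITER


def average_over_seeds(run_once, n_repetitions=N_REPETITIONS):
    """Call run_once(seed) for seeds 0..n_repetitions-1 and average the scores.

    run_once is a hypothetical callable that trains and evaluates one model.
    """
    scores = [run_once(seed) for seed in range(n_repetitions)]
    return statistics.mean(scores)
```

Here `run_once` stands in for a single train-and-evaluate run; any real experiment would supply its own.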