Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).
Conditional Network Embeddings
Authors: Bo Kang, Jefrey Lijffijt, Tijl De Bie
ICLR 2019 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Section 3 reports on extensive experiments, comparing with state-of-the-art baselines on link prediction and multi-label classification, on commonly used benchmark networks. These experiments show that CNE's link prediction accuracy is consistently superior. For multi-label classification CNE is consistently best on the Macro-F1 score and best or second best on the Micro-F1 score. These results are achieved with considerably lower-dimensional embeddings than the baselines. |
| Researcher Affiliation | Academia | Bo Kang, Jefrey Lijffijt & Tijl De Bie Department of Electronics and Information Systems (ELIS), IDLab Ghent University Ghent, Belgium {firstname.lastname}@ugent.be |
| Pseudocode | No | The paper describes the proposed block stochastic gradient descent algorithm in prose and mathematical equations in Section 2.2, but it does not provide a formal pseudocode block or algorithm listing. |
| Open Source Code | Yes | All code, including code for repeating the experiments, and links to the datasets are available at: https://bitbucket.org/ghentdatascience/cne. |
| Open Datasets | Yes | Table 1 lists the networks used in the experiments (columns: Data, Type, #Nodes, #Links, #Labels). Facebook (Leskovec & Krevl, 2015): Friendship, 4,039 nodes, 88,234 links; arXiv ASTRO-PH (Leskovec & Krevl, 2015): Co-authorship, 18,722 nodes, 198,110 links; Gowalla (Cho et al., 2011): Friendship, 196,591 nodes, 950,327 links; StudentDB (Goethals et al., 2010): Relational/k-partite, 403 nodes, 3,429 links; BlogCatalog (Zafarani & Liu, 2009): Bloggers, 10,312 nodes, 333,983 links, 39 labels; Protein-Protein Int. (Breitkreutz et al., 2007): Biological, 3,890 nodes, 76,584 links, 50 labels; Wikipedia (Mahoney, 2011): Word co-occurrence, 4,777 nodes, 184,812 links, 40 labels. |
| Dataset Splits | Yes | For node2vec, the hyperparameters p and q are tuned over a grid p, q ∈ {0.25, 0.05, 1, 2, 4} using 10-fold cross validation. We repeat our experiments 10 times with different random seeds. The final scores are averaged over the 10 repetitions. |
| Hardware Specification | Yes | This experiment is performed with single process/thread on a desktop with CPU 2.7 GHz Intel Core i5 and RAM 16 GB 1600 MHz DDR3. |
| Software Dependencies | No | The paper mentions using 'sklearn, Pedregosa et al., 2011' for the logistic regression classifier but does not specify a version number for scikit-learn or for any other software dependency. |
| Experiment Setup | Yes | For all methods we used their default parameter settings reported in the original papers and with d = 128. For node2vec, the hyperparameters p and q are tuned over a grid p, q ∈ {0.25, 0.05, 1, 2, 4} using 10-fold cross validation. [...] In this experiment we set d = 8 and k = 50. Only for the two largest networks (arXiv and Gowalla), we increase the dimensionality to d = 16 to reduce underfitting. [...] In our quantitative experiments we always set σ₂ = 2. [...] For CNE, we set d = 8 (For arXiv k = 16 to reduce underfitting) and k = 50. We set stopping criterion of CNE ‖∇X‖ < 10⁻² or maxIter < 250 (whichever is met first). |
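
The protocol quoted in the Dataset Splits and Experiment Setup rows can be sketched as follows. This is an illustrative sketch, not the authors' code: the function names (`pq_grid`, `should_stop`, `averaged_score`) are hypothetical, the grid values are copied as reported above, and reading the stopping criterion as a gradient-norm threshold of 1e-2 combined with a cap of 250 iterations is an assumption.

```python
import itertools
import statistics

# Grid values copied as reported in the table above.
PQ_VALUES = [0.25, 0.05, 1, 2, 4]
N_REPETITIONS = 10  # "10 repetitions with different random seeds"


def pq_grid(values=PQ_VALUES):
    """Return every (p, q) pair in the node2vec tuning grid."""
    return list(itertools.product(values, repeat=2))


def should_stop(grad_norm, iteration, tol=1e-2, max_iter=250):
    """Assumed stopping rule: small gradient norm or iteration cap reached,
    whichever comes first."""
    return grad_norm < tol or iteration >= max_iter


def averaged_score(run_once, n_repetitions=N_REPETITIONS):
    """Average a scoring callable over seeds 0 .. n_repetitions - 1.

    `run_once` is a hypothetical callable mapping a seed to one score.
    """
    return statistics.mean(run_once(seed) for seed in range(n_repetitions))
```

With five values per hyperparameter the grid contains 25 (p, q) pairs, each of which would be scored with 10-fold cross validation before the final numbers are averaged over the 10 seeded repetitions.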