reproducibilityindex.ai

Conditional Network Embeddings

Authors: Bo Kang, Jefrey Lijffijt, Tijl De Bie

ICLR 2019

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Section 3 reports on extensive experiments, comparing with state-of-the-art baselines on link prediction and multi-label classification, on commonly used benchmark networks. These experiments show that CNE's link prediction accuracy is consistently superior. For multi-label classification CNE is consistently best on the Macro-F1 score and best or second best on the Micro-F1 score. These results are achieved with considerably lower-dimensional embeddings than the baselines.
Researcher Affiliation	Academia	Bo Kang, Jefrey Lijffijt & Tijl De Bie, Department of Electronics and Information Systems (ELIS), IDLab, Ghent University, Ghent, Belgium, {firstname.lastname}@ugent.be
Pseudocode	No	The paper describes the proposed block stochastic gradient descent algorithm in prose and mathematical equations in Section 2.2, but it does not provide a formal pseudocode block or algorithm listing.
Open Source Code	Yes	All code, including code for repeating the experiments, and links to the datasets are available at: https://bitbucket.org/ghentdatascience/cne.
Open Datasets	Yes	Table 1 lists the networks used in the experiments (columns: Data, Type, #Nodes, #Links, #Labels). Facebook (Leskovec & Krevl, 2015), Friendship, 4,039 nodes, 88,234 links; arXiv ASTRO-PH (Leskovec & Krevl, 2015), Co-authorship, 18,722 nodes, 198,110 links; Gowalla (Cho et al., 2011), Friendship, 196,591 nodes, 950,327 links; StudentDB (Goethals et al., 2010), Relational/k-partite, 403 nodes, 3,429 links; BlogCatalog (Zafarani & Liu, 2009), Bloggers, 10,312 nodes, 333,983 links, 39 labels; Protein-Protein Int. (Breitkreutz et al., 2007), Biological, 3,890 nodes, 76,584 links, 50 labels; Wikipedia (Mahoney, 2011), Word co-occurrence, 4,777 nodes, 184,812 links, 40 labels.
Dataset Splits	Yes	For node2vec, the hyperparameters p and q are tuned over a grid p, q ∈ {0.25, 0.05, 1, 2, 4} using 10-fold cross validation. We repeat our experiments for 10 times with different random seeds. The final scores are averaged over the 10 repetitions.
Hardware Specification	Yes	This experiment is performed with single process/thread on a desktop with CPU 2,7 GHz Intel Core i5 and RAM 16 GB 1600 MHz DDR3.
Software Dependencies	No	The paper mentions using '(sklearn, Pedregosa et al., 2011)' for the logistic regression classifier but does not specify a version number for scikit-learn or any other software dependencies with their versions.
Experiment Setup	Yes	For all methods we used their default parameter settings reported in the original papers and with d = 128. For node2vec, the hyperparameters p and q are tuned over a grid p, q ∈ {0.25, 0.05, 1, 2, 4} using 10-fold cross validation. [...] In this experiment we set d = 8 and k = 50. Only for the two largest networks (arXiv and Gowalla), we increase the dimensionality to d = 16 to reduce underfitting. [...] In our quantitative experiments we always set σ² = 2. [...] For CNE, we set d = 8 (For arXiv k = 16 to reduce underfitting) and k = 50. We set stopping criterion of CNE ||X|| < 10⁻² or maxIter < 250 (whichever is met first).
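
The settings quoted in the Dataset Splits and Experiment Setup rows can be collected into a small Python sketch. This is only an illustration and is not taken from the authors' repository. The names `node2vec_grid`, `should_stop`, and `averaged_score` are hypothetical, the tolerance is read as 10⁻², and "maxIter < 250" is read as an iteration budget of 250. Both readings are assumptions.

```python
from itertools import product
from statistics import mean

# Values quoted in the table above; all names here are illustrative.
NODE2VEC_GRID = [0.25, 0.05, 1, 2, 4]  # grid for p and q, exactly as quoted
CNE_DIM = 8          # d for most networks
CNE_DIM_LARGE = 16   # d for the two largest networks (arXiv, Gowalla)
CNE_K = 50           # k
TOL = 1e-2           # assumed reading of the quoted norm threshold
MAX_ITER = 250       # assumed reading of the quoted iteration limit
N_REPEATS = 10       # repetitions with different random seeds


def node2vec_grid():
    """Return every (p, q) pair in the quoted grid."""
    return list(product(NODE2VEC_GRID, repeat=2))


def should_stop(norm_value, iteration, tol=TOL, max_iter=MAX_ITER):
    """Stop when the monitored norm falls below tol or the budget is used up."""
    return norm_value < tol or iteration >= max_iter


def averaged_score(run_once, n_repeats=N_REPEATS):
    """Average a score over n_repeats runs, passing a distinct seed to each."""
    return mean(run_once(seed) for seed in range(n_repeats))
```

Here `run_once` stands for any callable that takes a seed and returns a single score, for example one train-and-evaluate cycle of a link prediction experiment.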