Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Asymptotics of $\ell_2$ Regularized Network Embeddings
Authors: Andrew Davison
NeurIPS 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We now examine the performance in using regularized node2vec embeddings for link prediction and node classification tasks, and illustrate comparable, when not superior, performance to more complicated encoders for network embeddings. We perform experiments on the Cora, CiteSeer and PubMed Diabetes citation network datasets |
| Researcher Affiliation | Academia | Andrew Davison Department of Statistics Columbia University New York, NY 10027 EMAIL |
| Pseudocode | Yes | Algorithm 1 (Uniform vertex sampling). Given a graph Gn and number of samples k, we select k vertices from Gn uniformly and without replacement, and then return S(Gn) as the induced subgraph using these sampled vertices. |
| Open Source Code | Yes | The code used for the experiments can be found at https://github.com/AndrewDavidson21/regularized_node_embeddings. |
| Open Datasets | Yes | We perform experiments on the Cora, CiteSeer and PubMed Diabetes citation network datasets (see Appendix G for more details), which we use as they are commonly used benchmark datasets, see e.g. [26, 28, 34, 64]. ... All are publicly available through the StellarGraph library [20]. |
| Dataset Splits | Yes | For the link prediction experiments, we create a training graph by removing 10% of both the edges and non-edges within the network, and use this to learn an embedding of the network. We then form link embeddings by taking the entry-wise product of the corresponding node embeddings, use 10% of the held-out edges to build a logistic classifier for the link categories, and then evaluate the performance on the remaining edges, repeating this process 50 times. ... To evaluate performance for the node classification task, we learn a network embedding without access to the node labels, and then learn/evaluate a one-versus-rest multinomial node classifier using 5%/95% stratified training/test splits of the node labels. |
| Hardware Specification | Yes | All experiments used a single NVIDIA GeForce RTX 3090 GPU, with Python 3.8.12, PyTorch 1.10.1 and CUDA 11.3. |
| Software Dependencies | Yes | All experiments used a single NVIDIA GeForce RTX 3090 GPU, with Python 3.8.12, PyTorch 1.10.1 and CUDA 11.3. |
| Experiment Setup | Yes | For node2vec, we use the default parameters as given in [25] (return_weight = 1, in_out_weight = 1, walk_length = 80, num_walks = 10, workers = 1, batch_size = 1) and embedding dimension of 128. We train all node2vec models for 50 epochs. |
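The quoted Algorithm 1 (uniform vertex sampling) can be sketched in a few lines. This is a minimal illustration, not the paper's code: the adjacency-dict representation and the function name `uniform_vertex_sample` are my own choices.

```python
import random

def uniform_vertex_sample(adj, k, seed=None):
    """Algorithm 1 (uniform vertex sampling): select k vertices of G_n
    uniformly and without replacement, and return the induced subgraph
    S(G_n) on those vertices.

    `adj` maps each vertex to the set of its neighbours.
    """
    rng = random.Random(seed)
    sampled = set(rng.sample(sorted(adj), k))
    # Induced subgraph: keep only edges with both endpoints sampled.
    return {v: adj[v] & sampled for v in sampled}

# Toy example: a 4-cycle 0-1-2-3-0, sampling 3 of its 4 vertices.
g = {0: {1, 3}, 1: {0, 2}, 2: {1, 3}, 3: {0, 2}}
sub = uniform_vertex_sample(g, 3, seed=0)
```

The returned dict has exactly `k` vertices, and every retained edge is an edge of the original graph between two sampled vertices.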
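The link-prediction protocol described under "Dataset Splits" (hold out 10% of edges and non-edges, then form link features as the entry-wise product of node embeddings) can be sketched as follows. This is an illustrative reconstruction under stated assumptions, not the authors' implementation; `link_prediction_split` and `link_embedding` are hypothetical names, and the downstream logistic classifier is omitted.

```python
import random

def link_prediction_split(edges, non_edges, holdout_frac=0.10, seed=None):
    """Hold out a fraction of edges and non-edges for evaluation; the
    remaining edges define the training graph used to learn embeddings."""
    rng = random.Random(seed)

    def split(pairs):
        pairs = list(pairs)
        rng.shuffle(pairs)
        cut = int(len(pairs) * holdout_frac)
        return pairs[cut:], pairs[:cut]  # (training, held out)

    train_edges, test_edges = split(edges)
    train_non_edges, test_non_edges = split(non_edges)
    return train_edges, test_edges, train_non_edges, test_non_edges

def link_embedding(z_u, z_v):
    # Hadamard (entry-wise) product of the two node embeddings,
    # used as the feature vector for the link classifier.
    return [a * b for a, b in zip(z_u, z_v)]

# Toy example: 20 edges and 20 non-edges, 10% held out.
edges = [(i, i + 1) for i in range(20)]
non_edges = [(i, i + 5) for i in range(20)]
tr_e, te_e, tr_n, te_n = link_prediction_split(edges, non_edges, seed=1)
```

Per the quote, the held-out pairs are further divided so that a logistic classifier is fit on 10% of them and evaluated on the rest, with the whole procedure repeated 50 times.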