Asymptotics of $\ell_2$ Regularized Network Embeddings
Authors: Andrew Davison
NeurIPS 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We now examine the performance in using regularized node2vec embeddings for link prediction and node classification tasks, and illustrate comparable, when not superior, performance to more complicated encoders for network embeddings. We perform experiments on the Cora, CiteSeer and PubMed Diabetes citation network datasets |
| Researcher Affiliation | Academia | Andrew Davison Department of Statistics Columbia University New York, NY 10027 ad3395@columbia.edu |
| Pseudocode | Yes | Algorithm 1 (Uniform vertex sampling). Given a graph Gn and number of samples k, we select k vertices from Gn uniformly and without replacement, and then return S(Gn) as the induced subgraph using these sampled vertices. |
| Open Source Code | Yes | The code used for the experiments can be found at https://github.com/AndrewDavidson21/regularized_node_embeddings. |
| Open Datasets | Yes | We perform experiments on the Cora, CiteSeer and PubMed Diabetes citation network datasets (see Appendix G for more details), which we use as they are commonly used benchmark datasets, see e.g. [26, 28, 34, 64]. ... All are publicly available through the StellarGraph library [20]. |
| Dataset Splits | Yes | For the link prediction experiments, we create a training graph by removing 10% of both the edges and non-edges within the network, and use this to learn an embedding of the network. We then form link embeddings by taking the entry-wise product of the corresponding node embeddings, use 10% of the held-out edges to build a logistic classifier for the link categories, and then evaluate the performance on the remaining edges, repeating this process 50 times. ... To evaluate performance for the node classification task, we learn a network embedding without access to the node labels, and then learn/evaluate a one-versus-rest multinomial node classifier using 5%/95% stratified training/test splits of the node labels. |
| Hardware Specification | Yes | All experiments used a single NVIDIA GeForce RTX 3090 GPU, with Python 3.8.12, PyTorch 1.10.1 and CUDA 11.3. |
| Software Dependencies | Yes | All experiments used a single NVIDIA GeForce RTX 3090 GPU, with Python 3.8.12, PyTorch 1.10.1 and CUDA 11.3. |
| Experiment Setup | Yes | For node2vec, we use the default parameters as given in [25] (return_weight = 1, in_out_weight = 1, walk_length = 80, num_walks = 10, workers = 1, batch_size = 1) and embedding dimension of 128. We train all node2vec models for 50 epochs. |
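
The Pseudocode row quotes Algorithm 1 (uniform vertex sampling). A minimal sketch of that procedure is given below, written with networkx purely for illustration; the paper does not tie the algorithm to any particular library, and the function name is ours.

```python
import random
import networkx as nx


def uniform_vertex_sample(G, k, seed=None):
    """Algorithm 1 (uniform vertex sampling): draw k vertices uniformly
    without replacement and return the subgraph of G induced by them."""
    rng = random.Random(seed)
    sampled = rng.sample(list(G.nodes()), k)
    return G.subgraph(sampled).copy()
```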
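The Dataset Splits row describes the link prediction protocol: embeddings are learned on a reduced graph with 10% of edges and non-edges held out, link embeddings are the entry-wise (Hadamard) products of the endpoint node embeddings, and a logistic classifier is fit on 10% of the held-out pairs and evaluated on the rest. A sketch of the classifier/evaluation step is below, assuming the node embeddings and held-out pairs are already available; the function name, array layout, and ROC-AUC scoring are illustrative assumptions rather than details fixed by the paper.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score


def score_link_prediction(embeddings, held_out_pairs, held_out_labels, seed=0):
    """Fit a logistic classifier on Hadamard link features and score held-out pairs.

    embeddings: dict node -> np.ndarray, learned on the reduced training graph.
    held_out_pairs / held_out_labels: the removed edges (label 1) and non-edges (label 0).
    """
    # Entry-wise product of the endpoint embeddings gives the link embedding.
    X = np.stack([embeddings[u] * embeddings[v] for u, v in held_out_pairs])
    y = np.asarray(held_out_labels)

    # Use 10% of the held-out pairs to fit the classifier, evaluate on the remaining 90%.
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(y))
    n_fit = max(1, int(0.1 * len(y)))
    fit_idx, eval_idx = order[:n_fit], order[n_fit:]

    clf = LogisticRegression(max_iter=1000).fit(X[fit_idx], y[fit_idx])
    return roc_auc_score(y[eval_idx], clf.predict_proba(X[eval_idx])[:, 1])
```

The paper repeats this split/fit/evaluate cycle 50 times; averaging the returned score over repeated calls with different seeds would reproduce that loop.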
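The Experiment Setup row lists the node2vec hyperparameters used in the experiments. They are collected below in a plain dictionary for reference; the dictionary itself is just an illustrative container, not an interface from the released code.

```python
# node2vec hyperparameters reported in the Experiment Setup row.
# With both walk biases equal to 1, node2vec's second-order walk is
# unbiased, i.e. DeepWalk-style uniform random walks.
NODE2VEC_PARAMS = {
    "return_weight": 1,    # bias toward returning to the previous node
    "in_out_weight": 1,    # bias between inward and outward exploration
    "walk_length": 80,
    "num_walks": 10,       # walks started per node
    "workers": 1,
    "batch_size": 1,
    "embedding_dim": 128,  # embedding dimension
    "epochs": 50,          # all node2vec models trained for 50 epochs
}
```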