Scalable Graph Embedding for Asymmetric Proximity
Authors: Chang Zhou, Yuqiong Liu, Xiaofei Liu, Zhongyi Liu, Jun Gao
AAAI 2017
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We conduct extensive experiments on tasks of link prediction and node recommendation on open source datasets, as well as online recommendation services in Alibaba Group, in which the training graph has over 290 million vertices and 18 billion edges, showing our method to be highly scalable and effective. |
| Researcher Affiliation | Collaboration | Chang Zhou¹, Yuqiong Liu¹, Xiaofei Liu², Zhongyi Liu², Jun Gao¹. ¹Key Laboratory of High Confidence Software Technologies, EECS, Peking University, {zhouchang,liuyuqiong,gaojun}@pku.edu.cn. Corresponding Authors. ²Alibaba Group, {hsiaofei.hfl,zhongyi.lzy}@alibaba-inc.com |
| Pseudocode | Yes | Algorithm 1 APP Embedding Algorithm |
| Open Source Code | No | The paper does not provide an explicit statement or link indicating that the source code for their method is open-source or publicly available. |
| Open Datasets | Yes | We collect four open-source data graphs to compare the performance of the graph embedding methods. Arxiv: Arxiv GR-QC is a collaboration network generated from the e-print arXiv. Nodes represent authors of papers and edges represent collaborations between authors. Cora: It is a citation network of academic papers where nodes represent academic papers and each directed edge indicates the citation relationship between papers. Epinions: This is the trust network from the online social network Epinions. Nodes are users of Epinions and directed edges represent trust between the users. Amazon: In Amazon network, nodes represent products and edges represent co-purchasing relation between products. We convert the original Amazon graph to an undirected graph, since the co-purchasing relation is symmetric. We also evaluate our method on an extremely large private data Ali Item Graph, which is an item graph converted by the item click sequence from the user browsing sessions. Ali Item Graph has 290 million vertices (items) and over 18 billion edges. We learn the source and target embedded vectors for each item and use them for an online recommendation service in Alibaba Group, which we will describe in more detail later. Some statistics about these graphs are summarized in Table 1. |
| Dataset Splits | No | The paper specifies training and test set splits (e.g., 'we remove 30% of edges... as ground truth in the test set, and take the remaining graph as the training set' and 'We remove 10% of the edges in the original graph as test set, and use the rest of the graph as the training data.'), but does not explicitly mention a separate validation set or provide details for a validation split. |
| Hardware Specification | No | The paper mentions evaluating the method on 'online recommendation services in Alibaba Group' and on a large 'Ali Item Graph' with '290 million vertices and 18 billion edges', implying substantial computational resources. However, it does not provide specific hardware details such as GPU or CPU models, or memory specifications used for running experiments. |
| Software Dependencies | No | The paper does not provide specific software dependency details such as library names with version numbers (e.g., Python, TensorFlow, PyTorch versions). |
| Experiment Setup | Yes | 'Input: G(V, E, W), Jumping Factor α, Learning Rate η. Output: Embedded Vector of s_v, t_v for each v ∈ V' (from Algorithm 1), 'we randomly samples k negative pairs according to a vertex distribution of P_D(n).', and 'the number of dimensions is set to 128. In fact, we observe no significant improvement as the dimension size of our method goes beyond 32 for small datasets in both tasks.' |
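
The train/test protocol quoted in the Dataset Splits row (hold out a fraction of edges as ground truth and train on the remaining graph) can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `split_edges`, the `seed` parameter, and the choice of uniform random sampling without replacement are all assumptions, since the quoted text does not say how the removed edges are selected.

```python
import random


def split_edges(edges, test_fraction, seed=0):
    """Hold out a random fraction of edges as a test set.

    Hypothetical helper: uniform sampling and the fixed seed are
    assumptions made for this sketch, not details from the paper.
    """
    if not 0.0 <= test_fraction <= 1.0:
        raise ValueError("test_fraction must be between 0 and 1")
    rng = random.Random(seed)
    shuffled = list(edges)          # copy so the caller's list is untouched
    rng.shuffle(shuffled)
    n_test = int(round(len(shuffled) * test_fraction))
    test = shuffled[:n_test]        # removed edges, used as ground truth
    train = shuffled[n_test:]       # remaining graph, used for training
    return train, test


# Toy directed graph: every ordered pair over 5 nodes (20 edges).
edges = [(u, v) for u in range(5) for v in range(5) if u != v]

# 30% hold-out, matching the fraction quoted in the Dataset Splits row.
train_edges, test_edges = split_edges(edges, 0.3)
```

Passing `0.1` instead of `0.3` gives the 10% variant quoted in the same row. Fixing the seed makes the split repeatable, which is one way to compensate for the missing split details noted in the table.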