reproducibilityindex.ai

Scalable Graph Embedding for Asymmetric Proximity

Authors: Chang Zhou, Yuqiong Liu, Xiaofei Liu, Zhongyi Liu, Jun Gao

AAAI 2017

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	We conduct extensive experiments on tasks of link prediction and node recommendation on open source datasets, as well as online recommendation services in Alibaba Group, in which the training graph has over 290 million vertices and 18 billion edges, showing our method to be highly scalable and effective.
Researcher Affiliation	Collaboration	Chang Zhou,1 Yuqiong Liu,1 Xiaofei Liu,2 Zhongyi Liu,2 Jun Gao1 1Key Laboratory of High Confidence Software Technologies, EECS, Peking University {zhouchang,liuyuqiong,gaojun}@pku.edu.cn Corresponding Authors 2Alibaba Group, {hsiaofei.hfl,zhongyi.lzy}@alibaba-inc.com
Pseudocode	Yes	Algorithm 1 APP Embedding Algorithm
Open Source Code	No	The paper does not provide an explicit statement or link indicating that the source code for their method is open-source or publicly available.
Open Datasets	Yes	We collect four open-source data graphs to compare the performance of the graph embedding methods. Arxiv 1: Arxiv GR-QC is a collaboration network generated from the e-print arXiv. Nodes represent authors of papers and edges represent collaborations between authors. Cora 2: It is a citation network of academic papers where nodes represent academic papers and each directed edge indicates the citation relationship between papers. Epinions 3: This is the trust network from the online social network Epinions. Nodes are users of Epinions and directed edges represent trust between the users. Amazon 4: In Amazon network, nodes represent products and edges represent co-purchasing relation between products. We convert the original Amazon graph to an undirected graph, since the co-purchasing relation is symmetric. We also evaluate our method on an extremely large private data Ali Item Graph, which is an item graph converted by the item click sequence from the user browsing sessions. Ali Item Graph has 290 million vertices (items) and over 18 billion edges. We learn the source and target embedded vectors for each item and use them for an online recommendation service in Alibaba Group, which we will describe in more detail later. Some statistics about these graphs are summarized in Table 1.
Dataset Splits	No	The paper specifies training and test set splits (e.g., 'we remove 30% of edges... as ground truth in the test set, and take the remaining graph as the training set' and 'We remove 10% of the edges in the original graph as test set, and use the rest of the graph as the training data.'), but does not explicitly mention a separate validation set or provide details for a validation split.
Hardware Specification	No	The paper mentions evaluating the method on 'online recommendation services in Alibaba Group' and on a large 'Ali Item Graph' with '290 million vertices and 18 billion edges', implying substantial computational resources. However, it does not provide specific hardware details such as GPU or CPU models, or memory specifications used for running experiments.
Software Dependencies	No	The paper does not provide specific software dependency details such as library names with version numbers (e.g., Python, TensorFlow, PyTorch versions).
Experiment Setup	Yes	'Input: G(V, E, W), Jumping Factor α, Learning Rate η; Output: Embedded Vector of s_v, t_v for each v ∈ V' (from Algorithm 1) and 'we randomly samples k negative pairs according to a vertex distribution of PD(n).' and 'the number of dimensions is set to 128. In fact, we observe no significant improvement as the dimension size of our method goes beyond 32 for small datasets in both tasks.'
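
The Dataset Splits row quotes an edge-holdout protocol (remove 30% or 10% of the edges as the test set and train on the rest). The following is a minimal sketch of such a split, not the authors' code: the function name `split_edges`, the uniform random sampling, the fixed seed, and the toy graph are all assumptions made for illustration, since the quoted text does not say how the removed edges are chosen.

```python
import random


def split_edges(edges, test_fraction, seed=0):
    """Hold out a random fraction of edges as a test set.

    Assumption: edges are sampled uniformly at random. The quoted
    text gives only the fractions (30% and 10%), not the procedure.
    """
    rng = random.Random(seed)      # fixed seed so the split is repeatable
    shuffled = list(edges)         # copy; leave the caller's list untouched
    rng.shuffle(shuffled)
    n_test = int(round(len(shuffled) * test_fraction))
    test = shuffled[:n_test]       # held-out edges (ground truth)
    train = shuffled[n_test:]      # remaining graph used for training
    return train, test


# Toy directed graph (hypothetical data, not one of the paper's datasets).
edges = [(u, v) for u in range(10) for v in range(10) if u != v]
train, test = split_edges(edges, test_fraction=0.3)
print(len(train), len(test))
```

Because edges are kept as ordered pairs, the same helper works for directed graphs such as Cora or Epinions. For an undirected graph, each edge would need to be stored once before splitting so that both directions land on the same side of the split.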