Neural Approximation of Graph Topological Features
Authors: Zuoyu Yan, Tengfei Ma, Liangcai Gao, Zhi Tang, Yusu Wang, Chao Chen
NeurIPS 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this section, we thoroughly evaluate the proposed model from 3 different perspectives. In Section 5.1, we evaluate the approximation error between the predicted diagram and the original diagram and show that the prediction is very close to the ground truth. Even with a small approximation error, we still need to know how much the error influences downstream tasks. Therefore, in Section 5.2, we evaluate the learning power of the predicted diagrams through 2 downstream graph representation learning tasks: node classification and link prediction. We observe that the model using the predicted diagrams performs comparably with the model using the ground truth diagrams. In Section 5.3, we evaluate the efficiency of the proposed algorithm. Experiments demonstrate that the proposed method is much faster than the original algorithm, especially on large and dense graphs. |
| Researcher Affiliation | Collaboration | Zuoyu Yan, Wangxuan Institute of Computer Technology, Peking University, yanzuoyu3@pku.edu.cn; Tengfei Ma, IBM T. J. Watson Research Center, tengfei.ma1@ibm.com; Liangcai Gao, Wangxuan Institute of Computer Technology, Peking University, glc@pku.edu.cn; Zhi Tang, Wangxuan Institute of Computer Technology, Peking University, tangzhi@pku.edu.cn; Yusu Wang, Halıcıoğlu Data Science Institute, University of California San Diego, yusuwang@ucsd.edu; Chao Chen, Department of Biomedical Informatics, Stony Brook University, chao.chen.1@stonybrook.edu |
| Pseudocode | Yes | Algorithm 1 Sequential algorithm, Algorithm 2 Computation of EPD, Algorithm 3 Union-Find-step (Sequential); a generic union-find sketch follows the table |
| Open Source Code | Yes | Source code is available at https://github.com/pkuyzy/TLC-GNN. |
| Open Datasets | Yes | Datasets. ... The input graphs include (1) citation networks including Cora, Citeseer, and PubMed [36]; (2) Amazon shopping datasets including Photo and Computers [37]; (3) coauthor datasets including CS and Physics [37]. (A loading sketch follows the table.) |
| Dataset Splits | Yes | Given an input graph (e.g., Cora, Citeseer, etc.) and a filter function, we extract the k-hop neighborhoods of all the vertices and separate these vicinity graphs randomly into 80%/20% as training/test sets. ... For node classification, our setting is the same as [22, 42, 56]. To be specific, we train the GNNs with 20 nodes from each class and validate (resp. test) the GNN on 500 (resp. 1000) nodes. ... For link prediction, our setting is the same as [6, 50]. To be precise, we randomly split existing edges into 85%/5%/10% for training, validation, and test sets. (A split sketch follows the table.) |
| Hardware Specification | Yes | Did you include the total amount of compute and the type of resources used (e.g., type of GPUs, internal cluster, or cloud provider)? [Yes] It is provided in the supplementary material. |
| Software Dependencies | No | The paper mentions software like GNNs, GIN, GAT, and external tools like Gudhi, but does not provide specific version numbers for these or other key software dependencies in the main text. |
| Experiment Setup | Yes | For node classification, our setting is the same as [22, 42, 56]. To be specific, we train the GNNs with 20 nodes from each class and validate (resp. test) the GNN on 500 (resp. 1000) nodes. ... Did you specify all the training details (e.g., data splits, hyperparameters, how they were chosen)? [Yes] It is illustrated in Section 5 in the paper and in the supplementary material. |
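The Pseudocode row mentions a Union-Find step (Algorithm 3) used when computing extended persistence diagrams (EPDs). The paper's algorithms are not reproduced here; below is a minimal, generic union-find sketch (path compression plus union by rank) of the primitive such a computation builds on. The class and method names are illustrative, not the authors'.

```python
# Illustrative sketch only: a generic union-find data structure.
# The paper's Algorithm 3 ("Union-Find-step") processes vertices in
# filter-value order on top of such a structure; that ordering logic
# is not shown here.

class UnionFind:
    def __init__(self, n):
        self.parent = list(range(n))
        self.rank = [0] * n

    def find(self, x):
        # Path halving: shortcut x toward its root while walking up.
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]
            x = self.parent[x]
        return x

    def union(self, x, y):
        rx, ry = self.find(x), self.find(y)
        if rx == ry:
            return False  # x and y are already in one component
        if self.rank[rx] < self.rank[ry]:
            rx, ry = ry, rx
        self.parent[ry] = rx  # attach the shallower tree under the deeper
        if self.rank[rx] == self.rank[ry]:
            self.rank[rx] += 1
        return True
```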
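All graphs named in the Open Datasets row ship with PyTorch Geometric. A hedged loading sketch, assuming `torch_geometric` is installed and using a hypothetical cache directory `./data`; the authors' repository may load the data differently:

```python
# Sketch: loading the benchmark graphs named in the paper via PyTorch
# Geometric dataset classes. "./data" is a hypothetical cache directory.
from torch_geometric.datasets import Planetoid, Amazon, Coauthor

root = "./data"
citation = [Planetoid(root, name=n) for n in ("Cora", "Citeseer", "PubMed")]
shopping = [Amazon(root, name=n) for n in ("Photo", "Computers")]
coauthor = [Coauthor(root, name=n) for n in ("CS", "Physics")]

graph = citation[0][0]  # a single Data object with x, edge_index, y
print(graph.num_nodes, graph.num_edges)
```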
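The Dataset Splits row quotes two random partitions: 80%/20% over extracted vicinity graphs, and 85%/5%/10% over existing edges for link prediction. A self-contained sketch of such index splitting; the `split_indices` helper is hypothetical, and the authors' exact splitting code lives in their repository:

```python
# Hypothetical helper: randomly partition n items by given fractions.
import random

def split_indices(n, fractions=(0.85, 0.05, 0.10), seed=0):
    """Shuffle range(n) and cut it into train/val/test slices."""
    rng = random.Random(seed)
    idx = list(range(n))
    rng.shuffle(idx)
    n_train = int(fractions[0] * n)
    n_val = int(fractions[1] * n)
    return idx[:n_train], idx[n_train:n_train + n_val], idx[n_train + n_val:]

# 80%/20% train/test over vicinity graphs (no validation set):
train_g, _, test_g = split_indices(1000, fractions=(0.80, 0.0, 0.20))
# 85%/5%/10% train/val/test over existing edges:
train_e, val_e, test_e = split_indices(5000)
```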