GraphFormers: GNN-nested Transformers for Representation Learning on Textual Graph

Authors: Junhan Yang, Zheng Liu, Shitao Xiao, Chaozhuo Li, Defu Lian, Sanjay Agrawal, Amit Singh, Guangzhong Sun, Xing Xie

NeurIPS 2021 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive evaluations are conducted on three large-scale benchmark datasets, where GraphFormers outperform the SOTA baselines with comparable running efficiency.
Researcher Affiliation | Collaboration | University of Science and Technology of China, Hefei, China; Microsoft Research Asia, Beijing, China; Beijing University of Posts and Telecommunications, Beijing, China; Microsoft India Development Center, Bengaluru, India
Pseudocode | Yes | Algorithm 1: GraphFormers Workflow (see the illustrative sketch after the table).
Open Source Code | Yes | The source code is released at https://github.com/microsoft/GraphFormers.
Open Datasets | Yes | DBLP, which contains the paper citation graph from DBLP up to 2020-04-09; the paper's title is used as the textual feature. Wikidata5M (Wiki) (Wang et al., 2019b), which contains the entity graph from Wikipedia; the first sentence in each entity's introduction is taken as its textual feature.
Dataset Splits | Yes | Table 1: Specifications of the experimental datasets: the number of items, the number of neighbour nodes on average, and the number of training, validation, and testing cases. ... Product ... #Train 22,146,934 #Valid 30,000 #Test 306,742
Hardware Specification | Yes | The evaluation is made with an Nvidia P100 GPU.
Software Dependencies | No | The paper mentions software components like 'UniLM-base', 'BERT-like PLM', and 'WordPiece' but does not specify their version numbers or the versions of general frameworks like PyTorch or TensorFlow.
Experiment Setup | Yes | In our experiment, each text is associated with 5 uniformly sampled neighbours (without replacement); for texts with a neighbourhood smaller than 5, all the neighbours will be utilized. ... We use the common MLM strategy, where 15% of the input tokens are masked: 80% of them are replaced by [MASK], and the rest are replaced randomly or kept as the original tokens with equal probability. ... Each mini-batch contains 32 encoding instances; each instance contains one center and #N neighbour nodes; the token length of each node is 16.
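
The Pseudocode row above points to Algorithm 1 (the GraphFormers workflow) but does not reproduce it. Below is a minimal, illustrative sketch of a single GNN-nested transformer layer, assuming a simplified fusion scheme: at every layer, the first-token (CLS) states of the centre node and its neighbours are aggregated by multi-head attention over the node set, and each node's graph-aware embedding is prepended to its token sequence before a standard transformer layer is applied. The class and tensor names are hypothetical; consult Algorithm 1 in the paper and the released code for the exact formulation.

```python
import torch
import torch.nn as nn


class GNNNestedLayer(nn.Module):
    """One GNN-nested transformer layer (illustrative sketch, not the released code)."""

    def __init__(self, hidden: int = 768, heads: int = 12):
        super().__init__()
        # Graph aggregation over the node-level (first-token) embeddings.
        self.graph_attn = nn.MultiheadAttention(hidden, heads, batch_first=True)
        # Standard transformer encoder layer applied to each node's token sequence.
        self.text_layer = nn.TransformerEncoderLayer(
            d_model=hidden, nhead=heads, dim_feedforward=4 * hidden, batch_first=True
        )

    def forward(self, token_states: torch.Tensor) -> torch.Tensor:
        # token_states: [batch, nodes, seq_len, hidden]; node 0 is the centre node.
        b, n, s, h = token_states.shape
        # 1) Node embeddings = first-token states of every node in the instance.
        cls_states = token_states[:, :, 0, :]                        # [b, n, h]
        # 2) Exchange information across nodes (the nested GNN step).
        graph_states, _ = self.graph_attn(cls_states, cls_states, cls_states)
        # 3) Prepend each node's graph-aware embedding to its token sequence.
        augmented = torch.cat([graph_states.unsqueeze(2), token_states], dim=2)
        # 4) Run the transformer layer per node, then drop the extra graph token.
        out = self.text_layer(augmented.reshape(b * n, s + 1, h))
        return out[:, 1:, :].reshape(b, n, s, h)


# Example: 2 instances, 1 centre + 5 neighbours, 16 tokens per node (as in the setup above).
layer = GNNNestedLayer()
x = torch.randn(2, 6, 16, 768)
y = layer(x)   # same shape as x; stack layers and read out the centre node's first token
```

Stacking several such layers and taking the centre node's first-token state after the last layer would yield a node representation in the spirit of the paper; again, this is a sketch under the stated assumptions, not the authors' implementation.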
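
The Experiment Setup row quotes the paper's neighbour-sampling and MLM-masking recipe. The snippet below sketches those two steps under the stated numbers: 5 neighbours sampled uniformly without replacement, 15% of tokens masked, of which 80% become [MASK] and the remaining 20% are split evenly between random replacement and keeping the original token. The helper names, the [MASK] id, and the vocabulary size are illustrative assumptions, not values taken from the paper or its code.

```python
import random

MASK_ID = 103          # assumed [MASK] token id (BERT-style WordPiece vocabulary)
VOCAB_SIZE = 30522     # assumed vocabulary size


def sample_neighbours(neighbour_ids, k=5):
    """Uniformly sample k neighbours without replacement; keep all if fewer than k exist."""
    if len(neighbour_ids) <= k:
        return list(neighbour_ids)
    return random.sample(neighbour_ids, k)


def mask_tokens(token_ids, mask_prob=0.15):
    """BERT-style MLM masking: 15% of positions are selected; of those,
    80% -> [MASK], 10% -> random token, 10% -> kept unchanged."""
    inputs, labels = list(token_ids), []
    for i, tok in enumerate(token_ids):
        if random.random() < mask_prob:
            labels.append(tok)                            # predict the original token here
            r = random.random()
            if r < 0.8:
                inputs[i] = MASK_ID                       # replace with [MASK]
            elif r < 0.9:
                inputs[i] = random.randrange(VOCAB_SIZE)  # replace with a random token
            # else: keep the original token unchanged
        else:
            labels.append(-100)                           # position ignored by the MLM loss
    return inputs, labels
```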