Are More Layers Beneficial to Graph Transformers?

Authors: Haiteng Zhao, Shuming Ma, Dongdong Zhang, Zhi-Hong Deng, Furu Wei

ICLR 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experiments show that our method unblocks the depth limitation of graph transformers and results in state-of-the-art performance across various graph benchmarks with deeper models.
Researcher Affiliation | Collaboration | Haiteng Zhao (Peking University), Shuming Ma (Microsoft Research), Dongdong Zhang (Microsoft Research), Zhi-Hong Deng (Peking University), Furu Wei (Microsoft Research)
Pseudocode | No | The paper describes methods in text and mathematical equations but does not contain explicitly labeled pseudocode or algorithm blocks.
Open Source Code | Yes | Codes are available at https://github.com/zhao-ht/DeepGraph.
Open Datasets | Yes | Our method is validated on the tasks of graph property prediction and node classification, specifically including PCQM4M-LSC (Hu et al., 2020), ZINC (Dwivedi et al., 2020), CLUSTER (Dwivedi et al., 2020), and PATTERN (Dwivedi et al., 2020), widely used in graph transformer studies.
Dataset Splits | No | The paper mentions using standard datasets but does not explicitly provide details about the train, validation, or test splits (e.g., percentages, sample counts, or explicit references to predefined splits).
Hardware Specification | No | The paper does not provide specific details about the hardware used to run the experiments, such as GPU or CPU models.
Software Dependencies | No | The paper mentions using the "Adam optimizer" and the "Python package graph-tool" but does not provide specific version numbers for these or other software dependencies.
Experiment Setup | Yes | We implement DeepGraph with 12, 24, and 48 layers. The hidden dimension is 80 for ZINC and PATTERN, 48 for CLUSTER, and 768 for PCQM4M-LSC. The training uses the Adam optimizer, with warm-up and decaying learning rates. Reported results are averaged over 4 seeds.
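The Experiment Setup row reports concrete hyperparameters (layer counts, per-dataset hidden dimensions, Adam with a warm-up and decaying learning rate, averaging over 4 seeds), so a small configuration sketch may help readers attempting to reproduce it. This is a minimal sketch assuming a generic PyTorch-style training loop; the dictionary layout, the warmup_then_decay helper, and the specific warm-up length, peak learning rate, and decay shape are hypothetical illustrations, not values taken from the paper or the DeepGraph repository.

```python
import torch

# Hyperparameters as reported in the Experiment Setup row above.
DEPTHS = [12, 24, 48]                 # number of transformer layers
HIDDEN_DIM = {
    "ZINC": 80,
    "PATTERN": 80,
    "CLUSTER": 48,
    "PCQM4M-LSC": 768,
}
NUM_SEEDS = 4                         # results averaged over 4 seeds


def warmup_then_decay(step, warmup_steps=4000, peak_lr=2e-4):
    """Hypothetical schedule: linear warm-up followed by inverse-sqrt decay.
    The paper only states "warm-up and decaying learning rates"; the exact
    shape, step count, and peak value here are placeholders."""
    if step < warmup_steps:
        return peak_lr * step / warmup_steps
    return peak_lr * (warmup_steps / step) ** 0.5


# Placeholder model standing in for a 12-layer graph transformer on ZINC;
# the real architecture is defined in the authors' repository.
model = torch.nn.Linear(HIDDEN_DIM["ZINC"], HIDDEN_DIM["ZINC"])
optimizer = torch.optim.Adam(model.parameters(), lr=warmup_then_decay(1))

for step in range(1, 101):            # toy training-loop skeleton
    for group in optimizer.param_groups:
        group["lr"] = warmup_then_decay(step)
    # ... forward pass, loss, backward pass, optimizer.step() would go here ...
```

The schedule is applied by overwriting the optimizer's learning rate each step, which keeps the sketch self-contained without assuming any particular scheduler class.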