Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).
GraphFM: Improving Large-Scale GNN Training via Feature Momentum
Authors: Haiyang Yu, Limei Wang, Bokun Wang, Meng Liu, Tianbao Yang, Shuiwang Ji
ICML 2022 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Empirically, we observe that GraphFM-IB can effectively alleviate the neighborhood explosion problem of existing methods. In addition, GraphFM-OB achieves promising performance on multiple large-scale graph datasets. |
| Researcher Affiliation | Academia | 1Department of Computer Science & Engineering, Texas A&M University, TX, USA 2Department of Computer Science, The University of Iowa, IA, USA. |
| Pseudocode | Yes | Algorithm 1 (GraphFM-IB) and Algorithm 2 (GraphFM-OB) are provided in the paper. |
| Open Source Code | Yes | Our code is implemented in the DIG (Dive into Graphs) library (Liu et al., 2021), which is a turnkey library for graph deep learning research and publicly available (footnote 1: https://github.com/divelab/DIG/tree/dig/dig/lsgraph). |
| Open Datasets | Yes | We evaluate our proposed algorithms GraphFM-IB and GraphFM-OB with extensive experiments on the node classification task on five large-scale graphs, including Flickr (Zeng et al., 2019), Yelp (Zeng et al., 2019), Reddit (Hamilton et al., 2017), ogbn-arxiv (Hu et al., 2021) and ogbn-products (Hu et al., 2021). |
| Dataset Splits | Yes | Table 1: Statistics and properties of the datasets. The m denotes the multi-label classification task, and s denotes single label classification task. Dataset # of nodes # of edges Avg. degree # of features # of classes Train/Val/Test ... Flickr ... 0.500/0.250/0.250 |
| Hardware Specification | Yes | In addition, we conduct our experiments on Nvidia GeForce RTX 2080 with 11GB memory, and Intel Xeon Gold 6248 CPU. |
| Software Dependencies | No | The paper states: 'The implementation of our methods is based on the PyTorch (Paszke et al., 2019), and Pytorch_geometric (Fey & Lenssen, 2019).' Both libraries are cited by publication year only; no version number is given for either PyTorch or Pytorch_geometric. |
| Experiment Setup | Yes | We explore the feature momentum hyper-parameter β in the range from 0.1 to 0.9. We select the learning rate from {0.01, 0.05, 0.001} and dropout from {0.0, 0.1, 0.3, 0.5}. Due to the over-fitting problem on the ogbn-products dataset, we set the edge drop (Rong et al., 2019) ratio at 0.8 during training for this particular dataset. |
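
The search space quoted in the Experiment Setup row can be written out as an explicit grid. The following is a minimal sketch, not the authors' tuning code: the names `BETAS`, `EDGE_DROP`, and `build_grid` are hypothetical, and the 0.1 step for β is an assumption, since the quoted text gives only the range 0.1 to 0.9. The edge drop ratio for datasets other than ogbn-products is not stated in the quote, so it is left as `None`.

```python
from itertools import product

# Values quoted in the Experiment Setup row.
# ASSUMPTION: the source gives beta only as a range (0.1 to 0.9);
# the 0.1 step used here is a guess, not taken from the paper.
BETAS = [i / 10 for i in range(1, 10)]
LEARNING_RATES = [0.01, 0.05, 0.001]
DROPOUTS = [0.0, 0.1, 0.3, 0.5]

# Edge drop ratio is quoted for ogbn-products only.
EDGE_DROP = {"ogbn-products": 0.8}


def build_grid(dataset):
    """Return one config dict per (beta, lr, dropout) combination."""
    # None means "not stated in the quoted text" for this dataset.
    edge_drop = EDGE_DROP.get(dataset)
    return [
        {
            "dataset": dataset,
            "beta": beta,
            "lr": lr,
            "dropout": dropout,
            "edge_drop": edge_drop,
        }
        for beta, lr, dropout in product(BETAS, LEARNING_RATES, DROPOUTS)
    ]


if __name__ == "__main__":
    grid = build_grid("ogbn-products")
    print(len(grid), "configurations; first:", grid[0])
```

Under the assumed β step this gives 9 × 3 × 4 = 108 configurations per dataset. Whether the authors searched the full Cartesian product or tuned each hyper-parameter separately is not stated in the quoted text.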