Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Large Scale Learning on Non-Homophilous Graphs: New Benchmarks and Strong Simple Methods
Authors: Derek Lim, Felix Hohne, Xiuyu Li, Sijia Linda Huang, Vaishnavi Gupta, Omkar Bhalerao, Ser-Nam Lim
NeurIPS 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experimental results with representative simple methods and GNNs across our proposed datasets show that LINKX achieves state-of-the-art performance for learning on non-homophilous graphs. |
| Researcher Affiliation | Collaboration | Derek Lim Cornell University EMAIL Felix Hohne Cornell University EMAIL Xiuyu Li Cornell University EMAIL Sijia Linda Huang Cornell University EMAIL Vaishnavi Gupta Cornell University EMAIL Omkar Bhalerao Cornell University EMAIL Ser-Nam Lim Facebook AI EMAIL |
| Pseudocode | No | The paper describes the LINKX model with a diagram, but it does not include a formal pseudocode or algorithm block. |
| Open Source Code | Yes | Our codes and data are available at https://github.com/CUAI/Non-Homophily-Large-Scale. |
| Open Datasets | Yes | Our codes and data are available at https://github.com/CUAI/Non-Homophily-Large-Scale. Here, we detail the non-homophilous datasets that we propose for graph machine learning evaluation. Our datasets and tasks span diverse application areas. Penn94 [67], Pokec [41], genius [43], and twitch-gamers [60] are online social networks, where the task is to predict reported gender, certain account labels, or use of explicit content on user accounts. For the citation networks arXiv-year [31] and snap-patents [42, 41] the goal is to predict year of paper publication or the year that a patent is granted. |
| Dataset Splits | Yes | We run each method on the same five random 50/25/25 train/val/test splits for each dataset. |
| Hardware Specification | Yes | This is especially important on the scale of the wiki dataset, where none of our tested methods other than MLP is capable of running on a Titan RTX GPU with 24 GB GPU RAM (see Section 5). |
| Software Dependencies | No | We implement our models using PyTorch [56] and PyTorch Geometric [23]. We use the Optuna framework [2] for hyperparameter optimization and WandB [40] for logging. No specific version numbers for these software dependencies are provided. |
| Experiment Setup | Yes | All methods requiring gradient-based optimization are run for 500 epochs, with test performance reported for the learned parameters of highest validation performance... All methods are trained for 500 epochs with Adam optimizer [37] using a learning rate of 0.01 and a weight decay of 0.0005 (unless otherwise specified). Dropout [65] with p = 0.5 and ELU [19] activations are used for all hidden layers of all MLPs and GNNs. |
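The evaluation protocol quoted in the Dataset Splits row (the same five random 50/25/25 train/val/test splits for every method) can be sketched in plain Python. This is a minimal illustration of that protocol, not the authors' code: the function name, seeding scheme, and return format are all hypothetical.

```python
import random

def make_splits(num_nodes, num_splits=5, frac=(0.5, 0.25, 0.25), seed=0):
    """Generate `num_splits` random train/val/test node-index splits.

    Mirrors the reported protocol of five random 50/25/25 splits shared
    across all methods. Helper name and seeding are illustrative only.
    """
    rng = random.Random(seed)  # fixed seed so every method sees the same splits
    splits = []
    for _ in range(num_splits):
        idx = list(range(num_nodes))
        rng.shuffle(idx)
        n_train = int(frac[0] * num_nodes)
        n_val = int(frac[1] * num_nodes)
        splits.append({
            "train": idx[:n_train],
            "val": idx[n_train:n_train + n_val],
            "test": idx[n_train + n_val:],
        })
    return splits
```

Fixing the random seed is what makes the splits reusable across methods, so reported differences reflect the models rather than the data partition.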