Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Large Scale Learning on Non-Homophilous Graphs: New Benchmarks and Strong Simple Methods
Authors: Derek Lim, Felix Hohne, Xiuyu Li, Sijia Linda Huang, Vaishnavi Gupta, Omkar Bhalerao, Ser-Nam Lim
NeurIPS 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experimental results with representative simple methods and GNNs across our proposed datasets show that LINKX achieves state-of-the-art performance for learning on non-homophilous graphs. |
| Researcher Affiliation | Collaboration | Derek Lim Cornell University EMAIL Felix Hohne Cornell University EMAIL Xiuyu Li Cornell University EMAIL Sijia Linda Huang Cornell University EMAIL Vaishnavi Gupta Cornell University EMAIL Omkar Bhalerao Cornell University EMAIL Ser-Nam Lim Facebook AI EMAIL |
| Pseudocode | No | The paper describes the LINKX model with a diagram, but it does not include a formal pseudocode or algorithm block. |
| Open Source Code | Yes | Our codes and data are available at https://github.com/CUAI/Non-Homophily-Large-Scale. |
| Open Datasets | Yes | Our codes and data are available at https://github.com/CUAI/Non-Homophily-Large-Scale. Here, we detail the non-homophilous datasets that we propose for graph machine learning evaluation. Our datasets and tasks span diverse application areas. Penn94 [67], Pokec [41], genius [43], and twitch-gamers [60] are online social networks, where the task is to predict reported gender, certain account labels, or use of explicit content on user accounts. For the citation networks arXiv-year [31] and snap-patents [42, 41] the goal is to predict year of paper publication or the year that a patent is granted. |
| Dataset Splits | Yes | We run each method on the same five random 50/25/25 train/val/test splits for each dataset. |
| Hardware Specification | Yes | This is especially important on the scale of the wiki dataset, where none of our tested methods other than MLP is capable of running on a Titan RTX GPU with 24 GB GPU RAM (see Section 5). |
| Software Dependencies | No | We implement our models using PyTorch [56] and PyTorch Geometric [23]. We use the Optuna framework [2] for hyperparameter optimization and WandB [40] for logging. No specific version numbers for these software dependencies are provided. |
| Experiment Setup | Yes | All methods requiring gradient-based optimization are run for 500 epochs, with test performance reported for the learned parameters of highest validation performance... All methods are trained for 500 epochs with Adam optimizer [37] using a learning rate of 0.01 and a weight decay of 0.0005 (unless otherwise specified). Dropout [65] with p = 0.5 and ELU [19] activations are used for all hidden layers of all MLPs and GNNs. |
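The evaluation protocol quoted in the Dataset Splits row (the same five random 50/25/25 train/val/test splits for every method) can be sketched in plain Python. This is a minimal illustration of that protocol, not the authors' code: the function name, seeding scheme, and return format are all hypothetical.

```python
import random

def make_splits(num_nodes, num_splits=5, frac=(0.5, 0.25, 0.25), seed=0):
    """Generate `num_splits` random train/val/test node-index splits.

    Mirrors the reported protocol of five random 50/25/25 splits shared
    across all methods. Helper name and seeding are illustrative only.
    """
    rng = random.Random(seed)  # fixed seed so every method sees the same splits
    splits = []
    for _ in range(num_splits):
        idx = list(range(num_nodes))
        rng.shuffle(idx)
        n_train = int(frac[0] * num_nodes)
        n_val = int(frac[1] * num_nodes)
        splits.append({
            "train": idx[:n_train],
            "val": idx[n_train:n_train + n_val],
            "test": idx[n_train + n_val:],
        })
    return splits
```

Fixing the random seed is what makes the splits reusable across methods, so reported differences reflect the models rather than the data partition.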