CAT-Walk: Inductive Hypergraph Learning via Set Walks

Authors: Ali Behrouz, Farnoosh Hashemi, Sadaf Sadeghian, Margo Seltzer

NeurIPS 2023 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our evaluation on 10 hypergraph benchmark datasets shows that CAT-Walk attains outstanding performance on temporal hyperedge prediction benchmarks in both inductive and transductive settings. It also shows competitive performance with state-of-the-art methods for node classification.
Researcher Affiliation | Academia | Ali Behrouz, Department of Computer Science, University of British Columbia, alibez@cs.ubc.ca
Pseudocode | Yes | The pseudocode of our Set Walk sampling algorithm and its complexity analysis are in Appendix D.
Open Source Code | Yes | (Code)
Open Datasets | Yes | We use 10 available benchmark datasets [6] from the existing hypergraph neural networks literature. The domains of these datasets include drug networks (i.e., NDC [6]), contact networks (i.e., High School [92] and Primary School [93]), the U.S. Congress bills network [94, 95], email networks (i.e., Email Enron [6] and Email Eu [96]), and online social networks (i.e., Question Tags and Users-Threads [6]). Detailed descriptions of these datasets appear in Appendix F.1.
Dataset Splits | Yes | In the transductive setting, we train on the temporal hyperedges with timestamps less than or equal to Ttrain and test on those with timestamps greater than Ttrain. Inspired by Wang et al. [33], we consider two inductive settings. In the Strongly Inductive setting, we predict hyperedges consisting of some unseen nodes. In the Weakly Inductive setting, we predict hyperedges with at least one seen and some unseen nodes. We first follow the procedure used in the transductive setting, and then we randomly select 10% of the nodes and remove all hyperedges that include them from the training set. We then remove all hyperedges with seen nodes from the validation and testing sets. For dynamic node classification, see Appendix G.2. For all datasets, we fix Ttrain = 0.7 T, where T is the last timestamp. (A code sketch of this split procedure follows the table.)
Hardware Specification | Yes | We implemented our method in Python 3.7 with PyTorch and ran the experiments on a Linux machine with an NVIDIA RTX A4000 GPU with 16GB of RAM.
Software Dependencies | Yes | We implemented our method in Python 3.7 with PyTorch and ran the experiments on a Linux machine with an NVIDIA RTX A4000 GPU with 16GB of RAM.
Experiment Setup | Yes | On all datasets, we use a batch size of 64 and set the learning rate to 10^-4. We also use an early stopping strategy to stop training if the validation performance does not increase for more than 5 epochs. We use a maximum of 30 training epochs and dropout layers with rate 0.1. Other hyperparameters used in the implementation can be found in the README file in the supplement. For tuning the model's hyperparameters, we systematically tune them using grid search; the search domains of each hyperparameter are reported in Table 4. (A sketch of this training regime follows the table.)
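
The split protocol described in the Dataset Splits row can be illustrated with a short sketch. This is a minimal illustration under the stated protocol (temporal cutoff Ttrain = 0.7 T for the transductive split; 10% of nodes held out as unseen for the inductive settings); the function names and the (nodes, timestamp) hyperedge representation are assumptions for illustration, not the authors' code, and the strong/weak filters reflect one plausible reading of the quoted text.

```python
# Minimal sketch of the split protocol quoted above; not the authors' code.
# A temporal hyperedge is represented as (nodes, timestamp); names are hypothetical.
import random

def temporal_split(hyperedges, train_frac=0.7):
    """Transductive split: train on t <= T_train, evaluate on t > T_train."""
    T = max(t for _, t in hyperedges)      # last timestamp in the data
    T_train = train_frac * T               # the paper fixes T_train = 0.7 T
    train = [(e, t) for e, t in hyperedges if t <= T_train]
    eval_set = [(e, t) for e, t in hyperedges if t > T_train]
    return train, eval_set

def inductive_split(train, eval_set, unseen_frac=0.1, mode="strong", seed=0):
    """Hold out a fraction of nodes as 'unseen': drop their hyperedges from
    training, then keep only evaluation hyperedges that match the setting."""
    rng = random.Random(seed)
    nodes = sorted({v for e, _ in train for v in e})
    unseen = set(rng.sample(nodes, int(unseen_frac * len(nodes))))
    train_masked = [(e, t) for e, t in train if not (set(e) & unseen)]
    kept = []
    for e, t in eval_set:
        ns = set(e)
        if mode == "strong" and ns and ns <= unseen:                # only unseen nodes
            kept.append((e, t))
        elif mode == "weak" and (ns & unseen) and (ns - unseen):    # mix of seen and unseen
            kept.append((e, t))
    return train_masked, kept
```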
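The training setup in the Experiment Setup row (batch size 64, learning rate 10^-4, at most 30 epochs, early stopping after 5 epochs without validation improvement, dropout 0.1 inside the model) corresponds to a standard PyTorch loop along the following lines. The `model.loss` and `evaluate` interfaces are hypothetical placeholders, not the authors' implementation.

```python
# Sketch of the reported training regime; model/loader/metric interfaces are assumed.
import torch

def train(model, train_loader, val_loader, evaluate,
          max_epochs=30, patience=5, lr=1e-4):
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    best_val, stale = float("-inf"), 0
    for epoch in range(max_epochs):
        model.train()
        for batch in train_loader:            # DataLoader built with batch_size=64
            optimizer.zero_grad()
            loss = model.loss(batch)          # hypothetical per-batch loss
            loss.backward()
            optimizer.step()
        model.eval()
        with torch.no_grad():
            val_score = evaluate(model, val_loader)   # e.g., AUC on validation hyperedges
        if val_score > best_val:
            best_val, stale = val_score, 0
        else:
            stale += 1
            if stale > patience:              # no improvement for more than 5 epochs
                break
    return model
```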