Efficient Subgraph GNNs by Learning Effective Selection Policies

Authors: Beatrice Bevilacqua, Moshe Eliasof, Eli Meirom, Bruno Ribeiro, Haggai Maron

ICLR 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our experimental results demonstrate that POLICY-LEARN outperforms existing baselines across a wide range of datasets.
Researcher Affiliation | Collaboration | Beatrice Bevilacqua (Purdue University, bbevilac@purdue.edu); Moshe Eliasof (University of Cambridge, me532@cam.ac.uk); Eli Meirom (NVIDIA Research, emeirom@nvidia.com); Bruno Ribeiro (Purdue University, ribeiro@cs.purdue.edu); Haggai Maron (Technion & NVIDIA Research, hmaron@nvidia.com)
Pseudocode | Yes | Our method is illustrated in Figure 3 and described in Algorithm 1 (POLICY-LEARN: feedforward with learnable subgraph selection policy). A hedged sketch of such a forward pass appears after the table.
Open Source Code | Yes | Our code is available at https://github.com/beabevi/policy-learn
Open Datasets | Yes | We experimented with the ZINC-12K molecular dataset (Sterling & Irwin, 2015; Gómez-Bombarelli et al., 2018; Dwivedi et al., 2020), where, as prescribed, we maintain a 500k parameter budget. We tested our framework on several datasets from the OGB benchmark collection (Hu et al., 2020). To showcase the capabilities of POLICY-LEARN on large graphs, we experimented with the REDDIT-BINARY dataset (Morris et al., 2020a). A loading sketch appears after the table.
Dataset Splits | Yes | We used the evaluation procedure proposed in Xu et al. (2019), consisting of 10-fold cross-validation with the metric reported at the epoch of best validation accuracy averaged across the folds. We considered the challenging scaffold splits proposed in Hu et al. (2020), and for each dataset we used the loss and evaluation metric prescribed therein. We considered the dataset splits proposed in Dwivedi et al. (2020). A split-construction sketch appears after the table.
Hardware Specification | Yes | We ran our experiments on NVIDIA DGX V100, GeForce 2080, NVIDIA RTX A5000, NVIDIA RTX A6000, NVIDIA GeForce RTX 4090, and TITAN V GPUs.
Software Dependencies | No | We implemented POLICY-LEARN using PyTorch (Paszke et al., 2019) and PyTorch Geometric (Fey & Lenssen, 2019). The paper mentions the software packages used but does not provide specific version numbers.
Experiment Setup | Yes | For all models (FULL, RANDOM, POLICY-LEARN), we used the Adam optimizer with an initial learning rate of 0.001. We set the batch size to 128, except for the FULL method on MOLBACE and MOLTOX21, where we reduced it to 32 to avoid out-of-memory errors. We decay the learning rate by 0.5 every 300 epochs for all datasets except MOLHIV, where we followed the choices of Frasca et al. (2022), namely a constant learning rate, a downstream prediction network with 2 layers, embedding dimension 64, and dropout between layers with probability 0.5. We tuned the temperature parameter τ of the Gumbel-Softmax trick in {0.33, 0.66, 1, 2}. To prevent overconfidence in the probability distribution over nodes, we added dropout during training, with probability tuned in {0, 0.3, 0.5}. An optimization-setup sketch appears after the table.
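
The Pseudocode row points to Algorithm 1 (POLICY-LEARN: feedforward with learnable subgraph selection policy). Below is a minimal sketch of what such a forward pass might look like in plain PyTorch, assuming Gumbel-Softmax node selection and a node-marking subgraph GNN downstream. The names SimpleGNNLayer and PolicyLearnSketch are hypothetical, details such as avoiding repeated selections are omitted, and this is not the authors' implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class SimpleGNNLayer(nn.Module):
    """One dense message-passing layer: h' = ReLU(W1 h + W2 (A h))."""
    def __init__(self, dim):
        super().__init__()
        self.lin_self = nn.Linear(dim, dim)
        self.lin_neigh = nn.Linear(dim, dim)

    def forward(self, h, adj):
        return F.relu(self.lin_self(h) + self.lin_neigh(adj @ h))


class PolicyLearnSketch(nn.Module):
    """Iteratively select nodes with a Gumbel-Softmax policy, mark each
    selected node, run a shared GNN on every marked copy, and pool."""
    def __init__(self, in_dim, hid_dim, out_dim, num_selected=2, tau=1.0):
        super().__init__()
        self.embed = nn.Linear(in_dim + 1, hid_dim)  # +1 for the marking flag
        self.selector = SimpleGNNLayer(hid_dim)      # selection (policy) network
        self.score = nn.Linear(hid_dim, 1)
        self.gnn = SimpleGNNLayer(hid_dim)           # downstream subgraph GNN
        self.readout = nn.Linear(hid_dim, out_dim)
        self.num_selected = num_selected
        self.tau = tau

    def forward(self, x, adj):
        n = x.size(0)
        mark = torch.zeros(n, 1)
        subgraph_embs = []
        for _ in range(self.num_selected):
            # Policy: per-node logits -> (relaxed) one-hot node selection.
            h = self.selector(self.embed(torch.cat([x, mark], dim=-1)), adj)
            logits = self.score(h).squeeze(-1)
            sel = F.gumbel_softmax(logits, tau=self.tau, hard=True)  # (n,)
            mark = torch.clamp(mark + sel.unsqueeze(-1), max=1.0)    # mark selected node
            # Downstream GNN on the marked copy of the graph.
            z = self.gnn(self.embed(torch.cat([x, mark], dim=-1)), adj)
            subgraph_embs.append(z.mean(dim=0))
        return self.readout(torch.stack(subgraph_embs).mean(dim=0))


# Toy usage: 5 nodes with 3 features and a random dense adjacency matrix.
x = torch.randn(5, 3)
adj = (torch.rand(5, 5) > 0.5).float()
out = PolicyLearnSketch(in_dim=3, hid_dim=16, out_dim=2)(x, adj)
print(out.shape)  # torch.Size([2])
```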
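
The Open Datasets row names ZINC-12K, OGB graph-property datasets, and REDDIT-BINARY, all of which are available through standard loaders. A sketch, assuming PyTorch Geometric and the ogb package are installed; the root paths are placeholders:

```python
# Load the datasets named in the paper with standard loaders.
from torch_geometric.datasets import ZINC, TUDataset
from ogb.graphproppred import PygGraphPropPredDataset

# ZINC-12K: subset=True selects the 12K-graph subset used under the 500k budget.
zinc_train = ZINC(root="data/ZINC", subset=True, split="train")
zinc_val = ZINC(root="data/ZINC", subset=True, split="val")
zinc_test = ZINC(root="data/ZINC", subset=True, split="test")

# OGB molecular datasets, e.g. ogbg-molhiv; ogbg-molbace etc. follow the same pattern.
molhiv = PygGraphPropPredDataset(name="ogbg-molhiv", root="data/ogb")

# REDDIT-BINARY from the TUDataset collection (Morris et al., 2020a).
reddit_binary = TUDataset(root="data/TU", name="REDDIT-BINARY")
```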
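
The Dataset Splits row describes three protocols: 10-fold cross-validation following Xu et al. (2019), OGB scaffold splits, and the ZINC splits of Dwivedi et al. (2020). An illustrative sketch of how such splits are typically obtained, continuing from the loading sketch above (variable names are assumptions, not the authors' code):

```python
import numpy as np
import torch
from sklearn.model_selection import StratifiedKFold

# (1) 10-fold cross-validation as in Xu et al. (2019), e.g. for REDDIT-BINARY.
labels = [int(data.y) for data in reddit_binary]
folds = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)
for train_idx, test_idx in folds.split(np.zeros(len(labels)), labels):
    fold_train = reddit_binary[torch.tensor(train_idx)]
    fold_test = reddit_binary[torch.tensor(test_idx)]
    # ... train on fold_train and track per-epoch metrics on fold_test ...

# (2) OGB scaffold splits ship with the dataset (Hu et al., 2020).
split_idx = molhiv.get_idx_split()
train_set = molhiv[split_idx["train"]]
valid_set = molhiv[split_idx["valid"]]
test_set = molhiv[split_idx["test"]]

# (3) The ZINC-12K splits of Dwivedi et al. (2020) are built into the ZINC
#     loader through its `split` argument (see the loading sketch above).
```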
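
The Experiment Setup row specifies Adam with initial learning rate 0.001, a 0.5 learning-rate decay every 300 epochs, Gumbel-Softmax temperatures τ in {0.33, 0.66, 1, 2}, and dropout probabilities in {0, 0.3, 0.5}. A minimal sketch of that optimization setup in PyTorch; the placeholder model, the chosen hyperparameter values, and the exact placement of the dropout are assumptions of this sketch:

```python
import torch
import torch.nn.functional as F

model = torch.nn.Linear(4, 1)  # placeholder for the actual downstream GNN
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

# Decay the learning rate by 0.5 every 300 epochs; MOLHIV instead used a
# constant learning rate following Frasca et al. (2022).
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=300, gamma=0.5)

# Gumbel-Softmax temperature, tuned in {0.33, 0.66, 1, 2}.
tau = 0.66
node_logits = torch.randn(10)  # placeholder per-node selection scores
# Dropout with probability tuned in {0, 0.3, 0.5}; applying it to the logits is
# an assumption of this sketch, not a detail confirmed by the paper.
node_logits = F.dropout(node_logits, p=0.3, training=True)
selection = F.gumbel_softmax(node_logits, tau=tau, hard=True)

# Per epoch: iterate the training loader (batch size 128, or 32 for FULL on
# MOLBACE / MOLTOX21 to avoid out-of-memory errors), then call scheduler.step().
```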