Degree-Quant: Quantization-Aware Training for Graph Neural Networks

Authors: Shyam Anil Tailor, Javier Fernandez-Marques, Nicholas Donald Lane

ICLR 2021

Reproducibility Variable Result LLM Response
Research Type Experimental We validate our method on six datasets and show, unlike previous attempts, that models generalize to unseen graphs.
Researcher Affiliation Collaboration Shyam A. Tailor, Department of Computer Science & Technology, University of Cambridge; Javier Fernandez-Marques, Department of Computer Science, University of Oxford; Nicholas D. Lane, Department of Computer Science and Technology, University of Cambridge & Samsung AI Center
Pseudocode Yes Algorithm 1 Degree-Quant (DQ). Functions accepting a protective mask m perform only the masked computations at full precision: intermediate tensors are not quantized. At test time protective masking is disabled. In fig. 11 (in the Appendix) we show with a diagram how a GCN layer makes use of DQ.
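The protective-masking idea described above can be sketched as follows. This is a hypothetical illustration, not the paper's Algorithm 1: the function name `protective_mask`, the degree-rank interpolation between `p_min` and `p_max`, and the return convention (True means "keep this node at full precision") are all assumptions made for this sketch.

```python
import torch


def protective_mask(edge_index, num_nodes, p_min, p_max):
    """Sample a per-node protective mask (hypothetical sketch).

    Nodes with higher in-degree are given a higher probability of being
    protected, i.e. computed at full precision during quantization-aware
    training. The exact probability schedule here (linear in degree rank)
    is an assumption, not the paper's formula.
    """
    # Compute in-degree from a COO edge index of shape [2, num_edges].
    deg = torch.zeros(num_nodes)
    deg.scatter_add_(0, edge_index[1], torch.ones(edge_index.size(1)))

    # Normalized degree rank in [0, 1]; highest-degree node gets rank 1.
    ranks = deg.argsort().argsort().float() / max(num_nodes - 1, 1)

    # Interpolate each node's protection probability between p_min and p_max.
    p = p_min + (p_max - p_min) * ranks

    # True => this node's computations stay at full precision this step.
    return torch.bernoulli(p).bool()
```

At test time the mask would simply be all-False, so every node runs through the quantized path.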
Open Source Code Yes We provide code at this URL: https://github.com/camlsys/degree-quant.
Open Datasets Yes The datasets used were Cora, CiteSeer, ZINC, MNIST and CIFAR10 superpixels, and REDDIT-BINARY.
Dataset Splits Yes We use standard splits for MNIST, CIFAR-10 and ZINC. For citation datasets (Cora and Citeseer), we use the splits used by Kipf & Welling (2017). For REDDIT-BINARY we use 10-fold cross validation.
Hardware Specification Yes Our experiments ran on several machines in our SLURM cluster using Intel CPUs and NVIDIA GPUs. Each machine was running Ubuntu 18.04. The GPU models in our cluster were: V100, RTX 2080Ti and GTX 1080Ti.
Software Dependencies Yes Our infrastructure was implemented using PyTorch Geometric (PyG) (Fey & Lenssen, 2019). These snippets should be compatible with Python 3.7 and PyTorch Geometric version 1.4.3.
Experiment Setup Yes Hyperparameters searched over were learning rate, weight decay, and dropout (Srivastava et al., 2014) and drop-edge (Rong et al., 2020) probabilities. [...] Degree-Quant requires searching for two additional hyperparameters, pmin and pmax; these were tuned in a grid-search fashion.
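The grid search over the two extra hyperparameters could be sketched as below. This is a minimal illustration, not the authors' tuning script: `train_and_eval` is a placeholder standing in for one full training run that returns a validation score, and the constraint pmin ≤ pmax is an assumption about which grid points are valid.

```python
from itertools import product


def grid_search(train_and_eval, p_min_grid, p_max_grid):
    """Exhaustive grid search over (p_min, p_max) pairs (sketch).

    train_and_eval(p_min, p_max) is a hypothetical callable that trains a
    model with those protection probabilities and returns a validation
    score (higher is better). Returns (best_score, best_p_min, best_p_max).
    """
    best = None
    for p_min, p_max in product(p_min_grid, p_max_grid):
        if p_min > p_max:
            continue  # assumed invalid ordering; skip
        score = train_and_eval(p_min, p_max)
        if best is None or score > best[0]:
            best = (score, p_min, p_max)
    return best
```

In practice each call to `train_and_eval` would be a full quantization-aware training run, so the grids are kept small.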