Distribution Free Prediction Sets for Node Classification

Authors: Jase Clarkson

ICML 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We show through experiments on standard benchmark datasets using popular GNN models that our approach provides tighter and better calibrated prediction sets than a naive application of conformal prediction.
Researcher Affiliation | Academia | Department of Statistics, University of Oxford.
Pseudocode | No | The paper does not contain any pseudocode or clearly labeled algorithm blocks.
Open Source Code | Yes | The code is available at this link.
Open Datasets | Yes | We apply our method to three popular node classification datasets, namely Reddit2 and Flickr introduced in (Zeng et al., 2020) and Amazon Computers introduced in (Shchur et al., 2018).
Dataset Splits | Yes | Our train/validation/test splits for Flickr and Reddit2 were done using the splits given in the original papers (which are conveniently implemented in PyTorch Geometric (Fey and Lenssen, 2019)). For Amazon Computers we constructed our own split, using 752 nodes for training, 1000 for validation and the remaining 12000 for testing.
Hardware Specification | Yes | Each experiment here took less than two hours in total on a single machine with an NVIDIA GeForce RTX 2060 SUPER GPU and an AMD Ryzen 7 3700X 8-Core Processor.
Software Dependencies | No | The paper mentions software components such as PyTorch Geometric (Fey and Lenssen, 2019) and the Adam optimiser (Kingma and Ba, 2014) but does not provide version numbers for these libraries or frameworks, which are necessary for full reproducibility.
Experiment Setup | Yes | Each GNN used 2 layers with hidden dimension H = 64. We used the Adam optimiser (Kingma and Ba, 2014) with default hyper-parameters, learning rate η = 0.1, and used dropout probability δ = 0.5. For the GraphSAGE neighbour sampling training we used 25 1-hop neighbours and 10 2-hop neighbours. We used early stopping based on the accuracy on the validation set.
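The sketches below illustrate, under stated assumptions, three of the rows above. First, the Research Type row contrasts the paper's approach with a naive application of conformal prediction. The following is a minimal sketch of that naive split-conformal baseline for classification, assuming a simple "1 minus softmax probability of the true class" nonconformity score and a coverage level of α = 0.1; both are illustrative choices, not the paper's exact procedure.

```python
import numpy as np

def conformal_threshold(cal_probs, cal_labels, alpha=0.1):
    """Split conformal calibration: compute the score threshold q_hat
    from held-out calibration softmax outputs (illustrative sketch)."""
    n = len(cal_labels)
    # Nonconformity score: 1 minus the softmax probability of the true class.
    scores = 1.0 - cal_probs[np.arange(n), cal_labels]
    # Finite-sample corrected quantile level.
    level = np.ceil((n + 1) * (1 - alpha)) / n
    return np.quantile(scores, level, method="higher")

def prediction_sets(test_probs, q_hat):
    """Boolean matrix: class j is in node i's prediction set iff its score <= q_hat."""
    return (1.0 - test_probs) <= q_hat

# Hypothetical usage with random softmax outputs over 7 classes.
rng = np.random.default_rng(0)
cal_probs = rng.dirichlet(np.ones(7), size=500)
cal_labels = rng.integers(0, 7, size=500)
q_hat = conformal_threshold(cal_probs, cal_labels, alpha=0.1)
test_probs = rng.dirichlet(np.ones(7), size=100)
sets = prediction_sets(test_probs, q_hat)   # shape (100, 7)
```

Sets built this way achieve roughly 1 − α marginal coverage when calibration and test scores are exchangeable, which is the standard guarantee of split conformal prediction.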
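The Open Datasets and Dataset Splits rows can be realised roughly as follows with PyTorch Geometric's dataset classes (`Flickr`, `Reddit2`, and `Amazon` are all available in `torch_geometric.datasets`). The uniformly random 752 / 1000 / 12000 split for Amazon Computers is an assumption: the report gives only the split sizes, not the seed or selection procedure.

```python
import torch
from torch_geometric.datasets import Reddit2, Flickr, Amazon

# Flickr and Reddit2 ship with the public splits from the original papers,
# exposed as boolean masks on the data object.
flickr = Flickr(root="data/Flickr")[0]        # has train_mask / val_mask / test_mask
reddit2 = Reddit2(root="data/Reddit2")[0]     # has train_mask / val_mask / test_mask

# Amazon Computers has no canonical split, so a custom one is required.
computers = Amazon(root="data/Amazon", name="Computers")[0]
num_nodes = computers.num_nodes
perm = torch.randperm(num_nodes)              # assumption: a uniformly random split

train_idx = perm[:752]
val_idx = perm[752:752 + 1000]
test_idx = perm[752 + 1000:]                  # the remaining ~12000 nodes

def index_to_mask(idx, size):
    mask = torch.zeros(size, dtype=torch.bool)
    mask[idx] = True
    return mask

computers.train_mask = index_to_mask(train_idx, num_nodes)
computers.val_mask = index_to_mask(val_idx, num_nodes)
computers.test_mask = index_to_mask(test_idx, num_nodes)
```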
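Finally, the Experiment Setup row corresponds roughly to the training configuration below. This is a sketch assuming a GraphSAGE model built from `SAGEConv` layers and PyTorch Geometric's `NeighborLoader` for the 25/10 neighbour sampling; the batch size, loss function, and early-stopping patience are not reported, so those choices are placeholders.

```python
import torch
import torch.nn.functional as F
from torch_geometric.datasets import Flickr
from torch_geometric.loader import NeighborLoader
from torch_geometric.nn import SAGEConv

class SAGE(torch.nn.Module):
    """Two-layer GraphSAGE with hidden dimension 64, as described in the setup."""
    def __init__(self, in_dim, num_classes, hidden=64, dropout=0.5):
        super().__init__()
        self.conv1 = SAGEConv(in_dim, hidden)
        self.conv2 = SAGEConv(hidden, num_classes)
        self.dropout = dropout

    def forward(self, x, edge_index):
        x = F.relu(self.conv1(x, edge_index))
        x = F.dropout(x, p=self.dropout, training=self.training)
        return self.conv2(x, edge_index)

data = Flickr(root="data/Flickr")[0]  # any of the three datasets would work here
model = SAGE(in_dim=data.num_features, num_classes=int(data.y.max()) + 1)
optimizer = torch.optim.Adam(model.parameters(), lr=0.1)  # lr 0.1 as reported

# Neighbour sampling: 25 1-hop and 10 2-hop neighbours per seed node.
train_loader = NeighborLoader(
    data,
    num_neighbors=[25, 10],
    input_nodes=data.train_mask,
    batch_size=512,          # assumption: the batch size is not reported
    shuffle=True,
)

def train_epoch():
    model.train()
    for batch in train_loader:
        optimizer.zero_grad()
        out = model(batch.x, batch.edge_index)
        # Only the seed nodes (the first `batch_size` nodes) contribute to the loss.
        loss = F.cross_entropy(out[:batch.batch_size], batch.y[:batch.batch_size])
        loss.backward()
        optimizer.step()

# The reported early stopping on validation accuracy would wrap repeated calls
# to train_epoch(), keeping the checkpoint with the best validation accuracy.
```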