Conformal Prediction Sets for Graph Neural Networks

Authors: Soroush H. Zargarbashi, Simone Antonelli, Aleksandar Bojchevski

ICML 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | 6. Experimental Evaluation: We study (i) the impact of diffusion on efficiency and singleton hit ratio for semi-supervised node classification, (ii) the stability of all methods to random sampling and their sensitivity to hyperparameters, and (iii) the performance when we only have hard predictions. We compare our approach with the two strongest baselines, APS and RAPS, which do not explicitly take the graph structure into account (although implicitly the graph is used to produce the probability vectors π_i). For experimental evaluation, we put our main focus on the transductive case. We also provide some experiments for inductive and simultaneous inductive settings in E.10.
Researcher Affiliation | Academia | ¹CISPA Helmholtz Center for Information Security, ²University of Cologne. Correspondence to: Soroush H. Zargarbashi <sayed.haj-zargarbashi@cispa.de>, Simone Antonelli <simone.antonelli@cispa.de>, Aleksandar Bojchevski <a.bojchevski@uni-koeln.de>.
Pseudocode | Yes | Algorithm 1: Conformal prediction pseudo-code
Open Source Code | Yes | In addition to the mentioned algorithm, the Python implementation, including the code to reproduce reported results, is accessible at https://github.com/soroushzargar/DAPS.
Open Datasets | Yes | We evaluate our approach on 10 datasets: the common citation graphs Cora-ML (McCallum et al., 2004), CiteSeer (Sen et al., 2008), PubMed (Namata et al., 2012), Cora-Full (Bojchevski & Günnemann, 2018), Coauthor Physics and Coauthor CS (Shchur et al., 2018); the co-purchase graphs Amazon Photos and Amazon Computers (McAuley et al., 2015; Shchur et al., 2018); and two large graphs, OGBN-Arxiv (Wang et al., 2020) and OGBN-Products (Bhatia et al., 2016).
Dataset Splits | Yes | We randomly split the nodes into train/validation/calibration/test sets. ... We randomly select 20 nodes per class for training/validation.
Hardware Specification | Yes | We run all our experiments both on CPU (Intel(R) Xeon(R) Platinum 8368 CPU @ 2.40GHz) and, even if not necessary, on GPU (NVIDIA A100-SXM4-40GB).
Software Dependencies | No | We based our implementation on PyTorch Geometric (Fey & Lenssen, 2019). The paper mentions a specific library but does not provide version numbers for it or other software dependencies.
Experiment Setup | Yes | We randomly select 20 nodes per class for training/validation. As described in B, we split the calibration set into two sets, one for tuning parameters like λ, and one for actual calibration.
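
The Dataset Splits and Experiment Setup rows describe per-class sampling of training/validation nodes and a two-way split of the calibration pool. Below is a minimal sketch of such a split, assuming labels in a NumPy array; the function name `split_nodes`, the seed, and the calibration-pool size `n_cal` (and its 50/50 division into tuning and calibration parts) are illustrative assumptions, not values quoted by the paper.

```python
import numpy as np

def split_nodes(labels, n_per_class=20, n_cal=1000, seed=0):
    """Randomly split node indices into train/val/cal-tune/cal/test.

    labels: (N,) integer class labels for all nodes.
    20 nodes per class go to training and another 20 to validation;
    the remaining calibration pool is split into one part for tuning
    hyperparameters (e.g. lambda) and one part for actual calibration.
    (Sizes here are assumptions for illustration.)
    """
    rng = np.random.default_rng(seed)
    train, val = [], []
    for c in np.unique(labels):
        idx = rng.permutation(np.where(labels == c)[0])
        train.extend(idx[:n_per_class])
        val.extend(idx[n_per_class:2 * n_per_class])
    rest = np.setdiff1d(np.arange(len(labels)), np.concatenate([train, val]))
    rest = rng.permutation(rest)
    cal_tune, cal = rest[:n_cal // 2], rest[n_cal // 2:n_cal]
    test = rest[n_cal:]
    return np.array(train), np.array(val), cal_tune, cal, test
```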
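The Pseudocode row points to Algorithm 1, i.e. the standard split conformal prediction routine. As a rough illustration of that routine, here is a minimal sketch using a simple 1 − softmax (threshold) score; the function names and the choice of score are ours, and the paper's DAPS approach additionally diffuses node-level scores over the graph before the same calibration step.

```python
import numpy as np

def conformal_threshold(cal_probs, cal_labels, alpha=0.1):
    """Split conformal calibration.

    cal_probs:  (n, K) softmax outputs for calibration nodes
    cal_labels: (n,)   true labels of calibration nodes
    Returns a score threshold q_hat so that prediction sets built
    with it cover the true label with probability >= 1 - alpha.
    """
    n = len(cal_labels)
    # Nonconformity score: 1 - probability of the true class.
    scores = 1.0 - cal_probs[np.arange(n), cal_labels]
    # Finite-sample corrected quantile level.
    level = np.ceil((n + 1) * (1 - alpha)) / n
    return np.quantile(scores, min(level, 1.0), method="higher")

def prediction_sets(test_probs, q_hat):
    """Include every class whose score falls below the calibrated threshold."""
    return [np.where(1.0 - p <= q_hat)[0] for p in test_probs]
```

With exchangeability between calibration and test nodes (which the paper argues holds in the transductive setting), sets produced this way cover the true label with probability at least 1 − α.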
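The Research Type row quotes the evaluation criteria, efficiency and singleton hit ratio. The sketch below shows one common way to compute these set-level metrics, assuming efficiency is the average prediction-set size and the singleton hit ratio is the fraction of test nodes that receive a size-one set containing the true label; the paper's exact definitions may differ in normalization.

```python
import numpy as np

def efficiency(pred_sets):
    """Average prediction-set size (smaller is better at fixed coverage)."""
    return float(np.mean([len(s) for s in pred_sets]))

def singleton_hit_ratio(pred_sets, labels):
    """Fraction of test nodes with a singleton set that contains the true label
    (one common definition; an assumption here, not quoted from the paper)."""
    hits = [len(s) == 1 and labels[i] in s for i, s in enumerate(pred_sets)]
    return float(np.mean(hits))
```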