What Makes Graph Neural Networks Miscalibrated?

Authors: Hans Hao-Hsun Hsu, Yuesong Shen, Christian Tomani, Daniel Cremers

NeurIPS 2022

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | "In this work, we conduct a systematic study on the calibration qualities of GNN node predictions. Our experiments empirically verify the effectiveness of GATS, demonstrating that it can consistently achieve state-of-the-art calibration results on various graph datasets for different GNN backbones." |
| Researcher Affiliation | Academia | 1 Technical University of Munich, Germany; 2 Munich Center for Machine Learning, Germany. {hans.hsu, yuesong.shen, christian.tomani, cremers}@tum.de |
| Pseudocode | No | The paper does not include a clearly labeled pseudocode or algorithm block. |
| Open Source Code | Yes | Source code is available at https://github.com/hans66hsu/GATS |
| Open Datasets | Yes | "We train a series of GCN [11] and GAT [32] models on seven graph datasets: Cora [24], Citeseer [24], Pubmed [18], Amazon Computers [26], Amazon Photo [26], Coauthor CS [26], and Coauthor Physics [26]. ... for post-hoc calibration." (See the loading sketch after the table.) |
| Dataset Splits | Yes | "For all the experiments, we randomly split the labeled/unlabeled (15%/85%) data five times, and use three-fold internal cross-validation of the labeled data to train the GNNs and the calibrators." (See the splitting sketch after the table.) |
| Hardware Specification | No | The paper does not state specific hardware (e.g., GPU model, CPU type, memory) in the main text. The ethics checklist asserts that the compute resources used are reported, but no specifics appear in the visible content. |
| Software Dependencies | No | The paper does not list software dependencies with version numbers (e.g., Python, PyTorch, or TensorFlow versions) in the main text. |
| Experiment Setup | No | The paper describes its setup only in general terms, such as using negative log-likelihood as the calibration objective (see the sketch after the table), and defers the rest: "Details about model training are provided in Appendix A.2 for reproducibility" and "We provide detailed experimental settings in Appendix A." Specific hyperparameters (learning rates, batch sizes, epochs) are not given in the main body. |
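
All seven benchmarks are standard node-classification graphs available through common loaders. A minimal sketch using PyTorch Geometric is shown below; this is an assumption for illustration, and the authors' repository may load or preprocess the data differently.

```python
# Minimal sketch: loading the seven benchmarks via PyTorch Geometric.
# Assumes torch_geometric is installed; the GATS repo may differ.
from torch_geometric.datasets import Planetoid, Amazon, Coauthor

root = "data"
datasets = {
    "Cora": Planetoid(root, name="Cora"),
    "Citeseer": Planetoid(root, name="Citeseer"),
    "Pubmed": Planetoid(root, name="Pubmed"),
    "Computers": Amazon(root, name="Computers"),
    "Photo": Amazon(root, name="Photo"),
    "CS": Coauthor(root, name="CS"),
    "Physics": Coauthor(root, name="Physics"),
}
for name, ds in datasets.items():
    data = ds[0]  # each benchmark is a single graph
    print(name, data.num_nodes, data.num_edges, ds.num_classes)
```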
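The split protocol (five random 15%/85% labeled/unlabeled splits, then three-fold cross-validation inside the labeled portion) can be approximated as follows. This is a hypothetical sketch using scikit-learn utilities, not the authors' code; how each inner fold is used to train versus validate the calibrator is our reading of the protocol.

```python
# Hedged sketch of the described split protocol: five random 15%/85%
# labeled/unlabeled splits, with 3-fold CV inside the labeled part.
import numpy as np
from sklearn.model_selection import ShuffleSplit, KFold

num_nodes = 2708  # e.g., Cora

outer = ShuffleSplit(n_splits=5, train_size=0.15, random_state=0)
for labeled_idx, _unlabeled_idx in outer.split(np.arange(num_nodes)):
    inner = KFold(n_splits=3, shuffle=True, random_state=0)
    for train_pos, val_pos in inner.split(labeled_idx):
        train_idx = labeled_idx[train_pos]  # train the GNN / calibrator here
        val_idx = labeled_idx[val_pos]      # validate / select the calibrator here
```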
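On the calibration objective: the paper trains calibrators with negative log-likelihood. As a reference point, classic temperature scaling (Guo et al., 2017), which GATS generalizes with per-node, attention-based temperatures, fits a single scalar T by minimizing NLL on held-out logits. Below is a minimal PyTorch sketch of that baseline, not the authors' implementation.

```python
# Minimal sketch of NLL-based temperature scaling (the baseline GATS builds on).
# GATS itself learns node-wise temperatures; here we fit one scalar temperature.
import torch
import torch.nn.functional as F

def fit_temperature(logits, labels, steps=200, lr=0.01):
    """Fit a scalar T > 0 minimizing the NLL of softmax(logits / T)."""
    log_t = torch.zeros(1, requires_grad=True)  # optimize log T so T stays positive
    opt = torch.optim.Adam([log_t], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        loss = F.cross_entropy(logits / log_t.exp(), labels)  # NLL objective
        loss.backward()
        opt.step()
    return log_t.exp().item()

# Usage on placeholder validation logits/labels; calibrated probabilities
# are then softmax(test_logits / T).
val_logits = torch.randn(100, 7)
val_labels = torch.randint(0, 7, (100,))
T = fit_temperature(val_logits, val_labels)
print(f"fitted temperature: {T:.3f}")
```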