What Makes Graph Neural Networks Miscalibrated?

Authors: Hans Hao-Hsun Hsu, Yuesong Shen, Christian Tomani, Daniel Cremers

NeurIPS 2022

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | "In this work, we conduct a systematic study on the calibration qualities of GNN node predictions. Our experiments empirically verify the effectiveness of GATS, demonstrating that it can consistently achieve state-of-the-art calibration results on various graph datasets for different GNN backbones." |
| Researcher Affiliation | Academia | 1 Technical University of Munich, Germany; 2 Munich Center for Machine Learning, Germany. {hans.hsu, yuesong.shen, christian.tomani, cremers}@tum.de |
| Pseudocode | No | The paper does not include a clearly labeled pseudocode or algorithm block. |
| Open Source Code | Yes | Source code is available at https://github.com/hans66hsu/GATS |
| Open Datasets | Yes | "We train a series of GCN [11] and GAT [32] models on seven graph datasets: Cora [24], Citeseer [24], Pubmed [18], Amazon Computers [26], Amazon Photo [26], Coauthor CS [26], and Coauthor Physics [26]. ... for post-hoc calibration." (See the loading sketch after the table.) |
| Dataset Splits | Yes | "For all the experiments, we randomly split the labeled/unlabeled (15%/85%) data five times, and use three-fold internal cross-validation of the labeled data to train the GNNs and the calibrators." (See the splitting sketch after the table.) |
| Hardware Specification | No | The paper does not state specific hardware (e.g., GPU model, CPU type, memory) in the main text. The ethics checklist asserts that the compute resources used are reported, but no specifics appear in the visible content. |
| Software Dependencies | No | The paper does not list software dependencies with version numbers (e.g., Python, PyTorch, or TensorFlow versions) in the main text. |
| Experiment Setup | No | The paper describes its setup only in general terms, such as using negative log-likelihood as the calibration objective (see the sketch after the table), and defers the rest: "Details about model training are provided in Appendix A.2 for reproducibility" and "We provide detailed experimental settings in Appendix A." Specific hyperparameters (learning rates, batch sizes, epochs) are not given in the main body. |
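
All seven benchmarks are standard node-classification graphs available through common loaders. A minimal sketch using PyTorch Geometric is shown below; this is an assumption for illustration, and the authors' repository may load or preprocess the data differently.

```python
# Minimal sketch: loading the seven benchmarks via PyTorch Geometric.
# Assumes torch_geometric is installed; the GATS repo may differ.
from torch_geometric.datasets import Planetoid, Amazon, Coauthor

root = "data"
datasets = {
    "Cora": Planetoid(root, name="Cora"),
    "Citeseer": Planetoid(root, name="Citeseer"),
    "Pubmed": Planetoid(root, name="Pubmed"),
    "Computers": Amazon(root, name="Computers"),
    "Photo": Amazon(root, name="Photo"),
    "CS": Coauthor(root, name="CS"),
    "Physics": Coauthor(root, name="Physics"),
}
for name, ds in datasets.items():
    data = ds[0]  # each benchmark is a single graph
    print(name, data.num_nodes, data.num_edges, ds.num_classes)
```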
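The split protocol (five random 15%/85% labeled/unlabeled splits, then three-fold cross-validation inside the labeled portion) can be approximated as follows. This is a hypothetical sketch using scikit-learn utilities, not the authors' code; how each inner fold is used to train versus validate the calibrator is our reading of the protocol.

```python
# Hedged sketch of the described split protocol: five random 15%/85%
# labeled/unlabeled splits, with 3-fold CV inside the labeled part.
import numpy as np
from sklearn.model_selection import ShuffleSplit, KFold

num_nodes = 2708  # e.g., Cora

outer = ShuffleSplit(n_splits=5, train_size=0.15, random_state=0)
for labeled_idx, _unlabeled_idx in outer.split(np.arange(num_nodes)):
    inner = KFold(n_splits=3, shuffle=True, random_state=0)
    for train_pos, val_pos in inner.split(labeled_idx):
        train_idx = labeled_idx[train_pos]  # train the GNN / calibrator here
        val_idx = labeled_idx[val_pos]      # validate / select the calibrator here
```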
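On the calibration objective: the paper trains calibrators with negative log-likelihood. As a reference point, classic temperature scaling (Guo et al., 2017), which GATS generalizes with per-node, attention-based temperatures, fits a single scalar T by minimizing NLL on held-out logits. Below is a minimal PyTorch sketch of that baseline, not the authors' implementation.

```python
# Minimal sketch of NLL-based temperature scaling (the baseline GATS builds on).
# GATS itself learns node-wise temperatures; here we fit one scalar temperature.
import torch
import torch.nn.functional as F

def fit_temperature(logits, labels, steps=200, lr=0.01):
    """Fit a scalar T > 0 minimizing the NLL of softmax(logits / T)."""
    log_t = torch.zeros(1, requires_grad=True)  # optimize log T so T stays positive
    opt = torch.optim.Adam([log_t], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        loss = F.cross_entropy(logits / log_t.exp(), labels)  # NLL objective
        loss.backward()
        opt.step()
    return log_t.exp().item()

# Usage on placeholder validation logits/labels; calibrated probabilities
# are then softmax(test_logits / T).
val_logits = torch.randn(100, 7)
val_labels = torch.randint(0, 7, (100,))
T = fit_temperature(val_logits, val_labels)
print(f"fitted temperature: {T:.3f}")
```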