What Makes Graph Neural Networks Miscalibrated?
Authors: Hans Hao-Hsun Hsu, Yuesong Shen, Christian Tomani, Daniel Cremers
NeurIPS 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this work, we conduct a systematic study on the calibration qualities of GNN node predictions. Our experiments empirically verify the effectiveness of GATS, demonstrating that it can consistently achieve state-of-the-art calibration results on various graph datasets for different GNN backbones. |
| Researcher Affiliation | Academia | 1 Technical University of Munich, Germany 2 Munich Center for Machine Learning, Germany {hans.hsu, yuesong.shen, christian.tomani, cremers}@tum.de |
| Pseudocode | No | The paper does not include a clearly labeled pseudocode or algorithm block. |
| Open Source Code | Yes | Source code available at https://github.com/hans66hsu/GATS |
| Open Datasets | Yes | We train a series of GCN [11] and GAT [32] models on seven graph datasets: Cora [24], Citeseer [24], Pubmed [18], Amazon Computers [26], Amazon Photo [26], Coauthor CS [26], and Coauthor Physics [26]. ... for post-hoc calibration. |
| Dataset Splits | Yes | For all the experiments, we randomly split the labeled/unlabeled (15%/85%) data five times, and use three-fold internal cross-validation of the labeled data to train the GNNs and the calibrators. (A sketch of this split protocol follows the table.) |
| Hardware Specification | No | The paper does not explicitly state the hardware used for the experiments (e.g., GPU models, CPU types, memory) in the main text. The ethical checklist claims that compute resources are reported, but no specifics appear in the visible content. |
| Software Dependencies | No | The paper does not explicitly list specific software dependencies with version numbers (e.g., Python version, library versions like PyTorch or TensorFlow) in the main text. |
| Experiment Setup | No | The paper mentions general aspects of the experimental setup, such as using negative log-likelihood as the objective for calibration (a hedged calibration sketch follows the table). The rest is deferred: "Details about model training are provided in Appendix A.2 for reproducibility" and "We provide detailed experimental settings in Appendix A." Specific hyperparameters (e.g., learning rates, batch sizes, epochs) are not given in the main body. |
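
The split protocol quoted in the Dataset Splits row is easy to misread, so here is a minimal sketch of one plausible reading: five random 15%/85% labeled/unlabeled splits, each with three-fold internal cross-validation over the labeled nodes. This is not the authors' code; `num_nodes`, the seeds, and the way folds are used are illustrative assumptions.

```python
# Sketch of the split protocol described in the paper (not the authors' code).
import numpy as np
from sklearn.model_selection import KFold

num_nodes = 2708  # hypothetical placeholder (e.g., Cora)

for seed in range(5):  # five random labeled/unlabeled splits
    rng = np.random.default_rng(seed)
    perm = rng.permutation(num_nodes)
    n_labeled = int(0.15 * num_nodes)
    labeled, unlabeled = perm[:n_labeled], perm[n_labeled:]

    # 3-fold internal CV on the labeled nodes: in each fold, two thirds
    # are assumed to train the GNN and one third to fit the calibrator.
    kf = KFold(n_splits=3, shuffle=True, random_state=seed)
    for train_idx, calib_idx in kf.split(labeled):
        train_nodes = labeled[train_idx]
        calib_nodes = labeled[calib_idx]
        # ... train the GNN on train_nodes, fit the calibrator on
        # calib_nodes, and evaluate on the unlabeled nodes ...
```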
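
The Experiment Setup row notes that calibration is fit with a negative log-likelihood objective. For reference, below is a hedged sketch of classic temperature scaling fit by NLL on held-out labeled nodes. This is the standard post-hoc baseline, not the paper's GATS calibrator, and `val_logits`/`val_labels`/`test_logits` are hypothetical tensors.

```python
# Plain temperature scaling fit by NLL: a minimal baseline sketch,
# not an implementation of GATS.
import torch
import torch.nn.functional as F

def fit_temperature(logits: torch.Tensor, labels: torch.Tensor) -> float:
    """Minimize the NLL of logits / T on a held-out labeled set."""
    log_t = torch.zeros(1, requires_grad=True)  # optimize log T so T > 0
    opt = torch.optim.LBFGS([log_t], lr=0.1, max_iter=50)

    def closure():
        opt.zero_grad()
        loss = F.cross_entropy(logits / log_t.exp(), labels)
        loss.backward()
        return loss

    opt.step(closure)
    return log_t.exp().item()

# Usage: divide test logits by the fitted temperature before softmax.
# T = fit_temperature(val_logits, val_labels)
# probs = torch.softmax(test_logits / T, dim=-1)
```

Optimizing `log T` rather than `T` directly is a common trick to keep the temperature positive without constrained optimization.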