Graph Neural Prompting with Large Language Models

Authors: Yijun Tian, Huan Song, Zichen Wang, Haozhu Wang, Ziqing Hu, Fang Wang, Nitesh V. Chawla, Panpan Xu

AAAI 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments on multiple datasets demonstrate the superiority of GNP on both commonsense and biomedical reasoning tasks across different LLM sizes and settings.
Researcher Affiliation | Collaboration | Yijun Tian (1), Huan Song (2), Zichen Wang (2), Haozhu Wang (2), Ziqing Hu (2), Fang Wang (2), Nitesh V. Chawla (1), Panpan Xu (2); (1) University of Notre Dame, (2) Amazon
Pseudocode | No | The paper describes its method in text and mathematical equations but does not include any pseudocode or clearly labeled algorithm blocks.
Open Source Code | Yes | Code is available at https://github.com/meettyj/GNP.
Open Datasets | Yes | For the used knowledge graphs, we consider ConceptNet (Speer, Chin, and Havasi 2017) that contains rich commonsense knowledge regarding the daily concepts, and Unified Medical Language System (UMLS) (Bodenreider 2004) that involves well-structured health and biomedical information. For datasets, we use four commonsense reasoning datasets, including OpenBookQA (OBQA) (Mihaylov et al. 2018), AI2 Reasoning Challenge (ARC) (Clark et al. 2018), Physical Interaction Question Answering (PIQA) (Bisk et al. 2020), and RiddleSense (Riddle) (Lin et al. 2021). In addition, we consider PubMedQA (PQA) (Jin et al. 2019) and BioASQ (Tsatsaronis et al. 2015) for biomedical reasoning.
Dataset Splits | Yes | Implementation Details. For the proposed model, we set the learning rate to 1e-4, batch size to 8, hidden dimension of GNN to 1024, and training epochs to 50. In order to adapt the model effectively to each dataset, we search the GNN layers from 2 to 5, cross-modality pooling layers from 1 to 3, trade-off weight λ from {0.1, 0.5}, and link drop rate from {0.1, 0.3, 0.7}.
Hardware Specification | Yes | We run all experiments on four NVIDIA Tesla V100 GPUs with 24GB RAM.
Software Dependencies | No | The paper mentions using FLAN-T5 LLMs and specifies hyper-parameters, but does not provide specific version numbers for software dependencies like Python, PyTorch, or CUDA.
Experiment Setup | Yes | For the proposed model, we set the learning rate to 1e-4, batch size to 8, hidden dimension of GNN to 1024, and training epochs to 50. In order to adapt the model effectively to each dataset, we search the GNN layers from 2 to 5, cross-modality pooling layers from 1 to 3, trade-off weight λ from {0.1, 0.5}, and link drop rate from {0.1, 0.3, 0.7}.
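To gauge the tuning budget implied by the Experiment Setup row, the sketch below expands the reported grid (GNN layers 2 to 5, cross-modality pooling layers 1 to 3, λ in {0.1, 0.5}, link drop rate in {0.1, 0.3, 0.7}) around the fixed settings. It is a minimal illustration under stated assumptions: the dictionary keys, the grid() helper, and the enumeration via itertools.product are illustrative choices, not code from the GNP repository.

    from itertools import product

    # Fixed hyper-parameters reported in the paper (see Experiment Setup above).
    FIXED = {
        "learning_rate": 1e-4,
        "batch_size": 8,
        "gnn_hidden_dim": 1024,
        "epochs": 50,
    }

    # Per-dataset search space reported in the paper.
    SEARCH_SPACE = {
        "gnn_layers": [2, 3, 4, 5],
        "cross_modality_pooling_layers": [1, 2, 3],
        "trade_off_lambda": [0.1, 0.5],
        "link_drop_rate": [0.1, 0.3, 0.7],
    }

    def grid(space):
        """Yield one configuration dict per combination in the search space."""
        keys = list(space)
        for values in product(*(space[key] for key in keys)):
            yield dict(zip(keys, values))

    if __name__ == "__main__":
        configs = [dict(FIXED, **candidate) for candidate in grid(SEARCH_SPACE)]
        print(f"{len(configs)} candidate configurations per dataset")
        print(configs[0])

Expanding the grid this way gives 4 × 3 × 2 × 3 = 72 candidate configurations per dataset, each trained with the fixed learning rate, batch size, hidden dimension, and epoch count quoted above.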