Talk like a Graph: Encoding Graphs for Large Language Models

Authors: Bahare Fatemi, Jonathan Halcrow, Bryan Perozzi

ICLR 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | In this work, we perform the first comprehensive study of encoding graph-structured data as text for consumption by LLMs. We show that LLM performance on graph reasoning tasks varies on three fundamental levels: (1) the graph encoding method, (2) the nature of the graph task itself, and (3) interestingly, the very structure of the graph considered. These novel results provide valuable insight on strategies for encoding graphs as text. Using these insights we illustrate how the correct choice of encoders can boost performance on graph reasoning tasks inside LLMs by 4.8% to 61.8%, depending on the task.
Researcher Affiliation | Industry | Bahare Fatemi, Jonathan Halcrow, Bryan Perozzi; Google Research; {baharef,halcrow,bperozzi}@google.com
Pseudocode | No | The paper describes methods and processes in text and with diagrams, but does not include any clearly labeled pseudocode or algorithm blocks.
Open Source Code | No | The code to generate the data is available at https://github.com/google-research/google-research/tree/master/graphqa. We are committed to open-sourcing both our code and data upon the acceptance of our paper.
Open Datasets | Yes | GraphQA is distinguished by using graphs with much more varied and realistic graph structure than has previously been studied with LLMs [1]. [1] The code to generate the data is available at https://github.com/google-research/google-research/tree/master/graphqa.
Dataset Splits | No | The paper mentions generating graphs and using 'few-shot examples' for prompting, but does not provide specific details on how the generated GraphQA data is split into training, validation, and test sets, or specify cross-validation methods.
Hardware Specification | Yes | For our experiments, we used PaLM 62B and PaLM 2 (various sizes) served on a 4×4 TPU v4 architecture.
Software Dependencies | Yes | We used the NetworkX library (Hagberg et al., 2008) to generate the random graphs and to find the answers to the graph tasks.
Experiment Setup | Yes | The decoding temperature was set to zero.
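
The Software Dependencies entry says NetworkX was used both to generate the random graphs and to compute the ground-truth answers. The following is a minimal sketch of that workflow, not the authors' implementation: the Erdős–Rényi generator, the sentence template in `encode_graph_as_text`, and the edge-existence question wording are all assumptions chosen for illustration, and the function names are hypothetical.

```python
import networkx as nx


def encode_graph_as_text(graph: nx.Graph) -> str:
    """Describe an undirected graph in plain English.

    The wording is a hypothetical encoding, not one taken from the paper.
    """
    nodes = ", ".join(str(n) for n in sorted(graph.nodes()))
    lines = [f"G describes a graph among nodes {nodes}."]
    # Normalize each edge to (smaller, larger) so the output order is stable.
    for u, v in sorted(tuple(sorted(edge)) for edge in graph.edges()):
        lines.append(f"Node {u} is connected to node {v}.")
    return "\n".join(lines)


def edge_existence_example(graph: nx.Graph, u: int, v: int) -> tuple[str, str]:
    """Build a prompt for an edge-existence question and its ground-truth answer."""
    question = f"Q: Is node {u} connected to node {v}?\nA:"
    prompt = encode_graph_as_text(graph) + "\n" + question
    # NetworkX supplies the reference answer the model output is scored against.
    answer = "Yes" if graph.has_edge(u, v) else "No"
    return prompt, answer


if __name__ == "__main__":
    # Assumed generator and parameters; the seed makes the graph reproducible.
    random_graph = nx.erdos_renyi_graph(n=6, p=0.4, seed=0)
    prompt, answer = edge_existence_example(random_graph, 0, 1)
    print(prompt)
    print("Expected answer:", answer)
```

Because the table reports a decoding temperature of zero, a prompt built this way would be expected to yield the same model output on each run, so a single comparison against the NetworkX-derived answer per example is a reasonable way to score it.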