Microstructures and Accuracy of Graph Recall by Large Language Models

Authors: Yanbang Wang, Hejie Cui, Jon Kleinberg

NeurIPS 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | In this work, we perform the first systematic study of graph recall by LLMs, investigating the accuracy and biased microstructures (local subgraph patterns) in their recall.
Researcher Affiliation | Academia | Yanbang Wang (Cornell University, ywangdr@cs.cornell.edu); Hejie Cui (Stanford University, hejie.cui@stanford.edu); Jon Kleinberg (Cornell University, kleinberg@cs.cornell.edu)
Pseudocode | No | The paper describes experimental protocols in narrative form and figures, but does not include structured pseudocode or algorithm blocks.
Open Source Code | Yes | Our code and data can be downloaded at: https://github.com/Abel0828/llm-graph-recall.
Open Datasets | Yes | We create five graph datasets from the following application domains: (1) co-authorship: DBLP (1995-2005); (2) social network: Facebook [27]; (3) road network: CA road; (4) protein interactions: Reactome [16]; (5) Erdős-Rényi graph: as in [18].
Dataset Splits | No | The paper describes train/test evaluation splits (e.g., removing 20% of edges for link prediction), but does not explicitly state validation splits, split percentages, or the splitting methodology.
Hardware Specification | Yes | For Llama-family models, we use the open-sourced models meta-llama/Llama-2-7b-hf and meta-llama/Llama-2-13b-hf on Hugging Face, tuned on two Quadro RTX 8000 GPUs with 48 GB of RAM.
Software Dependencies | No | The paper lists the LLM models and APIs used (e.g., GPT-3.5, GPT-4, Gemini-Pro, Llama 2), but does not provide specific version numbers for ancillary software dependencies such as programming languages, libraries, or frameworks.
Experiment Setup | Yes | We use zero-shot prompting with moderate formatting instructions for answers.
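The experimental setup quoted above (zero-shot prompting over graphs, including Erdős-Rényi samples) can be sketched as follows. This is an illustrative sketch only, not the authors' released code: the function names, prompt wording, and parameter choices are assumptions made for the example.

```python
# Illustrative sketch of a graph-recall evaluation: sample an
# Erdős-Rényi graph, serialize its edge list into a zero-shot
# recall prompt, and score an answer against the ground truth.
import itertools
import random

def erdos_renyi_edges(n, p, seed=0):
    """Sample G(n, p): each of the n*(n-1)/2 node pairs is kept with probability p."""
    rng = random.Random(seed)
    return [pair for pair in itertools.combinations(range(n), 2)
            if rng.random() < p]

def make_recall_prompt(edges):
    """Plain-text graph description followed by a recall question."""
    listing = ", ".join(f"({u}, {v})" for u, v in edges)
    return (f"A network has the following edges: {listing}. "
            "List all edges of this network.")

def recall_accuracy(true_edges, answered_edges):
    """Fraction of ground-truth (undirected) edges the answer recovers."""
    truth = {frozenset(e) for e in true_edges}
    answer = {frozenset(e) for e in answered_edges}
    return len(truth & answer) / len(truth)

edges = erdos_renyi_edges(n=10, p=0.3, seed=42)
prompt = make_recall_prompt(edges)

# A perfect answer recovers every edge; reversed endpoint order still counts.
assert recall_accuracy(edges, [(v, u) for u, v in edges]) == 1.0
```

In practice the prompt would be sent to an LLM (GPT-3.5/4, Gemini-Pro, or Llama 2, per the table above) and the parsed reply scored with `recall_accuracy`; edges are treated as unordered pairs so that direction-flipped answers are not penalized.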