reproducibilityindex.ai

Microstructures and Accuracy of Graph Recall by Large Language Models

Authors: Yanbang Wang, Hejie Cui, Jon Kleinberg

NeurIPS 2024

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	In this work, we perform the first systematical study of graph recall by LLMs, investigating the accuracy and biased microstructures (local subgraph patterns) in their recall.
Researcher Affiliation	Academia	Yanbang Wang (Cornell University, ywangdr@cs.cornell.edu); Hejie Cui (Stanford University, hejie.cui@stanford.edu); Jon Kleinberg (Cornell University, kleinberg@cs.cornell.edu)
Pseudocode	No	The paper describes experimental protocols in narrative form and figures, but does not include structured pseudocode or algorithm blocks.
Open Source Code	Yes	Our code and data can be downloaded at: https://github.com/Abel0828/llm-graph-recall.
Open Datasets	Yes	We create five graph datasets from the following application domains. (1) Co-authorship: DBLP (1995-2005); (2) Social network: Facebook [27]; (3) Geological network: CA road; (4) Protein interactions: Reactome [16]; (5) Erdős–Rényi graph: as in [18].
Dataset Splits	No	The paper mentions train/test splits for evaluation (e.g., 20% of edges removed for link prediction), but does not explicitly specify validation-set splits, percentages, or methodology.
Hardware Specification	Yes	For Llama Family models, we use the open-sourced models meta-llama/Llama-2-7b-hf and meta-llama/Llama-2-13b-hf on Hugging Face, tuned on two Quadro RTX 8000 GPUs with 48 GB of RAM.
Software Dependencies	No	The paper lists the LLM models and APIs used (e.g., GPT-3.5, GPT-4, Gemini-Pro, Llama 2), but does not provide specific version numbers for ancillary software dependencies such as programming languages, libraries, or frameworks.
Experiment Setup	Yes	We use zero-shot prompting with moderate formatting instructions for answers.
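
The Dataset Splits row mentions removing 20% of edges for link prediction, and the Open Datasets row lists an Erdős–Rényi graph. As a rough illustration only, the sketch below samples a small Erdős–Rényi graph and holds out 20% of its edges. The function names, graph size, edge probability, and seed are hypothetical choices for this example and are not taken from the paper or its repository; the authors' actual procedure may differ.

```python
import random
from itertools import combinations


def erdos_renyi_edges(n, p, rng):
    """Sample an undirected G(n, p) graph as a list of edges (u, v) with u < v."""
    return [(u, v) for u, v in combinations(range(n), 2) if rng.random() < p]


def hold_out_edges(edges, fraction, rng):
    """Randomly split edges into (kept, held_out), holding out `fraction` of them."""
    shuffled = list(edges)
    rng.shuffle(shuffled)
    n_removed = round(len(shuffled) * fraction)
    return shuffled[n_removed:], shuffled[:n_removed]


# Assumed illustrative parameters: 20 nodes, edge probability 0.3, fixed seed.
rng = random.Random(0)
edges = erdos_renyi_edges(n=20, p=0.3, rng=rng)
kept, held_out = hold_out_edges(edges, fraction=0.2, rng=rng)
print(f"{len(edges)} edges total, {len(kept)} kept, {len(held_out)} held out")
```

The held-out edges would serve as positive test examples for link prediction, with the kept edges forming the graph shown to the model.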