G-Retriever: Retrieval-Augmented Generation for Textual Graph Understanding and Question Answering
Authors: Xiaoxin He, Yijun Tian, Yifei Sun, Nitesh Chawla, Thomas Laurent, Yann LeCun, Xavier Bresson, Bryan Hooi
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Empirical evaluations show that our method outperforms baselines on textual graph tasks from multiple domains, scales well with larger graph sizes, and mitigates hallucination. Our codes and datasets are available at: https://github.com/XiaoxinHe/G-Retriever. |
| Researcher Affiliation | Collaboration | 1 National University of Singapore, 2 University of Notre Dame, 3 Loyola Marymount University, 4 New York University, 5 Meta AI |
| Pseudocode | No | The paper describes the steps of the G-Retriever method but does not include a dedicated pseudocode or algorithm block. |
| Open Source Code | Yes | Our codes and datasets are available at: https://github.com/XiaoxinHe/G-Retriever. |
| Open Datasets | Yes | Our codes and datasets are available at: https://github.com/XiaoxinHe/G-Retriever. and Our Graph QA benchmark integrates three existing datasets: ExplaGraphs, SceneGraphs, and WebQSP. |
| Dataset Splits | Yes | We further divided these into training, val, and test subsets, using a 6:2:2 ratio. and We randomly sampled 100k samples from the original dataset and divided them into training, validation, and test subsets, following a 6:2:2 ratio. A minimal split sketch follows the table. |
| Hardware Specification | Yes | Experiments are conducted using 2 NVIDIA A100-80G GPUs. and Utilizing two A100 GPUs, each with 80GB of memory, we conducted tests on Llama2-7b and WebQSP datasets. |
| Software Dependencies | Yes | In the indexing step, we use SentenceBert [34] as the LM to encode all node and edge attributes. In the generation step, we use the open-source Llama2-7b [42] as the LLM and Graph Transformer [37] as the graph encoder. An indexing sketch follows the table. |
| Experiment Setup | Yes | In fine-tuning the LLM with LoRA [10], the lora_r parameter (dimension for the LoRA update matrices) is set to 8, and lora_alpha (scaling factor) is set to 16. The dropout rate is set to 0.05. In prompt tuning, the LLM is configured with 10 virtual tokens. The max text length is 512, and the max number of new tokens, i.e., the maximum number of tokens to generate, is 32. and We set the initial learning rate at 1e-5, with a weight decay of 0.05. The learning rate decays with a half-cycle cosine decay after the warm-up period. The batch size is 4, and the number of epochs is 10. To prevent overfitting and ensure training efficiency, an early stopping mechanism is implemented with a patience setting of 2 epochs. A configuration sketch follows the table. |
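
The Dataset Splits row quotes a 6:2:2 train/validation/test split, with 100k samples drawn for one dataset. A minimal sketch of such a split; the helper name and seed are assumptions of this illustration, not taken from the paper:

```python
import random

def split_indices(n_samples, seed=42):
    # Hypothetical helper: shuffle sample indices and cut them 6:2:2 into
    # train/validation/test, matching the ratio quoted above.
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    n_train, n_val = int(0.6 * n_samples), int(0.2 * n_samples)
    return idx[:n_train], idx[n_train:n_train + n_val], idx[n_train + n_val:]

train_idx, val_idx, test_idx = split_indices(100_000)  # e.g. the 100k sampled instances
print(len(train_idx), len(val_idx), len(test_idx))     # 60000 20000 20000
```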
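
The Software Dependencies row describes an indexing step in which SentenceBert encodes all node and edge attributes. A hedged sketch of that step using the sentence-transformers package; the checkpoint name and the toy attribute strings are assumptions, not taken from the paper:

```python
from sentence_transformers import SentenceTransformer

# Assumed checkpoint; the paper only names SentenceBert as the encoder.
encoder = SentenceTransformer("all-MiniLM-L6-v2")

# Hypothetical textual graph: node and edge attributes are plain strings.
node_texts = ["name: Barack Obama; type: person", "name: Hawaii; type: place"]
edge_texts = ["relation: born in"]

# Dense embeddings over which retrieval is performed.
node_emb = encoder.encode(node_texts, convert_to_tensor=True)  # shape: (num_nodes, d)
edge_emb = encoder.encode(edge_texts, convert_to_tensor=True)  # shape: (num_edges, d)
```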
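
The Experiment Setup row lists LoRA and optimizer hyperparameters. A sketch of how they could be wired up with Hugging Face peft and PyTorch; the base checkpoint id and target_modules are assumptions, and only the numeric values quoted above come from the paper:

```python
import torch
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

# Assumed checkpoint id; the paper only says "the open-source Llama2-7b".
model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b-hf", torch_dtype=torch.float16
)

lora_cfg = LoraConfig(
    r=8,                # lora_r: dimension for the LoRA update matrices
    lora_alpha=16,      # scaling factor
    lora_dropout=0.05,  # dropout rate
    target_modules=["q_proj", "v_proj"],  # assumed; not specified in the quote
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_cfg)

# Quoted optimizer settings: lr 1e-5, weight decay 0.05; the training loop
# would add warm-up with half-cycle cosine decay, batch size 4, 10 epochs,
# and early stopping with patience 2. Generation uses max_new_tokens=32.
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-5, weight_decay=0.05)

# The prompt-tuning baseline's 10 virtual tokens would correspond to
# peft's PromptTuningConfig(num_virtual_tokens=10).
```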