Graph Reasoning Transformers for Knowledge-Aware Question Answering
Authors: Ruilin Zhao, Feng Zhao, Liang Hu, Guandong Xu
AAAI 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments conducted on three knowledge-intensive QA benchmarks show that the GRT outperforms the state-of-the-art KG-augmented QA systems, demonstrating the effectiveness and adaptation of our proposed model. |
| Researcher Affiliation | Academia | Ruilin Zhao (1,3), Feng Zhao (1,*), Liang Hu (2), Guandong Xu (3). 1: Natural Language Processing and Knowledge Graph Lab, School of Computer Science and Technology, Huazhong University of Science and Technology, Wuhan, China; 2: College of Electronic and Information Engineering, Tongji University, Shanghai, China; 3: Data Science and Machine Intelligence Lab, University of Technology Sydney, Sydney, Australia. Emails: {ruilinzhao,zhaof}@hust.edu.cn, lianghu@tongji.edu.cn, guandong.xu@uts.edu.au |
| Pseudocode | No | The paper describes methods and equations but does not include any explicitly labeled 'Pseudocode' or 'Algorithm' blocks. |
| Open Source Code | Yes | Our code is available at https://github.com/HUSTNLP-codes/GRT |
| Open Datasets | Yes | Commonsense QA: This is a commonsense QA dataset... we adopt the in-house (IH) split (Lin et al. 2019) used in prior studies for evaluations. Openbook QA: This is a commonsense QA dataset... In this work, we utilize the official data splits (Mihaylov and Frank 2018) for our evaluations. Med QA-USMLE: This is a medical-domain QA dataset... In this work, we utilize the official data splits (Jin et al. 2020) for evaluation purposes. |
| Dataset Splits | Yes | For Commonsense QA: 'we adopt the in-house (IH) split (Lin et al. 2019) used in prior studies for evaluations.' For Openbook QA: 'we utilize the official data splits (Mihaylov and Frank 2018) for our evaluations.' For Med QA-USMLE: 'we utilize the official data splits (Jin et al. 2020) for evaluation purposes.' The tables also include 'IHdev-Acc', implying a development/validation set. (An illustrative dataset-loading sketch follows this table.) |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU models, CPU types, or memory) used for running the experiments. It only discusses software components and models. |
| Software Dependencies | No | The paper mentions using 'RoBERTa-large' and 'SapBERT-Base' as LM backbones, along with 'Clinical BERT' and 'BioBERT' for comparison. However, it does not provide specific version numbers for these models or for any other software libraries (e.g., PyTorch, TensorFlow, Python version) that would be needed for a reproducible setup. (An illustrative backbone-loading sketch follows this table.) |
| Experiment Setup | No | The paper describes the datasets and baseline models but does not provide specific hyperparameters (e.g., learning rate, batch size, number of epochs) or other detailed training configurations within the main text. |
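Because the paper states only which public splits were used, the following is a minimal sketch, assuming the HuggingFace `datasets` library, of how a reproducer might fetch the three benchmarks. The Hub dataset IDs (the MedQA-USMLE ID in particular) are assumptions, and the in-house (IH) CommonsenseQA split of Lin et al. (2019) is not what the official loader returns.

```python
# Illustrative sketch only (not from the paper): fetching the three benchmarks
# with the HuggingFace `datasets` library. Hub IDs are assumptions.
from datasets import load_dataset

# CommonsenseQA: the paper evaluates on the in-house (IH) split of
# Lin et al. (2019), which re-partitions the official train set; the
# official splits loaded here are NOT the IH split.
commonsense_qa = load_dataset("commonsense_qa")

# OpenBookQA: the paper uses the official splits.
openbook_qa = load_dataset("openbookqa", "main")

# MedQA-USMLE: released by Jin et al. (2020); this Hub ID is an assumption
# and may differ from the copy used in the paper.
med_qa = load_dataset("bigbio/med_qa", "med_qa_en_source")

# Quick check of split sizes.
print({split: len(ds) for split, ds in commonsense_qa.items()})
```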
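Similarly, the paper names RoBERTa-large and SapBERT-Base as LM backbones but pins no versions. Below is a minimal sketch, assuming the HuggingFace `transformers` library and commonly used Hub checkpoints; the SapBERT checkpoint ID in particular is an assumption, not one confirmed by the authors.

```python
# Illustrative sketch only (not from the paper): instantiating the named LM
# backbones. Checkpoint IDs and library versions are assumptions.
from transformers import AutoModel, AutoTokenizer

# General-domain backbone (used for CommonsenseQA / OpenBookQA in the paper).
roberta = AutoModel.from_pretrained("roberta-large")
roberta_tokenizer = AutoTokenizer.from_pretrained("roberta-large")

# Biomedical backbone (used for MedQA-USMLE in the paper); this is the commonly
# distributed SapBERT checkpoint, assumed here.
sapbert_id = "cambridgeltl/SapBERT-from-PubMedBERT-fulltext"
sapbert = AutoModel.from_pretrained(sapbert_id)
sapbert_tokenizer = AutoTokenizer.from_pretrained(sapbert_id)
```

Pinning exact `transformers` and `datasets` versions in a requirements file would close the gap noted in the Software Dependencies row.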