Towards Understanding Factual Knowledge of Large Language Models
Authors: Xuming Hu, Junzhe Chen, Xiaochuan Li, Yufei Guo, Lijie Wen, Philip S. Yu, Zhijiang Guo
ICLR 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments on different sizes and types of LLMs show that existing LLMs still lack factual knowledge and suffer from various spurious correlations. |
| Researcher Affiliation | Academia | Tsinghua University; The Hong Kong University of Science and Technology (Guangzhou); University of Illinois at Chicago; University of Cambridge |
| Pseudocode | No | The paper describes methods in text but does not include any clearly labeled 'Pseudocode' or 'Algorithm' blocks. |
| Open Source Code | Yes | The dataset Pinocchio and our codes are publicly available at: https://github.com/THU-BPM/Pinocchio. |
| Open Datasets | Yes | The dataset Pinocchio and our codes are publicly available at: https://github.com/THU-BPM/Pinocchio. |
| Dataset Splits | No | The paper describes the Pinocchio dataset and its subsets but does not explicitly state training/validation/test splits (e.g., percentages or counts) in the main text. |
| Hardware Specification | No | The paper does not explicitly describe the specific hardware used for running its experiments, such as GPU or CPU models. |
| Software Dependencies | No | The paper does not provide specific software dependencies, such as library names with version numbers, used to replicate the experiments. |
| Experiment Setup | No | While the paper describes the prompt strategies used in its experiments, it does not provide setup details such as hyperparameter values (e.g., learning rate, batch size) or other system-level training settings. |