reproducibilityindex.ai

Knowledge Card: Filling LLMs' Knowledge Gaps with Plug-in Specialized Language Models

Authors: Shangbin Feng, Weijia Shi, Yuyang Bai, Vidhisha Balachandran, Tianxing He, Yulia Tsvetkov

ICLR 2024

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Through extensive experiments, we demonstrate that KNOWLEDGE CARD achieves state-of-the-art performance on six benchmark datasets. Ultimately, KNOWLEDGE CARD framework enables dynamic synthesis and updates of knowledge from diverse domains.
Researcher Affiliation	Academia	Shangbin Feng¹, Weijia Shi¹, Yuyang Bai², Vidhisha Balachandran³, Tianxing He¹, Yulia Tsvetkov¹; ¹University of Washington, ²Xi'an Jiaotong University, ³Carnegie Mellon University
Pseudocode	Yes	Algorithm 1: Bottom-Up Approach ... Algorithm 2: Top-Down Approach
Open Source Code	Yes	Resources are available at https://github.com/BunsenFeng/Knowledge_Card.
Open Datasets	Yes	For general-purpose QA, we adopt MMLU (Hendrycks et al., 2020)... To evaluate multi-domain knowledge synthesis, we adopt misinformation detection... We leverage the widely adopted LUN misinformation detection dataset (Rashkin et al., 2017)...
Dataset Splits	No	The paper mentions '5-shot in-context learning setting' and an official 'demonstration set' for MMLU and MIDTERMQA, and '16-shot in-context learning' for LUN, which are used for few-shot learning. However, it does not specify a distinct 'validation' split with percentages or counts for hyperparameter tuning or model selection.
Hardware Specification	Yes	We used a GPU cluster with 16 NVIDIA A40 GPUs, 1988G memory, and 104 CPU cores for the experiments.
Software Dependencies	No	The paper lists specific models and tools used (e.g., OPT-1.3B, MPNet, Pegasus, Codex, FactKB, VitaminC) along with their citations, but it does not provide specific version numbers for the underlying software libraries or environment (e.g., Python version, PyTorch/TensorFlow version, CUDA version).
Experiment Setup	Yes	We present hyperparameter settings in Table 6. ... LEARNING RATE 2e-5, BATCH SIZE 32, MAX EPOCHS 10, OPTIMIZER ADAM, TEMPERATURE 0.1
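
The values quoted in the Experiment Setup row can be gathered into one configuration object, as a minimal sketch for anyone attempting a rerun. The class name `TrainingConfig` and its field names are hypothetical and do not come from the paper or its repository. The excerpt does not say which component each value applies to (for example, whether the temperature is a decoding setting), so the sketch only records the reported numbers.

```python
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class TrainingConfig:
    """Hypothetical container for the hyperparameters quoted from Table 6."""

    learning_rate: float = 2e-5  # "LEARNING RATE 2e-5"
    batch_size: int = 32         # "BATCH SIZE 32"
    max_epochs: int = 10         # "MAX EPOCHS 10"
    optimizer: str = "adam"      # "OPTIMIZER ADAM"
    temperature: float = 0.1     # "TEMPERATURE 0.1" (component unspecified in the excerpt)


config = TrainingConfig()

# Dump the settings as a plain dict, e.g. for logging alongside results.
settings = asdict(config)
print(settings)
```

Freezing the dataclass keeps the recorded settings from being changed accidentally during a run. Any value not listed in the row (warmup, weight decay, random seed) is deliberately left out rather than guessed.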