Knowledge-in-Context: Towards Knowledgeable Semi-Parametric Language Models
Authors: Xiaoman Pan, Wenlin Yao, Hongming Zhang, Dian Yu, Dong Yu, Jianshu Chen
ICLR 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | By evaluating on 40+ different tasks, we show that KiC-Large with 770M parameters easily outperforms large language models that are 4-39x larger. In addition, KiC also exhibits emergent abilities at a much smaller model scale compared to the fully-parametric models. |
| Researcher Affiliation | Industry | Tencent AI Lab, Bellevue, WA 98004, USA |
| Pseudocode | No | The paper describes methods and equations but does not include any explicitly labeled pseudocode or algorithm blocks. |
| Open Source Code | No | The paper refers to using existing open-source models like MPNet ('We use All-MPNet-base-v2 as the encoder... and we use the publically available model checkpoint') but does not provide an explicit statement or link to the source code for the KiC model itself. |
| Open Datasets | Yes | We adopt the same setting as T0 (Sanh et al., 2022), where we train KiC models on a collection of tasks and then evaluate on another set of unseen tasks in a zero-shot manner. ... We train our KiC model on a mixture of multiple tasks (39 tasks in total) by combining and shuffling all training instances from different tasks (8.4M in total). |
| Dataset Splits | Yes | Following standard approaches, we choose the prompt that yields the best accuracy (%) on the validation set. ... we reproduce T0-Large with the same collection of tasks and evaluate KiC-Large on the validation set of each in-domain task (Table 4). |
| Hardware Specification | Yes | Our final KiC-Large model is trained with 128 V100 GPUs for 42 hours. |
| Software Dependencies | No | The paper mentions software like T5, MPNet, and SCaNN, but does not provide specific version numbers for these or other underlying software libraries (e.g., Python, PyTorch, TensorFlow). |
| Experiment Setup | Yes | The hyper-parameters for training KiC-Base and KiC-Large are listed in Table 9. In addition, the hyper-parameters of single-task finetuning are listed in Table 10. Table 9 includes: Learning Rate, Max. Input Length, Max. Output Length, Batch Size, α, # epoch, Max. Knowledge Pieces. |