(Comet-) Atomic 2020: On Symbolic and Neural Commonsense Knowledge Graphs

Authors: Jena D. Hwang, Chandra Bhagavatula, Ronan Le Bras, Jeff Da, Keisuke Sakaguchi, Antoine Bosselut, Yejin Choi

AAAI 2021, pp. 6384-6392

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We evaluate its properties in comparison with other leading CSKGs, performing the first large-scale pairwise study of commonsense knowledge resources. Next, we show that ATOMIC 2020 is better suited for training knowledge models that can generate accurate, representative knowledge for new, unseen entities and events. Finally, through human evaluation, we show that the few-shot performance of GPT-3 (175B parameters), while impressive, remains 12 absolute points lower than a BART-based knowledge model trained on ATOMIC 2020 despite using over 430x fewer parameters. (The 430x parameter ratio is sanity-checked in a sketch after this table.)
Researcher Affiliation | Collaboration | Jena D. Hwang (1), Chandra Bhagavatula (1), Ronan Le Bras (1), Jeff Da (1), Keisuke Sakaguchi (1), Antoine Bosselut (1,3), and Yejin Choi (1,2); affiliations: 1 Allen Institute for AI, WA, USA; 2 Paul G. Allen School of Computer Science & Engineering, WA, USA; 3 Stanford University, CA, USA
Pseudocode | No | The paper describes its models and methods in detail but does not include any explicit pseudocode blocks, algorithms, or structured procedural steps formatted like code.
Open Source Code | No | The paper does not contain any statements about releasing its source code, nor does it provide a link to a code repository for the methodology described.
Open Datasets | Yes | In this work, we evaluate three existing knowledge graphs, CONCEPTNET, ATOMIC, and TRANSOMCS, on their coverage and precision relative to our new resource ATOMIC 2020. The CONCEPTNET (v5.7) knowledge graph (Speer, Chin, and Havasi 2017)... The ATOMIC (Sap et al. 2019) knowledge graph... The TRANSOMCS (Zhang et al. 2020a) knowledge graph...
Dataset Splits | Yes | To evaluate whether knowledge graphs can help language models effectively transfer to knowledge models, we train different pretrained language models on the knowledge graphs described in Section 4... We split each knowledge graph into training, validation, and test sets such that the heads of the knowledge tuples do not overlap between these sets. (A head-disjoint split is sketched after this table.)
Hardware Specification | No | Computations on beaker.org were supported in part by credits from Google Cloud. TPU machines for conducting experiments were provided by Google.
Software Dependencies | No | The paper mentions using GPT2, BART, and GPT-3 models but does not specify the version numbers of any software libraries, frameworks (e.g., PyTorch, TensorFlow), or other dependencies used for implementation.
Experiment Setup | Yes | The hyperparameter settings used for training are described in more detail in Appendix. Additionally, we use GPT2-XL in a zero-shot setting as a baseline to measure the effect of transfer learning on knowledge graphs. ... Additional training details are provided in Appendix. ... Additional details of our implementation are provided in Appendix. (A minimal zero-shot prompting sketch follows this table.)
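The "over 430x fewer parameters" figure quoted under Research Type can be sanity-checked with back-of-the-envelope arithmetic. The sketch below assumes the BART-based knowledge model is BART-large at roughly 406M parameters (an assumption; the excerpt above does not state the exact count), while GPT-3's 175B parameters are stated in the paper.

```python
# Back-of-the-envelope check of the "over 430x fewer parameters" claim.
# Assumption: the BART-based knowledge model is BART-large (~406M parameters).
gpt3_params = 175e9   # stated in the paper (175B)
bart_params = 406e6   # approximate BART-large size (assumed)

ratio = gpt3_params / bart_params
print(f"GPT-3 is roughly {ratio:.0f}x larger than BART-large")  # ~431x
```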
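The head-disjoint split described under Dataset Splits groups tuples by their head before dividing them, so no head appears in more than one split. The following is a minimal sketch of that idea, not the authors' released preprocessing code; the (head, relation, tail) tuple format and the 80/10/10 proportions are assumptions.

```python
import random
from collections import defaultdict

def head_disjoint_split(tuples, ratios=(0.8, 0.1, 0.1), seed=0):
    """Split (head, relation, tail) tuples into train/dev/test so that no head
    appears in more than one split. Ratios and exact-string grouping are assumed."""
    by_head = defaultdict(list)
    for head, rel, tail in tuples:
        by_head[head].append((head, rel, tail))

    heads = list(by_head)
    random.Random(seed).shuffle(heads)

    n_train = int(ratios[0] * len(heads))
    n_dev = int(ratios[1] * len(heads))
    buckets = {
        "train": heads[:n_train],
        "dev": heads[n_train:n_train + n_dev],
        "test": heads[n_train + n_dev:],
    }
    return {name: [t for h in hs for t in by_head[h]] for name, hs in buckets.items()}

# Toy ATOMIC-style tuples for illustration only.
kg = [
    ("PersonX pays the bill", "xIntent", "to be polite"),
    ("PersonX pays the bill", "xEffect", "has less money"),
    ("PersonX adopts a cat", "xWant", "to buy cat food"),
]
splits = head_disjoint_split(kg)
print({name: len(items) for name, items in splits.items()})
```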
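The GPT2-XL zero-shot baseline noted under Experiment Setup amounts to prompting an untuned language model with a head event and relation and reading off the generated tail. The sketch below is a hedged approximation using the Hugging Face transformers library; the natural-language prompt wording and greedy decoding are assumptions, not the paper's exact protocol.

```python
# Minimal zero-shot knowledge-completion sketch with GPT2-XL.
# Assumption: the "PersonX ... As a result, PersonX wants" phrasing stands in
# for whatever prompt format the paper actually used.
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2-xl")
model = AutoModelForCausalLM.from_pretrained("gpt2-xl")

prompt = "PersonX pays the bill. As a result, PersonX wants"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(
    **inputs,
    max_new_tokens=10,
    do_sample=False,                      # greedy decoding for determinism
    pad_token_id=tokenizer.eos_token_id,  # silence the missing-pad warning
)
generated = outputs[0][inputs["input_ids"].shape[1]:]
print(tokenizer.decode(generated, skip_special_tokens=True))
```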