(Comet-) Atomic 2020: On Symbolic and Neural Commonsense Knowledge Graphs

Authors: Jena D. Hwang, Chandra Bhagavatula, Ronan Le Bras, Jeff Da, Keisuke Sakaguchi, Antoine Bosselut, Yejin Choi

AAAI 2021, pp. 6384-6392

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We evaluate its properties in comparison with other leading CSKGs, performing the first large-scale pairwise study of commonsense knowledge resources. Next, we show that ATOMIC 2020 is better suited for training knowledge models that can generate accurate, representative knowledge for new, unseen entities and events. Finally, through human evaluation, we show that the few-shot performance of GPT-3 (175B parameters), while impressive, remains 12 absolute points lower than a BART-based knowledge model trained on ATOMIC 2020 despite using over 430x fewer parameters. (The 430x parameter ratio is sanity-checked in a sketch after this table.)
Researcher Affiliation | Collaboration | Jena D. Hwang (1), Chandra Bhagavatula (1), Ronan Le Bras (1), Jeff Da (1), Keisuke Sakaguchi (1), Antoine Bosselut (1,3), and Yejin Choi (1,2); affiliations: 1 Allen Institute for AI, WA, USA; 2 Paul G. Allen School of Computer Science & Engineering, WA, USA; 3 Stanford University, CA, USA
Pseudocode | No | The paper describes its models and methods in detail but does not include any explicit pseudocode blocks, algorithms, or structured procedural steps formatted like code.
Open Source Code | No | The paper does not contain any statements about releasing its source code, nor does it provide a link to a code repository for the methodology described.
Open Datasets | Yes | In this work, we evaluate three existing knowledge graphs, CONCEPTNET, ATOMIC, and TRANSOMCS, on their coverage and precision relative to our new resource ATOMIC 2020. The CONCEPTNET (v5.7) knowledge graph (Speer, Chin, and Havasi 2017)... The ATOMIC (Sap et al. 2019) knowledge graph... The TRANSOMCS (Zhang et al. 2020a) knowledge graph...
Dataset Splits | Yes | To evaluate whether knowledge graphs can help language models effectively transfer to knowledge models, we train different pretrained language models on the knowledge graphs described in Section 4... We split each knowledge graph into training, validation, and test sets such that the heads of the knowledge tuples do not overlap between these sets. (A head-disjoint split is sketched after this table.)
Hardware Specification | No | Computations on beaker.org were supported in part by credits from Google Cloud. TPU machines for conducting experiments were provided by Google.
Software Dependencies | No | The paper mentions using GPT2, BART, and GPT-3 models but does not specify the version numbers of any software libraries, frameworks (e.g., PyTorch, TensorFlow), or other dependencies used for implementation.
Experiment Setup | Yes | The hyperparameter settings used for training are described in more detail in Appendix. Additionally, we use GPT2-XL in a zero-shot setting as a baseline to measure the effect of transfer learning on knowledge graphs. ... Additional training details are provided in Appendix. ... Additional details of our implementation are provided in Appendix. (A minimal zero-shot prompting sketch follows this table.)
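The "over 430x fewer parameters" figure quoted under Research Type can be sanity-checked with back-of-the-envelope arithmetic. The sketch below assumes the BART-based knowledge model is BART-large at roughly 406M parameters (an assumption; the excerpt above does not state the exact count), while GPT-3's 175B parameters are stated in the paper.

```python
# Back-of-the-envelope check of the "over 430x fewer parameters" claim.
# Assumption: the BART-based knowledge model is BART-large (~406M parameters).
gpt3_params = 175e9   # stated in the paper (175B)
bart_params = 406e6   # approximate BART-large size (assumed)

ratio = gpt3_params / bart_params
print(f"GPT-3 is roughly {ratio:.0f}x larger than BART-large")  # ~431x
```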
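The head-disjoint split described under Dataset Splits groups tuples by their head before dividing them, so no head appears in more than one split. The following is a minimal sketch of that idea, not the authors' released preprocessing code; the (head, relation, tail) tuple format and the 80/10/10 proportions are assumptions.

```python
import random
from collections import defaultdict

def head_disjoint_split(tuples, ratios=(0.8, 0.1, 0.1), seed=0):
    """Split (head, relation, tail) tuples into train/dev/test so that no head
    appears in more than one split. Ratios and exact-string grouping are assumed."""
    by_head = defaultdict(list)
    for head, rel, tail in tuples:
        by_head[head].append((head, rel, tail))

    heads = list(by_head)
    random.Random(seed).shuffle(heads)

    n_train = int(ratios[0] * len(heads))
    n_dev = int(ratios[1] * len(heads))
    buckets = {
        "train": heads[:n_train],
        "dev": heads[n_train:n_train + n_dev],
        "test": heads[n_train + n_dev:],
    }
    return {name: [t for h in hs for t in by_head[h]] for name, hs in buckets.items()}

# Toy ATOMIC-style tuples for illustration only.
kg = [
    ("PersonX pays the bill", "xIntent", "to be polite"),
    ("PersonX pays the bill", "xEffect", "has less money"),
    ("PersonX adopts a cat", "xWant", "to buy cat food"),
]
splits = head_disjoint_split(kg)
print({name: len(items) for name, items in splits.items()})
```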
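The GPT2-XL zero-shot baseline noted under Experiment Setup amounts to prompting an untuned language model with a head event and relation and reading off the generated tail. The sketch below is a hedged approximation using the Hugging Face transformers library; the natural-language prompt wording and greedy decoding are assumptions, not the paper's exact protocol.

```python
# Minimal zero-shot knowledge-completion sketch with GPT2-XL.
# Assumption: the "PersonX ... As a result, PersonX wants" phrasing stands in
# for whatever prompt format the paper actually used.
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2-xl")
model = AutoModelForCausalLM.from_pretrained("gpt2-xl")

prompt = "PersonX pays the bill. As a result, PersonX wants"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(
    **inputs,
    max_new_tokens=10,
    do_sample=False,                      # greedy decoding for determinism
    pad_token_id=tokenizer.eos_token_id,  # silence the missing-pad warning
)
generated = outputs[0][inputs["input_ids"].shape[1]:]
print(tokenizer.decode(generated, skip_special_tokens=True))
```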