(Comet-)Atomic 2020: On Symbolic and Neural Commonsense Knowledge Graphs
Authors: Jena D. Hwang, Chandra Bhagavatula, Ronan Le Bras, Jeff Da, Keisuke Sakaguchi, Antoine Bosselut, Yejin Choi
AAAI 2021 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate its properties in comparison with other leading CSKGs, performing the first large-scale pairwise study of commonsense knowledge resources. Next, we show that ATOMIC 2020 is better suited for training knowledge models that can generate accurate, representative knowledge for new, unseen entities and events. Finally, through human evaluation, we show that the few-shot performance of GPT-3 (175B parameters), while impressive, remains 12 absolute points lower than a BART-based knowledge model trained on ATOMIC 2020 despite using over 430x fewer parameters. |
| Researcher Affiliation | Collaboration | Jena D. Hwang1, Chandra Bhagavatula1, Ronan Le Bras1, Jeff Da1, Keisuke Sakaguchi1, Antoine Bosselut1,3 and Yejin Choi1,2. 1 Allen Institute for AI, WA, USA; 2 Paul G. Allen School of Computer Science & Engineering, WA, USA; 3 Stanford University, CA, USA |
| Pseudocode | No | The paper describes its models and methods in detail but does not include any explicit pseudocode blocks, algorithms, or structured procedural steps formatted like code. |
| Open Source Code | No | The paper does not contain any statements about releasing its source code, nor does it provide a link to a code repository for the methodology described. |
| Open Datasets | Yes | In this work, we evaluate three existing knowledge graphs, CONCEPTNET, ATOMIC, and TRANSOMCS on their coverage and precision relative to our new resource ATOMIC 2020. The CONCEPTNET (v5.7) knowledge graph (Speer, Chin, and Havasi 2017)... The ATOMIC (Sap et al. 2019) knowledge graph... The TRANSOMCS (Zhang et al. 2020a) knowledge graph... |
| Dataset Splits | Yes | To evaluate whether knowledge graphs can help language models effectively transfer to knowledge models, we train different pretrained language models on the knowledge graphs described in Section 4... We split each knowledge graph into training, validation, and test sets such that the heads of the knowledge tuples do not overlap between these sets. (A minimal sketch of such a head-disjoint split is given after the table.) |
| Hardware Specification | No | Computations on beaker.org were supported in part by credits from Google Cloud. TPU machines for conducting experiments were provided by Google. |
| Software Dependencies | No | The paper mentions using GPT2, BART, and GPT-3 models but does not specify the version numbers of any software libraries, frameworks (e.g., PyTorch, TensorFlow), or other dependencies used for implementation. |
| Experiment Setup | Yes | The hyperparameter settings used for training are described in more detail in Appendix. Additionally, we use GPT2-XL in a zero-shot setting as a baseline to measure the effect of transfer learning on knowledge graphs. ... Additional training details are provided in Appendix. ... Additional details of our implementation are provided in Appendix. |
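
The head-disjoint split quoted in the Dataset Splits row can be illustrated with a short sketch: assign each unique head entity to exactly one split, then route every tuple according to its head. The function name `head_disjoint_split`, the split fractions, and the example tuples below are illustrative assumptions, not the authors' released code; the paper's appendix describes the actual procedure.

```python
import random

def head_disjoint_split(tuples, train_frac=0.8, valid_frac=0.1, seed=0):
    """Split (head, relation, tail) tuples so that no head entity
    appears in more than one of the train/validation/test sets.
    NOTE: a hypothetical sketch, not the paper's implementation."""
    # Collect and shuffle the unique heads deterministically.
    heads = sorted({h for h, _, _ in tuples})
    random.Random(seed).shuffle(heads)

    n_train = int(len(heads) * train_frac)
    n_valid = int(len(heads) * valid_frac)
    train_heads = set(heads[:n_train])
    valid_heads = set(heads[n_train:n_train + n_valid])
    # All remaining heads go to the test set.

    splits = {"train": [], "valid": [], "test": []}
    for h, r, t in tuples:
        if h in train_heads:
            splits["train"].append((h, r, t))
        elif h in valid_heads:
            splits["valid"].append((h, r, t))
        else:
            splits["test"].append((h, r, t))
    return splits

# Example with made-up tuples in the ATOMIC-style head/relation/tail shape.
kg = [
    ("PersonX pays the bill", "xIntent", "to be polite"),
    ("PersonX pays the bill", "xEffect", "has less money"),
    ("PersonX adopts a cat", "xWant", "to care for it"),
]
splits = head_disjoint_split(kg)
print({name: len(items) for name, items in splits.items()})
```

Because both tuples sharing the head "PersonX pays the bill" land in the same split, a model evaluated on the held-out sets is always queried about unseen head entities or events, which is what the paper's transfer evaluation requires.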