Dynamic Neuro-Symbolic Knowledge Graph Construction for Zero-shot Commonsense Question Answering
Authors: Antoine Bosselut, Ronan Le Bras, Yejin Choi (pp. 4923-4931)
AAAI 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Empirical results on two datasets demonstrate the efficacy of our neuro-symbolic approach for dynamically constructing knowledge graphs for reasoning. Our approach achieves significant performance boosts over pretrained language models and vanilla knowledge models, all while providing interpretable reasoning paths for its predictions. |
| Researcher Affiliation | Collaboration | Antoine Bosselut,1,2 Ronan Le Bras,2 Yejin Choi3,2 1Stanford University 2Allen Institute for Artificial Intelligence 3Paul G. Allen School of Computer Science & Engineering, University of Washington antoineb@cs.stanford.edu, {ronanlb,yejinc}@allenai.org |
| Pseudocode | Yes | Table of decoding algorithms (columns: Algorithm, # nodes, # edges, φℓa, φℓa max): Argmax Decoding: 10.6, 26.4, 50.1, 49.6; Beam Search 5: 43.2, 156.8, 49.5, 49.1; Beam Search 10: 83.0, 316.2, 50.0, 49.1; Top-5 sampling: 32.0, 111.9, 49.0, 49.0; Top-10 sampling: 59.9, 223.8, 49.3, 49.4 |
| Open Source Code | No | The paper does not provide a direct link or explicit statement about the release of source code for the described methodology. |
| Open Datasets | Yes | We evaluate our method on two datasets: Social IQa (Sap et al. 2019b) and Story CS (Rashkin et al. 2018). |
| Dataset Splits | Yes | Additionally, our approach of dynamically constructing a knowledge graph on demand (COMET Dyna Gen) performs better than using the knowledge model to directly evaluate answers (COMET Direct) by 3.6%, highlighting the value in representing more complex reasoning paths. Finally, the improvement over Self-Talk depicts the benefit of using a structured graphical representation for reasoning compared to one that uses language models to generate additional situational context sentences for conditioning. |
| Hardware Specification | No | The paper does not explicitly describe the hardware used to run its experiments with specific model numbers or specifications. |
| Software Dependencies | No | The paper mentions using GPT2-345M (Radford et al. 2019) as the pretrained language model that seeds COMET, but it does not specify software versions for other dependencies like Python, PyTorch, etc. |
| Experiment Setup | Yes | We use most of the same hyperparameters to train the COMET model on the Atomic knowledge graph as in Bosselut et al. (2019). However, we use GPT2-345M (Radford et al. 2019) as the pretrained language model that seeds COMET and freeze the position embeddings so we can generalize to longer contexts. The number of levels in the graph L is set to 2. As we operate in the zero-shot setting, we do not tune hyperparameters. For the Social IQa dataset, we set γg = γga = 1.0 and βℓ = 1.0 ∀ℓ. For Story CS, we do the same except that γg = 0. Unless stated otherwise, we use argmax decoding to generate inferences from COMET, and use variable elimination over the graph to select answers. |
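The setup row above describes a two-level graph (L = 2) whose generated inference nodes are scored with per-level weights βℓ (set to 1.0 for every level in the zero-shot setting) before an answer is selected. A minimal sketch of how such level-weighted scores might be aggregated per candidate answer, assuming argmax decoding keeps the single best path score at each level; all function and variable names here are hypothetical illustrations, not the authors' released code:

```python
def score_answer(path_logprobs_by_level, beta=None):
    """Aggregate log-probabilities of generated inference paths for one answer.

    path_logprobs_by_level: dict mapping graph level ell (1..L) to a list of
    log-probabilities of candidate inference nodes at that level.
    beta: optional dict of per-level weights; defaults to 1.0 for every
    level, mirroring beta_ell = 1.0 for all ell in the zero-shot setup.
    """
    total = 0.0
    for ell, logps in path_logprobs_by_level.items():
        weight = 1.0 if beta is None else beta[ell]
        # Keep only the highest-scoring node per level, analogous to
        # argmax decoding retaining the single most likely inference.
        total += weight * max(logps)
    return total


def select_answer(candidates):
    """Pick the candidate answer whose aggregated path score is highest.

    candidates: dict mapping answer text to its path_logprobs_by_level.
    """
    return max(candidates, key=lambda a: score_answer(candidates[a]))
```

For example, a candidate whose level-1 and level-2 inferences have log-probabilities -0.5 and -1.0 would beat one with -1.0 and -2.0. The paper's actual selection step runs variable elimination over the constructed graph; this sketch only illustrates the level-weighted aggregation idea.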