Conversational Neuro-Symbolic Commonsense Reasoning

Authors: Forough Arabshahi, Jennifer Lee, Mikayla Gawarecki, Kathryn Mazaitis, Amos Azaria, Tom Mitchell (pp. 4902–4911)

AAAI 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We release a benchmark data set for this task, collected from humans and annotated with commonsense presumptions. We present a neuro-symbolic theorem prover that extracts multi-hop reasoning chains, and apply it to this problem. Furthermore, to accommodate the reality that current AI commonsense systems lack full coverage, we also present an interactive conversational framework built on our neuro-symbolic system, that conversationally evokes commonsense knowledge from humans to complete its reasoning chains. Our user-study shows (a) the plausibility of relying on humans to evoke commonsense knowledge and (b) the effectiveness of our theorem prover, enabling us to extract reasoning chains for up to 45% of the studied tasks. (A minimal illustrative sketch of this prove-or-ask loop appears after this table.)
Researcher Affiliation | Collaboration | 1Facebook, 2Carnegie Mellon University, 3Ariel University; {forough, jenniferlee98}@fb.com, {mgawarec, krivard}@cs.cmu.edu, amos.azaria@ariel.ac.il, tom.mitchell@cs.cmu.edu
Pseudocode | No | The paper provides a flowchart (Figure 1) to illustrate CORGI's process but does not include formal pseudocode or an algorithm block.
Open Source Code | Yes | The code and data are available here: https://github.com/ForoughA/CORGI
Open Datasets | Yes | The code and data are available here: https://github.com/ForoughA/CORGI
Dataset Splits | No | The paper mentions training a neuro-symbolic theorem prover on proof traces but does not provide specific details about training, validation, or test dataset splits (e.g., percentages or sample counts).
Hardware Specification | No | The paper does not provide specific hardware details (e.g., CPU/GPU models, memory) used for running the experiments.
Software Dependencies | No | The neural components of the theorem prover are implemented in PyTorch (Paszke et al. 2017) and the prover is built on top of sPyrolog. (No version numbers are provided for PyTorch or sPyrolog.)
Experiment Setup | Yes | Mrule and Mvar are initialized randomly and with GloVe embeddings (Pennington, Socher, and Manning 2014), respectively, where m1 = 256 and m2 = 300. (Initialization and dimension details are specific experimental setup details.)
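
The Experiment Setup row quotes the paper's embedding initialization: Mrule is initialized randomly with dimension m1 = 256, and Mvar is initialized from pre-trained GloVe vectors with dimension m2 = 300. The following is a minimal PyTorch sketch of that setup, not the authors' code; the vocabulary sizes and the stand-in GloVe tensor are hypothetical, since the quoted excerpt does not specify them.

    import torch
    import torch.nn as nn

    NUM_RULES, NUM_VARS = 1000, 5000   # hypothetical vocabulary sizes
    M1, M2 = 256, 300                  # dimensions reported in the paper

    # M_rule: randomly initialized rule embeddings of dimension m1 = 256.
    M_rule = nn.Embedding(NUM_RULES, M1)

    # M_var: variable embeddings of dimension m2 = 300, initialized from
    # pre-trained GloVe vectors; this random tensor stands in for real GloVe.
    glove_vectors = torch.randn(NUM_VARS, M2)
    M_var = nn.Embedding.from_pretrained(glove_vectors, freeze=False)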
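
For the conversational reasoning loop quoted in the Research Type row, the sketch below illustrates the general idea in plain Python: a backward-chaining prover over Horn-style rules that, when it cannot complete a reasoning chain from its knowledge base, asks the human for the missing commonsense fact. It is an illustration under assumed rule and fact formats, not the CORGI neuro-symbolic prover; the example knowledge base and the ask_user prompt are hypothetical.

    facts = {("raining", "outside")}
    rules = [
        # head <- body: "close the windows if it is raining and the windows are open"
        (("close", "windows"), [("raining", "outside"), ("open", "windows")]),
    ]

    def ask_user(goal):
        """Conversational fallback: ask the human whether the missing fact holds."""
        answer = input(f"Is it true that {goal}? (y/n) ")
        return answer.strip().lower().startswith("y")

    def prove(goal, depth=0, max_depth=5):
        """Backward chaining over `rules`; fall back to asking the user."""
        if goal in facts:
            return True
        if depth >= max_depth:
            return False
        for head, body in rules:
            if head == goal and all(prove(g, depth + 1, max_depth) for g in body):
                return True
        # Reasoning chain incomplete: evoke the missing knowledge conversationally.
        if ask_user(goal):
            facts.add(goal)   # remember the elicited fact for later proofs
            return True
        return False

    print(prove(("close", "windows")))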