Large Language Models Are Neurosymbolic Reasoners
Authors: Meng Fang, Shilong Deng, Yudi Zhang, Zijing Shi, Ling Chen, Mykola Pechenizkiy, Jun Wang
AAAI 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our experimental results demonstrate that our method significantly enhances the capability of LLMs as automated agents for symbolic reasoning, and our LLM agent is effective in text-based games involving symbolic tasks, achieving an average performance of 88% across all tasks. Experiments: We demonstrate the potential of LLMs in serving as neurosymbolic reasoners for text-based games. In particular, we present experimental results on four text-based games that involve different symbolic tasks. In these tasks, we observe that LLMs can effectively function as symbolic reasoners. |
| Researcher Affiliation | Academia | Meng Fang*1,2, Shilong Deng*1, Yudi Zhang*2, Zijing Shi3, Ling Chen3, Mykola Pechenizkiy2, Jun Wang4 1University of Liverpool, United Kingdom 2Eindhoven University of Technology, Netherlands 3University of Technology Sydney, Australia 4University College London, United Kingdom |
| Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | Code at: https://github.com/hyintell/LLMSymbolic. |
| Open Datasets | Yes | Environments: We use four text-based game benchmark environments (Wang et al. 2022b). The evaluation includes four text-based games involving symbolic tasks. Each task is divided into Train, Dev, and Test sets. |
| Dataset Splits | Yes | Each task is divided into Train, Dev, and Test sets. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., exact GPU/CPU models, processor types with speeds, memory amounts, or detailed computer specifications) used for running its experiments. |
| Software Dependencies | No | For our LLM agent, we use GPT-3.5-turbo. The paper mentions the use of GPT-3.5-turbo and refers to OpenAI, but it does not list specific software libraries or frameworks with version numbers (e.g., PyTorch 1.9, TensorFlow 2.x). |
| Experiment Setup | Yes | Role Initialization: We initialize the agent by providing it with task descriptions and action constraints. Action Query: This step is repeated at each timestep. We prompt the LLM agent with the current observation, inventory state, valid action set, and a question. The prompting format for role initialization and action query at each timestep is provided in Table 2. The prompting format for adding constraints on the actions of an agent is provided in Table 3. |
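
The Experiment Setup row describes a two-stage prompting scheme (role initialization once, then an action query at each timestep). A minimal sketch of how such prompts might be assembled is shown below; the function names and prompt wording are illustrative assumptions, not the exact templates from the paper's Tables 2 and 3.

```python
# Hypothetical sketch of the two-stage prompting scheme described in the
# paper: a one-time role-initialization prompt, then a per-timestep
# action-query prompt. Exact wording is an assumption, not the paper's.

def build_role_init_prompt(task_description: str, constraints: str) -> str:
    """Role Initialization: give the agent its task and action constraints."""
    return (
        "You are an agent in a text-based game.\n"
        f"Task: {task_description}\n"
        f"Action constraints: {constraints}"
    )

def build_action_query(observation: str, inventory: str, valid_actions: list) -> str:
    """Action Query: repeated at each timestep with the current game state."""
    actions = "\n".join(f"- {a}" for a in valid_actions)
    return (
        f"Observation: {observation}\n"
        f"Inventory: {inventory}\n"
        f"Valid actions:\n{actions}\n"
        "Which action should you take next? Answer with one valid action."
    )

# Example usage; a real run would send these prompts to GPT-3.5-turbo.
init_prompt = build_role_init_prompt(
    "Solve the arithmetic puzzle.",
    "Only choose actions from the valid action list.",
)
query_prompt = build_action_query(
    "You see a locked box.", "a small key", ["take key", "unlock box"]
)
```

In a real loop, `query_prompt` would be rebuilt and sent to the LLM at every timestep, with the chosen action fed back into the game environment.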