Large Language Models Are Neurosymbolic Reasoners
Authors: Meng Fang, Shilong Deng, Yudi Zhang, Zijing Shi, Ling Chen, Mykola Pechenizkiy, Jun Wang
AAAI 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our experimental results demonstrate that our method significantly enhances the capability of LLMs as automated agents for symbolic reasoning, and our LLM agent is effective in text-based games involving symbolic tasks, achieving an average performance of 88% across all tasks. Experiments: We demonstrate the potential of LLMs in serving as neurosymbolic reasoners for text-based games. In particular, we present experimental results on four text-based games that involve different symbolic tasks. In these tasks, we observe that LLMs can effectively function as symbolic reasoners. |
| Researcher Affiliation | Academia | Meng Fang*1,2, Shilong Deng*1, Yudi Zhang*2, Zijing Shi3, Ling Chen3, Mykola Pechenizkiy2, Jun Wang4 1University of Liverpool, United Kingdom 2Eindhoven University of Technology, Netherlands 3University of Technology Sydney, Australia 4University College London, United Kingdom |
| Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | Code at: https://github.com/hyintell/LLMSymbolic. |
| Open Datasets | Yes | Environments: We use four text-based game benchmark environments (Wang et al. 2022b). The evaluation includes four text-based games involving symbolic tasks. Each task is divided into Train, Dev, and Test sets. |
| Dataset Splits | Yes | Each task is divided into Train, Dev, and Test sets. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., exact GPU/CPU models, processor types with speeds, memory amounts, or detailed computer specifications) used for running its experiments. |
| Software Dependencies | No | For our LLM agent, we use GPT-3.5-turbo. The paper mentions the use of GPT-3.5-turbo and refers to OpenAI, but it does not list specific software libraries or frameworks with version numbers (e.g., PyTorch 1.9, TensorFlow 2.x). |
| Experiment Setup | Yes | Role Initialization: We initialize the agent by providing it with task descriptions and action constraints. Action Query: This step is repeated at each timestep. We prompt the LLM agent with the current observation, inventory state, valid action set, and a question. The prompting format for role initialization and action query at each timestep is provided in Table 2. The prompting format for adding constraints on the actions of an agent is provided in Table 3. |
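
The Experiment Setup row describes a two-stage prompting scheme (role initialization once, then an action query at each timestep). A minimal sketch of how such prompts might be assembled is shown below; the function names and prompt wording are illustrative assumptions, not the exact templates from the paper's Tables 2 and 3.

```python
# Hypothetical sketch of the two-stage prompting scheme described in the
# paper: a one-time role-initialization prompt, then a per-timestep
# action-query prompt. Exact wording is an assumption, not the paper's.

def build_role_init_prompt(task_description: str, constraints: str) -> str:
    """Role Initialization: give the agent its task and action constraints."""
    return (
        "You are an agent in a text-based game.\n"
        f"Task: {task_description}\n"
        f"Action constraints: {constraints}"
    )

def build_action_query(observation: str, inventory: str, valid_actions: list) -> str:
    """Action Query: repeated at each timestep with the current game state."""
    actions = "\n".join(f"- {a}" for a in valid_actions)
    return (
        f"Observation: {observation}\n"
        f"Inventory: {inventory}\n"
        f"Valid actions:\n{actions}\n"
        "Which action should you take next? Answer with one valid action."
    )

# Example usage; a real run would send these prompts to GPT-3.5-turbo.
init_prompt = build_role_init_prompt(
    "Solve the arithmetic puzzle.",
    "Only choose actions from the valid action list.",
)
query_prompt = build_action_query(
    "You see a locked box.", "a small key", ["take key", "unlock box"]
)
```

In a real loop, `query_prompt` would be rebuilt and sent to the LLM at every timestep, with the chosen action fed back into the game environment.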