reproducibilityindex.ai

MindMap: Constructing Evidence Chains for Multi-Step Reasoning in Large Language Models

Authors: Yangyu Wu, Xu Han, Wei Song, Miaomiao Cheng, Fei Li

AAAI 2024

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	The experimental results on the bAbI and ProofWriter OWA datasets demonstrate the effectiveness of MindMap.
Researcher Affiliation	Academia	(1) Beijing Key Laboratory of Electronic System Reliability Technology, College of Information Engineering, Capital Normal University, China; (2) School of Cyber Science and Engineering, Wuhan University, China
Pseudocode	No	The paper includes workflow diagrams (Figure 1 and Figure 2) but does not contain structured pseudocode or algorithm blocks clearly labeled as "Pseudocode" or "Algorithm".
Open Source Code	No	The paper does not contain any statement about releasing its source code for the described methodology or a link to a repository.
Open Datasets	Yes	Our experiments are conducted using two challenging multi-step logical reasoning datasets. bAbI (Weston et al. 2016): [...] ProofWriter OWA (Tafjord, Dalvi, and Clark 2021):
Dataset Splits	No	The paper mentions using specific tasks (tasks 1-3 of bAbI) and subsets (ProofWriter-PUD, ProofWriter-PD) and indicates using either the full test set or a subset of it, but it does not provide explicit training, validation, and test split percentages, sample counts, or a detailed splitting methodology.
Hardware Specification	No	The paper mentions "computation resource constraints" and models like "Vicuna-13B" but does not provide specific hardware details such as GPU or CPU models, memory specifications, or detailed computer specifications used for running its experiments.
Software Dependencies	No	The paper mentions using the "Vicuna-13B model" and the "Stanford CoreNLP toolkit" but does not provide specific version numbers for these or other software components used in the experiments.
Experiment Setup	Yes	The standard, CoT and SI frameworks all adopt 5-shot setting, and use the same examples for constructing demonstration prompts. Details about prompt construction for these frameworks are based on the settings described in the appendix of the SI paper.