Learning to Solve Geometry Problems via Simulating Human Dual-Reasoning Process

Authors: Tong Xiao, Jiayu Liu, Zhenya Huang, Jinze Wu, Jing Sha, Shijin Wang, Enhong Chen

IJCAI 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We conduct extensive experiments on two benchmark datasets, GeoQA and GeoQA+. The results demonstrate the superiority of DualGeoSolver in both solving accuracy and robustness, owing to explicitly modeling the human reasoning process and knowledge application.
Researcher Affiliation | Collaboration | Tong Xiao1, Jiayu Liu1, Zhenya Huang1,2, Jinze Wu4, Jing Sha4, Shijin Wang3,4, Enhong Chen1,3; 1University of Science and Technology of China; 2Institute of Artificial Intelligence, Hefei Comprehensive National Science Center; 3State Key Laboratory of Cognitive Intelligence; 4iFLYTEK AI Research
Pseudocode | No | The paper describes the methodology using prose and mathematical equations but does not include structured pseudocode or algorithm blocks.
Open Source Code | Yes | The source code and datasets are available at https://github.com/tongxiao2002/DualGeoSolver
Open Datasets | Yes | We conduct experiments on two publicly available datasets: GeoQA and GeoQA+. The source code and datasets are available at https://github.com/tongxiao2002/DualGeoSolver
Dataset Splits | No | The paper mentions 'training' but does not explicitly provide information on train/validation/test dataset splits, such as percentages, sample counts, or references to predefined splits.
Hardware Specification | Yes | All experiments were conducted on an NVIDIA A6000 GPU, with PyTorch version 1.13.1.
Software Dependencies | Yes | All experiments were conducted on an NVIDIA A6000 GPU, with PyTorch version 1.13.1.
Experiment Setup | Yes | During training, we keep the parameters of the diagram encoder unchanged, and we set the learning rate of RoBERTa to 2e-5, the learning rate of the multimodal fusion module and the Goal Generation Module (GGM) to 1e-5, and the learning rate of the other modules to 1e-3. We use Adam as the optimizer and set the batch size to 32 during training. The total number of training epochs is set to 100.
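
The per-module learning rates described in the Experiment Setup row map naturally onto PyTorch optimizer parameter groups. Below is a minimal sketch of that configuration, not the authors' released code: the submodule attribute names (diagram_encoder, roberta_encoder, fusion, ggm, decoder) are hypothetical stand-ins for the corresponding components of DualGeoSolver.

```python
# Hedged sketch of the reported training configuration (PyTorch 1.13.1),
# assuming a model object exposing the named submodules below.
import torch
from torch import nn, optim


def build_optimizer(model: nn.Module) -> optim.Adam:
    # Keep the diagram encoder's parameters unchanged during training.
    for p in model.diagram_encoder.parameters():
        p.requires_grad = False

    # Per-module learning rates as reported in the paper.
    param_groups = [
        {"params": model.roberta_encoder.parameters(), "lr": 2e-5},  # RoBERTa text encoder
        {"params": model.fusion.parameters(), "lr": 1e-5},           # multimodal fusion module
        {"params": model.ggm.parameters(), "lr": 1e-5},              # Goal Generation Module
        {"params": model.decoder.parameters(), "lr": 1e-3},          # remaining modules
    ]
    return optim.Adam(param_groups)


# Remaining reported settings: batch size 32, 100 training epochs.
BATCH_SIZE = 32
NUM_EPOCHS = 100
```

Freezing the diagram encoder and grouping parameters this way reproduces the stated setup without needing any learning-rate scheduler; none is mentioned in the quoted passage.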