Generalizing Math Word Problem Solvers via Solution Diversification
Authors: Zhenwen Liang, Jipeng Zhang, Lei Wang, Yan Wang, Jie Shao, Xiangliang Zhang
AAAI 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We conduct extensive experiments on the benchmark dataset Math23k and on a new dataset named Weak12k, and show that our framework improves the performance of various MWP solvers under different settings by generating correct and diverse solutions. |
| Researcher Affiliation | Collaboration | 1 University of Notre Dame; 2 Hong Kong University of Science and Technology; 3 Singapore Management University; 4 Tencent AI Lab; 5 University of Electronic Science and Technology of China |
| Pseudocode | Yes | Algorithm 1 (Weak Data Augmentation) and Algorithm 2 are given in the paper (a generic sketch of answer-driven augmentation appears after the table). |
| Open Source Code | Yes | The code and data can be found at https://github.com/LZhenwen/Solution_Diversity. |
| Open Datasets | Yes | We curate and release a novel math word problem (MWP) dataset called Weak12k with 12,117 MWPs. This dataset will be released to the public upon paper acceptance to facilitate future studies like semi-weakly supervised solver development. |
| Dataset Splits | Yes | We report the performance of 5-fold cross-validation on it following (Xie and Sun 2019) and (Hong et al. 2021a) (see the cross-validation sketch after the table). |
| Hardware Specification | Yes | We use PyTorch to construct the code and an NVIDIA RTX 2080Ti graphics card to train the solvers. |
| Software Dependencies | No | The paper states 'We use Pytorch to construct the code' but does not provide version numbers for PyTorch or any other software dependency. |
| Experiment Setup | Yes | The dimension of the embedding matrix is 128, and the dimension of all hidden features is 512. We train the model for 200 epochs with the Adam optimizer (Kingma and Ba 2014) and a learning rate of 0.001, which is halved every 30 epochs. For the first 100 epochs we use a_i = s_i, and for the remaining epochs a_i = (s_i + tw_{s_i}) / 2. The solution buffer is updated every 5 epochs of parameter learning, leaving the model sufficient training time between buffer updates (see the training-schedule sketch after the table). |
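
The paper's pseudocode is not reproduced here, but weak data augmentation for answer-only MWPs is commonly an answer-matching expression search. Below is a hedged sketch of that generic technique, not the paper's Algorithm 1: it enumerates binary arithmetic expressions over a problem's quantities and keeps those that evaluate to the gold answer. The function name `candidate_solutions` and the tolerance `eps` are illustrative assumptions.

```python
# Generic answer-driven weak data augmentation (illustrative, not the
# paper's Algorithm 1): search small arithmetic expressions over a
# problem's quantities and keep those matching the gold answer.
import itertools
import operator

OPS = {"+": operator.add, "-": operator.sub,
       "*": operator.mul, "/": operator.truediv}

def candidate_solutions(quantities, answer, eps=1e-4):
    """Return binary expressions over `quantities` that evaluate to `answer`."""
    found = []
    for (a, b), (sym, op) in itertools.product(
            itertools.permutations(quantities, 2), OPS.items()):
        try:
            if abs(op(a, b) - answer) < eps:
                found.append(f"{a} {sym} {b}")
        except ZeroDivisionError:
            continue
    return found

# A problem with quantities [12, 3] and answer 4 yields one solution.
print(candidate_solutions([12, 3], 4))  # ['12 / 3']
```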
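
The 5-fold cross-validation protocol on Math23k can be sketched as follows, assuming scikit-learn's `KFold`; `load_math23k`, `train_solver`, and `evaluate` are hypothetical stand-ins for the paper's pipeline, and the shuffle seed is an assumption.

```python
# Sketch of 5-fold cross-validation on Math23k, following the protocol of
# (Xie and Sun 2019). The loader, trainer, and evaluator are hypothetical.
from sklearn.model_selection import KFold

problems = load_math23k("math23k.json")  # hypothetical loader: list of MWPs

kfold = KFold(n_splits=5, shuffle=True, random_state=42)  # seed assumed
fold_accuracies = []
for train_idx, test_idx in kfold.split(problems):
    train_set = [problems[i] for i in train_idx]
    test_set = [problems[i] for i in test_idx]
    solver = train_solver(train_set)                # hypothetical trainer
    fold_accuracies.append(evaluate(solver, test_set))

print(sum(fold_accuracies) / len(fold_accuracies))  # mean answer accuracy
```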
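
The training schedule from the experiment setup maps directly onto standard PyTorch components. The sketch below assumes a generic solver; `MWPSolver`, `train_one_epoch`, and `update_solution_buffer` are hypothetical stand-ins, while the dimensions, optimizer, learning-rate schedule, target blending, and buffer-update interval come from the paper.

```python
# Sketch of the reported training schedule with a generic PyTorch solver.
import torch

def blended_target(s, tw, epoch):
    """a_i = s_i for the first 100 epochs, then (s_i + tw_{s_i}) / 2."""
    return s if epoch < 100 else (s + tw) / 2

model = MWPSolver(embedding_dim=128, hidden_dim=512)  # dims from the paper
optimizer = torch.optim.Adam(model.parameters(), lr=0.001)
# "Halved every 30 epochs" corresponds to StepLR with gamma=0.5.
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=30, gamma=0.5)

for epoch in range(200):
    train_one_epoch(model, optimizer, epoch, target_fn=blended_target)
    scheduler.step()
    # Refresh the solution buffer only every 5 epochs, so the model has
    # sufficient training time between refreshes.
    if (epoch + 1) % 5 == 0:
        update_solution_buffer(model)
```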