Template-Based Math Word Problem Solvers with Recursive Neural Networks
Authors: Lei Wang, Dongxiang Zhang, Jipeng Zhang, Xing Xu, Lianli Gao, Bing Tian Dai, Heng Tao Shen
AAAI 2019, pp. 7144-7151 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | The experimental results clearly establish the superiority of our new framework as we improve the accuracy by a wide margin in two of the largest datasets, i.e., from 58.1% to 66.9% in Math23K and from 62.8% to 66.8% in MAWPS. We conduct experiments on two of the largest datasets for arithmetic word problems, in which Math23K contains 23,164 math problems and MAWPS contains 2,373 problems. |
| Researcher Affiliation | Academia | Lei Wang,¹ Dongxiang Zhang,¹,² Jipeng Zhang,¹ Xing Xu,¹,² Lianli Gao,¹ Bing Tian Dai,³ Heng Tao Shen¹ (¹Center for Future Media and School of Computer Science & Engineering, UESTC; ²Afanti Research; ³School of Information Systems, Singapore Management University). {demolwang,zhangjipeng20}@std.uestc.edu.cn, {zhangdo,xing.xu,lianli.gao}@uestc.edu.cn, btdai@smu.edu.sg, shenhengtao@hotmail.com |
| Pseudocode | No | The paper describes the model architecture and processes but does not include any pseudocode or clearly labeled algorithm blocks. |
| Open Source Code | Yes | 5. We release the source code of our model in Github: https://github.com/uestc-db/T-RNN |
| Open Datasets | Yes | MAWPS (Koncel-Kedziorski et al. 2016) is another testbed for arithmetic word problems with one unknown variable in the question... combines the published word problem datasets used in (Hosseini et al. 2014; Kushman et al. 2014; Koncel-Kedziorski et al. 2015; Roy and Roth 2015). Math23K (Wang, Liu, and Shi 2017). The dataset contains Chinese math word problems for elementary school students and is crawled from multiple online education websites. |
| Dataset Splits | Yes | Since Math23K has split the problems into training and test datasets when it was published, we simply follow its original setup. For MAWPS, we use 5-fold cross validation. |
| Hardware Specification | Yes | All the experiments were conducted on the same server, with 4 CPU cores (Intel Xeon CPU E5-2650 with 2.30GHz) and 32GB memory. |
| Software Dependencies | No | The paper describes the neural network architectures (e.g., Bi-LSTM, LSTM) and optimizers (Adam, SGD) used, along with their parameters (e.g., learning rate, hidden units), but it does not specify the software libraries or frameworks (like TensorFlow, PyTorch) with version numbers that were used for implementation. |
| Experiment Setup | Yes | In the template prediction module, we use a pre-trained word embedding with 128 units, a two-layer Bi-LSTM with 256 hidden units as encoder, and a two-layer LSTM with 512 hidden units as decoder. As to the optimizer, we use Adam with learning rate set to 1e-3, β1 = 0.9 and β2 = 0.99. In the answer generation module, we use an embedding layer with 100 units and a two-layer Bi-LSTM with 160 hidden units. SGD with learning rate 0.01 and momentum factor 0.9 is used to optimize this module. In both components, the number of epochs, mini-batch size and dropout rate are set to 100, 32 and 0.5, respectively. |
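
The Dataset Splits row reports that Math23K keeps the train/test split published with the dataset, while MAWPS is evaluated with 5-fold cross validation. Below is a minimal sketch of that protocol using scikit-learn; the paper does not name a library, and the dummy problem list, shuffling, and seed are illustrative, not taken from the paper or its released code:

```python
from sklearn.model_selection import KFold

# Stand-ins for the 2,373 MAWPS problems; a real run would parse the released data.
# Math23K is NOT cross-validated: the paper keeps its original published
# train/test split unchanged.
mawps_problems = [f"problem_{i}" for i in range(2373)]

# Shuffling and the random seed are assumptions; the paper does not specify them.
kfold = KFold(n_splits=5, shuffle=True, random_state=0)
for fold, (train_idx, test_idx) in enumerate(kfold.split(mawps_problems)):
    train = [mawps_problems[i] for i in train_idx]
    test = [mawps_problems[i] for i in test_idx]
    # Train and evaluate the model on this fold; the reported MAWPS accuracy
    # would be the average over the five folds.
    print(f"fold {fold}: {len(train)} train / {len(test)} test")
```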
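
The Experiment Setup row pins down enough hyperparameters to sketch the shape of both modules. Since the Software Dependencies row notes that no framework or versions are given, the PyTorch rendering below is an assumption, and the vocabulary size and module wiring are placeholders; the released code at https://github.com/uestc-db/T-RNN is the authoritative reference:

```python
import torch
import torch.nn as nn

VOCAB_SIZE = 4000  # placeholder; the paper does not report vocabulary sizes

class TemplatePredictor(nn.Module):
    """Seq2seq template prediction module with the reported layer sizes."""
    def __init__(self):
        super().__init__()
        # The paper uses a pre-trained 128-unit word embedding; randomly initialized here.
        self.embed = nn.Embedding(VOCAB_SIZE, 128)
        # Two-layer Bi-LSTM encoder, 256 hidden units (512 per step across directions).
        self.encoder = nn.LSTM(128, 256, num_layers=2, bidirectional=True,
                               dropout=0.5, batch_first=True)
        # Two-layer LSTM decoder, 512 hidden units.
        self.decoder = nn.LSTM(128, 512, num_layers=2, dropout=0.5, batch_first=True)

class AnswerGenerator(nn.Module):
    """Encoder of the answer generation module with the reported layer sizes."""
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(VOCAB_SIZE, 100)
        self.encoder = nn.LSTM(100, 160, num_layers=2, bidirectional=True,
                               dropout=0.5, batch_first=True)

template_module, answer_module = TemplatePredictor(), AnswerGenerator()
# Optimizers as reported: Adam (lr 1e-3, betas 0.9/0.99) and SGD (lr 0.01, momentum 0.9).
opt_template = torch.optim.Adam(template_module.parameters(), lr=1e-3, betas=(0.9, 0.99))
opt_answer = torch.optim.SGD(answer_module.parameters(), lr=0.01, momentum=0.9)
# Both modules: 100 epochs, mini-batch size 32, dropout 0.5.
EPOCHS, BATCH_SIZE = 100, 32
```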