A Goal-Driven Tree-Structured Neural Model for Math Word Problems
Authors: Zhipeng Xie, Shichao Sun
IJCAI 2019 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental results on the dataset Math23K have shown that our tree-structured model outperforms significantly several state-of-the-art models. |
| Researcher Affiliation | Academia | Shanghai Key Laboratory of Data Science, Fudan University; School of Computer Science, Fudan University; {xiezp, scsun17}@fudan.edu.cn |
| Pseudocode | No | The paper describes its methods using mathematical equations and textual explanations, but it does not include any structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not contain an explicit statement about making the source code available or provide a link to a code repository for the methodology described. |
| Open Datasets | Yes | The dataset Math23K contains 23,161 math word problems annotated with solution expressions and answers. To the best of our knowledge, Math23K is the largest among the datasets of math word problems, such as Alg514 [Kushman et al., 2014] with 514 problems and AllArith [Roy and Roth, 2017] with 831 problems. Available from http://ai.tencent.com/ailab/Deep Neural Solver for Math Word Problems.html |
| Dataset Splits | Yes | The answer accuracies are evaluated on the Math23K dataset via 5-fold cross-validation. The test set contains 4,632 randomly sampled instances (20% of the whole dataset). The GTS model, together with the Seq2Seq baseline, is trained on different numbers of the remaining training instances ranging from 3,000 to 18,000. We use the same 5-fold cross-validation as in [Roy and Roth, 2017]. (A minimal fold-split sketch follows the table.) |
| Hardware Specification | Yes | Our model is implemented using PyTorch on an Ubuntu system with a GTX 1080Ti. |
| Software Dependencies | No | The paper mentions 'PyTorch' and an 'Ubuntu system' but does not provide specific version numbers for these or other software dependencies. |
| Experiment Setup | Yes | The dimensionality of the word embedding layer is set to 128, and the dimensionalities of all hidden states for the other layers are set to 512. Our model is trained for 80 epochs by the Adam optimization algorithm [Kingma and Ba, 2014] where the mini-batch size is set to 64. The initial value of the learning rate is set to 0.001, and the learning rate is halved every 20 epochs. In addition, we set the dropout probability [Hinton et al., 2012] to 0.5 and the weight decay to 1e-5 to prevent overfitting. Last but not least, we set the beam size to 5 in beam search to generate expression trees. |
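
As a concrete reading of the Dataset Splits row, the sketch below builds 5-fold cross-validation splits over the 23,161 Math23K problems, with each test fold holding roughly 20% of the data (about 4,632 problems). The paper does not release code, so the shuffling, the seed, and the plain index lists here are assumptions for illustration only.

```python
# Minimal 5-fold cross-validation sketch for Math23K (hypothetical;
# only the fold count and the dataset size come from the paper).
import random

def five_fold_splits(n_problems=23161, seed=0):
    """Yield (train_idx, test_idx) pairs; each test fold holds ~20% of the data."""
    indices = list(range(n_problems))
    random.Random(seed).shuffle(indices)          # assumed random shuffle
    fold = n_problems // 5                        # ~4,632 problems per fold
    for k in range(5):
        test_idx = indices[k * fold:(k + 1) * fold] if k < 4 else indices[4 * fold:]
        test_set = set(test_idx)
        train_idx = [i for i in indices if i not in test_set]
        yield train_idx, test_idx

for train_idx, test_idx in five_fold_splits():
    print(len(train_idx), len(test_idx))          # 18529 4632 for the first four folds
```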
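
The Experiment Setup row can likewise be read as a training configuration. The sketch below wires the reported hyperparameters (embedding size 128, hidden size 512, Adam with initial learning rate 0.001 halved every 20 epochs, mini-batch size 64, dropout 0.5, weight decay 1e-5, 80 epochs, beam size 5) into a PyTorch training loop. `ToyModel`, `VOCAB_SIZE`, and the random batches are placeholders standing in for the GTS solver and Math23K data, not the authors' implementation.

```python
# Hyperparameter sketch of the reported training setup (PyTorch assumed;
# ToyModel and the random batches are placeholders, not the GTS architecture).
import torch
from torch import nn, optim

EMBED_DIM, HIDDEN_DIM = 128, 512     # dimensionalities reported in the paper
DROPOUT, WEIGHT_DECAY = 0.5, 1e-5
BATCH_SIZE, EPOCHS = 64, 80
BEAM_SIZE = 5                        # beam width for expression-tree decoding
VOCAB_SIZE = 4000                    # placeholder vocabulary size

class ToyModel(nn.Module):
    """Stand-in network using the paper's layer sizes, not the GTS model."""
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(VOCAB_SIZE, EMBED_DIM)
        self.encoder = nn.GRU(EMBED_DIM, HIDDEN_DIM, batch_first=True)
        self.drop = nn.Dropout(DROPOUT)
        self.out = nn.Linear(HIDDEN_DIM, VOCAB_SIZE)

    def forward(self, tokens):
        h, _ = self.encoder(self.embed(tokens))
        return self.out(self.drop(h))

model = ToyModel()
optimizer = optim.Adam(model.parameters(), lr=1e-3, weight_decay=WEIGHT_DECAY)
scheduler = optim.lr_scheduler.StepLR(optimizer, step_size=20, gamma=0.5)   # halve LR every 20 epochs

for epoch in range(EPOCHS):
    tokens = torch.randint(0, VOCAB_SIZE, (BATCH_SIZE, 30))        # dummy mini-batch
    logits = model(tokens)
    loss = nn.functional.cross_entropy(logits.view(-1, VOCAB_SIZE), tokens.view(-1))
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    scheduler.step()                                               # once per epoch
```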