A Bottom-Up DAG Structure Extraction Model for Math Word Problems
Authors: Yixuan Cao, Feng Hong, Hongwei Li, Ping Luo
AAAI 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental results on DRAW1K and Math23K datasets demonstrate that our model outperforms state-of-the-art deep learning methods. We also conduct detailed analysis on the results to show the strengths and limitations of our approach. |
| Researcher Affiliation | Academia | 1 Key Lab of Intelligent Information Processing of Chinese Academy of Sciences (CAS), Institute of Computing Technology, CAS, Beijing 100190, China. 2 University of Chinese Academy of Sciences, Beijing 100049, China. 3 Peng Cheng Laboratory, Shenzhen, China. {caoyixuan, hongfeng18g, lihongwei, luop}@ict.ac.cn |
| Pseudocode | Yes | The detailed process is shown in Algorithm 1. |
| Open Source Code | No | The paper does not provide any concrete access information (e.g., specific repository link, explicit code release statement) for the source code of the methodology described. |
| Open Datasets | Yes | We conduct experiments on two datasets, DRAW1K and Math23K: DRAW1K (Upadhyay and Chang 2017) contains 1,000 algebra word problems... Math23K (Wang, Liu, and Shi 2017) is the largest published MWP dataset with 23,161 problems. |
| Dataset Splits | Yes | The result is reported using 5-fold cross-validation. We split the data following Xie and Sun (2019) for Math23K, and Upadhyay and Chang (2017) for DRAW1K. |
| Hardware Specification | Yes | All the experiments are conducted on a NVIDIA 1080Ti GPU. |
| Software Dependencies | Yes | Our model is implemented based on PyTorch (Paszke et al. 2019). ... We use the pre-trained BERT model (Devlin et al. 2018) as the embedding layer. |
| Experiment Setup | Yes | The dimensions of other components are set to 512. We update the last 2 layers of BERT during training, using the Adam optimizer suggested in Devlin et al. (2018) with learning rate 5e-5. The batch size is set to 64. We set the dropout rate to 0.5 to prevent overfitting. |
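
The "Dataset Splits" row reports results under 5-fold cross-validation. The snippet below is a minimal sketch of such a split using scikit-learn's `KFold`; the placeholder problem list and the fixed random seed are assumptions for illustration, not details taken from the paper.

```python
from sklearn.model_selection import KFold

# Placeholder standing in for the 1,000 DRAW1K problems; the actual data
# loading and model training are not part of this summary.
problems = list(range(1000))

kfold = KFold(n_splits=5, shuffle=True, random_state=0)
for fold, (train_idx, test_idx) in enumerate(kfold.split(problems)):
    # Each fold would train on `train_idx` and evaluate on `test_idx`,
    # with the reported metric averaged over the 5 folds.
    print(f"fold {fold}: {len(train_idx)} train / {len(test_idx)} test")
```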
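The "Software Dependencies" and "Experiment Setup" rows together describe the training configuration: a pre-trained BERT embedding layer with only its last 2 layers updated, Adam with learning rate 5e-5, batch size 64, dropout 0.5, and 512-dimensional components. The sketch below shows one way to express that configuration in PyTorch with the Hugging Face `transformers` library; the `DagExtractor` wrapper, the `bert-base-chinese` checkpoint, and the projection head are illustrative assumptions, not the authors' code.

```python
import torch
from torch import nn
from transformers import BertModel


class DagExtractor(nn.Module):
    """Hypothetical wrapper; stands in for the paper's DAG extraction model."""

    def __init__(self, hidden_dim=512, dropout=0.5):
        super().__init__()
        # BERT serves as the embedding layer (the checkpoint choice is assumed).
        self.bert = BertModel.from_pretrained("bert-base-chinese")
        self.dropout = nn.Dropout(dropout)
        self.proj = nn.Linear(self.bert.config.hidden_size, hidden_dim)

        # Freeze BERT except its last 2 encoder layers, per the setup row.
        for p in self.bert.parameters():
            p.requires_grad = False
        for layer in self.bert.encoder.layer[-2:]:
            for p in layer.parameters():
                p.requires_grad = True

    def forward(self, input_ids, attention_mask):
        out = self.bert(input_ids=input_ids, attention_mask=attention_mask)
        return self.proj(self.dropout(out.last_hidden_state))


model = DagExtractor()
# Adam with learning rate 5e-5 on the trainable parameters; the paper
# reports a batch size of 64 and dropout of 0.5.
optimizer = torch.optim.Adam(
    (p for p in model.parameters() if p.requires_grad), lr=5e-5
)
```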