Handwritten Mathematical Expression Recognition via Attention Aggregation Based Bi-directional Mutual Learning

Authors: Xiaohang Bian, Bo Qin, Xiaozhe Xin, Jianwu Li, Xuefeng Su, Yanfeng Wang

Venue: AAAI 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments demonstrate that our proposed approach achieves the recognition accuracy of 56.85% on CROHME 2014, 52.92% on CROHME 2016, and 53.96% on CROHME 2019 without data augmentation and model ensembling, substantially outperforming the state-of-the-art methods.
Researcher Affiliation | Collaboration | (1) School of Computer Science and Technology, Beijing Institute of Technology, China; (2) AI Interaction Department, Tencent, China
Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks.
Open Source Code | Yes | The source code is available at https://github.com/XH-B/ABM.
Open Datasets | Yes | We train our models based on the CROHME 2014 competition dataset with 111 classes of mathematical symbols and 8836 handwritten mathematical expressions.
Dataset Splits | No | The paper mentions a validation process for early stopping ("training will stop early when the learning rate drops 10 times") but does not give specifics of the validation split (e.g., percentages, sample counts, or how it was derived from the training data).
Hardware Specification | Yes | All the models are trained/tested on a single NVIDIA V100 16GB GPU.
Software Dependencies | No | The paper mentions the Adadelta optimizer but does not specify version numbers for any software dependencies, such as the programming language, deep-learning framework (e.g., PyTorch, TensorFlow), or libraries.
Experiment Setup | Yes | Our proposed method is optimized with the Adadelta optimizer, and its learning rate starts from 1, decaying by a factor of two when the WER does not decrease within 15 epochs. Training stops early when the learning rate drops 10 times. We set the batch size to 16. For the decoder, we set n = 256, d = 512, D = 684, and K = 113 (adding sos and eos to the 111 labels). In the loss function, λ is set to 0.5.
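
For context, the schedule quoted in the Experiment Setup row can be written as a short training-loop skeleton. The following is a minimal sketch assuming PyTorch; `train_one_epoch`, `evaluate_wer`, and the placeholder model are hypothetical stand-ins and not the authors' released code (see https://github.com/XH-B/ABM for that).

```python
import torch

# Sketch of the paper's quoted schedule: Adadelta with lr starting at 1,
# halved when validation WER stagnates for 15 epochs, training stopped
# after the 10th halving. All loop bodies below are placeholders.

PATIENCE = 15        # epochs without WER improvement before halving the lr
MAX_LR_DROPS = 10    # stop once the lr has been halved 10 times
MAX_EPOCHS = 1000    # upper bound; the early-stopping rule usually fires first
BATCH_SIZE = 16      # per the paper's setup
LAMBDA = 0.5         # the paper's loss weight λ (not exercised in this sketch)

model = torch.nn.Linear(8, 4)  # hypothetical stand-in for the ABM network
optimizer = torch.optim.Adadelta(model.parameters(), lr=1.0)  # lr starts from 1


def train_one_epoch(model, optimizer):
    """Placeholder: one pass over the CROHME training set (batch size 16)."""


def evaluate_wer(model):
    """Placeholder: word error rate on a held-out validation set."""
    return 1.0


best_wer, stale_epochs, lr_drops = float("inf"), 0, 0
for epoch in range(MAX_EPOCHS):
    train_one_epoch(model, optimizer)
    wer = evaluate_wer(model)
    if wer < best_wer:
        best_wer, stale_epochs = wer, 0   # WER improved; reset patience
    else:
        stale_epochs += 1
        if stale_epochs >= PATIENCE:      # WER stagnant for 15 epochs
            stale_epochs = 0
            lr_drops += 1
            for group in optimizer.param_groups:
                group["lr"] *= 0.5        # decay the lr by a factor of two
            if lr_drops >= MAX_LR_DROPS:
                break                     # early stop after the 10th drop
```

A plateau-halving rule capped at a fixed number of decays reproduces the behavior the paper describes, but the exact reset and stopping logic in the released code may differ.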