Recognizing and Verifying Mathematical Equations using Multiplicative Differential Neural Units
Authors: Ankur Mali, Alexander G. Ororbia, Daniel Kifer, C. Lee Giles
AAAI 2021 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We demonstrate the effectiveness of our proposed higher-order, memory-augmented recursive-NN models on two challenging mathematical equation tasks, showing improved extrapolation, stable performance, and faster convergence. Our models achieve a 1.53% average improvement over current state-of-the-art methods in equation verification and achieve a 2.22% Top-1 average accuracy and 2.96% Top-5 average accuracy for equation completion. |
| Researcher Affiliation | Academia | Ankur Mali¹, Alexander G. Ororbia², Dan Kifer¹, and C. Lee Giles¹. ¹The Pennsylvania State University, University Park, PA 16802, USA; ²Rochester Institute of Technology, Rochester, NY 14623, USA |
| Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide an unambiguous statement or a direct link to open-source code for the methodology described. |
| Open Datasets | Yes | For both tasks investigated, we generated 41,894 equations of various depths. To create the training/validation/testing splits for this problem, we generate new mathematical identities by performing local random changes to known identities, starting with the 140 axioms provided by (Arabshahi, Singh, and Anandkumar 2018). These changes resulted in identities of similar or higher complexity (equal or larger depth), which may be correct or incorrect and are valid expressions within the grammar (CFGs). Models were trained on equations of depths 1 through 7 and then tested on equations of depths 8 through 13. The data creation process was identical to the one proposed in (Arabshahi et al. 2019). |
| Dataset Splits | Yes | To create the training/validation/testing splits for this problem, we generate new mathematical identities by performing local random changes to known identities, starting with the 140 axioms provided by (Arabshahi, Singh, and Anandkumar 2018). ... Models were trained on equations of depths 1 through 7 and then tested on equations of depths 8 through 13. ... Table 1 provides the statistics of the generated samples, showing the number of equations available at each parse tree depth. (A depth-based split sketch follows this table.) |
| Hardware Specification | No | The paper does not explicitly describe the hardware used for running its experiments, such as specific GPU or CPU models. |
| Software Dependencies | No | The paper mentions implementation using the 'PyTorch Python framework' but does not specify a version number for PyTorch or any other software dependencies. |
| Experiment Setup | Yes | Models were optimized using back-propagation of errors to calculate parameter gradients and were updated using the Adam (Kingma and Ba 2014) adaptive learning rate, with β1 = 0.9, β2 = 0.999, and by starting its global learning rate at λ = 0.1 and then employing a patience scheduling that divided this rate by half whenever there was no improvement observed on the validation set. We regularized the models with a weight decay of 0.00002. The number of neurons in each model's hidden layer as well as the drop-out rate were tuned using a coarse grid search, i.e., hidden layer size was searched over the array [8, 15, 25, 30, 40, 45, 50, 55, 60, 80, 100] and the dropout rate was searched over the array [0.1, 0.2, 0.3]. Parameter gradients were estimated over mini-batches of size 50 for all experiments. We ran all models using 10 different seeds and report the 10-trial average and standard deviation of the results. Models were trained for a maximum of 500 epochs or until convergence was reached, i.e., early stopping was used. (A hedged configuration sketch follows this table.) |
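The Dataset Splits row above describes an extrapolation split by parse-tree depth: training on equations of depths 1 through 7 and testing on depths 8 through 13. The sketch below shows one way such a split could be reproduced; the `split_by_depth` function, the dictionary field names, and the 10% validation fraction are assumptions for illustration, not details taken from the paper, which does not state how the validation slice is carved out of the shallower depths.

```python
import random

def split_by_depth(equations, train_max_depth=7, val_fraction=0.1, seed=0):
    """Partition equations (dicts with a 'depth' field) into train/val/test.

    Depths 1..train_max_depth feed training and validation; deeper
    equations (depths 8-13 in the paper) are held out for extrapolation
    testing. The validation fraction and field names are assumptions.
    """
    shallow = [eq for eq in equations if eq["depth"] <= train_max_depth]
    deep = [eq for eq in equations if eq["depth"] > train_max_depth]

    rng = random.Random(seed)
    rng.shuffle(shallow)
    n_val = int(len(shallow) * val_fraction)

    return {
        "train": shallow[n_val:],   # depths 1-7
        "val": shallow[:n_val],     # depths 1-7, held out for model selection
        "test": deep,               # depths 8-13, unseen deeper equations
    }
```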
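The Experiment Setup row quotes the optimizer and scheduling hyperparameters. Below is a minimal PyTorch sketch of that configuration; the use of `ReduceLROnPlateau` and the patience value of 5 are assumptions, since the paper only states that the learning rate is halved whenever validation performance stops improving.

```python
import torch

def make_optimizer_and_scheduler(model, patience=5):
    """Adam with betas (0.9, 0.999), initial learning rate 0.1, and
    weight decay 2e-5, as quoted in the Experiment Setup row.
    ReduceLROnPlateau halves the rate when the monitored validation
    metric stops improving (the patience value is an assumption)."""
    optimizer = torch.optim.Adam(
        model.parameters(),
        lr=0.1,
        betas=(0.9, 0.999),
        weight_decay=2e-5,
    )
    scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(
        optimizer, mode="max", factor=0.5, patience=patience
    )
    return optimizer, scheduler
```

In a surrounding training loop, calling `scheduler.step(val_accuracy)` once per epoch would apply the halving after `patience` epochs without improvement; the quoted mini-batch size of 50, the cap of 500 epochs, and early stopping would also live in that loop rather than in this configuration helper.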