Learning by Fixing: Solving Math Word Problems with Weak Supervision

Authors: Yining Hong, Qing Li, Daniel Ciao, Siyuan Huang, Song-Chun Zhu
Pages: 4959-4967

AAAI 2021

Reproducibility Variable Result LLM Response
Research Type Experimental Experimental results on the Math23K dataset show the proposed LBF framework significantly outperforms reinforcement learning baselines in weakly-supervised learning. Furthermore, it achieves comparable top-1 and much better top-3/5 answer accuracies than fully-supervised methods, demonstrating its strength in producing diverse solutions.
Researcher Affiliation Academia Yining Hong, Qing Li, Daniel Ciao, Siyuan Huang, Song-Chun Zhu; University of California, Los Angeles, USA. yininghong@cs.ucla.edu, {liqing, danielciao, huangsiyuan}@ucla.edu, sczhu@stat.ucla.edu
Pseudocode Yes Algorithm 1 Fixing Mechanism Algorithm 2 Learning-by-Fixing
Open Source Code No The paper does not provide an explicit statement or link to open-source code for the described methodology.
Open Datasets Yes We evaluate our proposed method on the Math23K dataset (Wang, Liu, and Shi 2017). It contains 23,161 math word problems annotated with solution expressions and answers.
Dataset Splits No We do cross-validation following the setting of Xie and Sun (2019). The paper mentions 'cross-validation' but does not provide specific details on the dataset splits (e.g., percentages, sample counts, or number of folds) or provide specific files for the splits. It defers to a prior work for the setting.
Hardware Specification No The paper does not provide specific details about the hardware used to run the experiments.
Software Dependencies No The paper describes algorithms and model architectures (e.g., Seq2seq, GTS, Bi-LSTM, GRU) but does not provide specific software dependencies with version numbers (e.g., Python version, library versions like PyTorch or TensorFlow).
Experiment Setup Yes Denote the size of a solution tree Size(T) as the number of leaf nodes including quantities, constants, and operators. The prior range of Size(T) given the length of the numeric value list len(V_num) is defined as: $\mathrm{Size}(T) \in [\min \mathrm{Size}(T), \max \mathrm{Size}(T)]$, with $\min \mathrm{Size}(T) = a_{\min}\,\mathrm{len}(V_{num}) + b_{\min}$ and $\max \mathrm{Size}(T) = a_{\max}\,\mathrm{len}(V_{num}) + b_{\max}$, where $a_{\min}, b_{\min}, a_{\max}, b_{\max}$ are the hyperparameters. The best range for the solution tree size is $[2n - 1, 2n + 3]$, where $n = \mathrm{len}(V_{num})$.
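
The tree-size prior quoted in the Experiment Setup row is a pair of linear bounds on the number of numeric values in the problem. The following is a minimal sketch of that bound, not the authors' implementation. It reads the reported best range as [2n - 1, 2n + 3], which corresponds to a_min = 2, b_min = -1, a_max = 2, b_max = 3. The function names `tree_size_range` and `size_in_prior_range` are hypothetical and chosen only for illustration.

```python
from typing import Tuple


def tree_size_range(
    num_values: int,
    a_min: int = 2,
    b_min: int = -1,
    a_max: int = 2,
    b_max: int = 3,
) -> Tuple[int, int]:
    """Return (min Size(T), max Size(T)) for a problem with `num_values` numbers.

    The defaults assume the reported best range [2n - 1, 2n + 3];
    the four coefficients are the hyperparameters named in the text.
    """
    if num_values < 0:
        raise ValueError("num_values must be non-negative")
    lower = a_min * num_values + b_min
    upper = a_max * num_values + b_max
    return lower, upper


def size_in_prior_range(tree_size: int, num_values: int, **coeffs: int) -> bool:
    """Check whether a candidate solution tree size falls inside the prior range."""
    lower, upper = tree_size_range(num_values, **coeffs)
    return lower <= tree_size <= upper


if __name__ == "__main__":
    # A problem with three numeric values, under the assumed default coefficients.
    print(tree_size_range(3))
    print(size_in_prior_range(7, 3))
```

With these defaults, a problem containing three numeric values admits tree sizes from 5 to 9 inclusive. Passing other coefficients as keyword arguments lets the same check be reused for a different setting of the hyperparameters.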