reproducibilityindex.ai

Learning by Fixing: Solving Math Word Problems with Weak Supervision

Authors: Yining Hong, Qing Li, Daniel Ciao, Siyuan Huang, Song-Chun Zhu (pp. 4959-4967)

AAAI 2021

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Experimental results on the Math23K dataset show the proposed LBF framework significantly outperforms reinforcement learning baselines in weakly-supervised learning. Furthermore, it achieves comparable top-1 and much better top-3/5 answer accuracies than fully-supervised methods, demonstrating its strength in producing diverse solutions.
Researcher Affiliation	Academia	Yining Hong, Qing Li, Daniel Ciao, Siyuan Huang, Song-Chun Zhu. University of California, Los Angeles, USA. yininghong@cs.ucla.edu, {liqing, danielciao, huangsiyuan}@ucla.edu, sczhu@stat.ucla.edu
Pseudocode	Yes	Algorithm 1: Fixing Mechanism; Algorithm 2: Learning-by-Fixing
Open Source Code	No	The paper does not provide an explicit statement or link to open-source code for the described methodology.
Open Datasets	Yes	We evaluate our proposed method on the Math23K dataset (Wang, Liu, and Shi 2017). It contains 23,161 math word problems annotated with solution expressions and answers.
Dataset Splits	No	We do cross-validation following the setting of Xie and Sun (2019). The paper mentions 'cross-validation' but does not provide specific details on the dataset splits (e.g., percentages, sample counts, or number of folds) or provide specific files for the splits. It defers to a prior work for the setting.
Hardware Specification	No	The paper does not provide specific details about the hardware used to run the experiments.
Software Dependencies	No	The paper describes algorithms and model architectures (e.g., Seq2seq, GTS, Bi-LSTM, GRU) but does not provide specific software dependencies with version numbers (e.g., Python version, library versions like PyTorch or TensorFlow).
Experiment Setup	Yes	Denote the size of a solution tree, $\mathrm{Size}(T)$, as the number of leaf nodes including quantities, constants, and operators. The prior range of $\mathrm{Size}(T)$ given the length of the numeric value list $\mathrm{len}(V_{num})$ is defined as: $\mathrm{Size}(T) \in [\min \mathrm{Size}(T), \max \mathrm{Size}(T)]$, with $\min \mathrm{Size}(T) = a_{\min}\,\mathrm{len}(V_{num}) + b_{\min}$ and $\max \mathrm{Size}(T) = a_{\max}\,\mathrm{len}(V_{num}) + b_{\max}$, where $a_{\min}, b_{\min}, a_{\max}, b_{\max}$ are the hyperparameters. The best range for the solution tree size is $[2n - 1, 2n + 3]$, where $n = \mathrm{len}(V_{num})$.
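
The tree-size prior quoted in the Experiment Setup row can be sketched in a few lines of Python. This is an illustration only, not the authors' code: the function names `tree_size_range` and `size_in_prior_range` are hypothetical, and the default hyperparameters (a_min = a_max = 2, b_min = -1, b_max = 3) are simply the values that reproduce the reported best range [2n - 1, 2n + 3]. How the paper actually applies the range during search is not described in this entry.

```python
from typing import Tuple


def tree_size_range(
    num_values: int,
    a_min: int = 2,
    b_min: int = -1,
    a_max: int = 2,
    b_max: int = 3,
) -> Tuple[int, int]:
    """Return (min Size(T), max Size(T)) for a problem with `num_values` numbers.

    Defaults are an assumption: they are chosen so that the result equals the
    reported best range [2n - 1, 2n + 3], with n = len(V_num).
    """
    lower = a_min * num_values + b_min  # min Size(T) = a_min * len(V_num) + b_min
    upper = a_max * num_values + b_max  # max Size(T) = a_max * len(V_num) + b_max
    return lower, upper


def size_in_prior_range(tree_size: int, num_values: int, **hyperparams: int) -> bool:
    """Check whether a candidate tree size falls inside the prior range (inclusive)."""
    lower, upper = tree_size_range(num_values, **hyperparams)
    return lower <= tree_size <= upper


if __name__ == "__main__":
    # A problem containing three numeric values.
    print(tree_size_range(3))
    print(size_in_prior_range(7, 3))
```

Exposing the four hyperparameters as keyword arguments keeps the linear form of the prior visible, so other ranges can be tried without changing the function body.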