Personalized Mathematical Word Problem Generation
Authors: Oleksandr Polozov, Eleanor O'Rourke, Adam M. Smith, Luke Zettlemoyer, Sumit Gulwani, Zoran Popović
IJCAI 2015
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We report an evaluation of generated problems by comparing human judgements with textbook problems (Section 6). Our problems have slightly more artificial language, but they are generally comprehensible, and as solvable as the textbook problems. User Study: We prepared an ontology of 100-200 types, relations, and tropes in three literary settings: Fantasy, Science Fiction, School of Wizardry. This one-time initial setup of the system took about 1-2 person-months. From it, we randomly generated 25 problems in the domains of age, counting, and trading, with the solutions requiring 2-4 primitive arithmetic operations. We sampled the problems with sufficient linguistic variability to evaluate the overall text quality. Although the ASP solving has exponential complexity, every problem was generated in less than 60 s, which is a feasible time limit for realistic problems within our range of interests. We selected 25 textbook problems from the Singapore Math curriculum [Publications, 2009] with the equivalent distribution of complexity (solution lengths), and conducted two studies using Mechanical Turk. Study A assessed language aspects of the problems. It asked the subjects 4 questions (shown in Figure 3) on a forced-choice Likert scale. Study B assessed mathematical applicability of the problems. It asked the subjects to solve a given problem, and measured solving time and correctness. For both studies, each problem was presented to 20 different native English speakers (1000 presentations total). |
| Researcher Affiliation | Collaboration | Oleksandr Polozov University of Washington polozov@cs.washington.edu Eleanor O'Rourke University of Washington eorourke@cs.washington.edu Adam M. Smith University of Washington amsmith@cs.washington.edu Luke Zettlemoyer University of Washington lsz@cs.washington.edu Sumit Gulwani Microsoft Research Redmond sumitg@microsoft.com Zoran Popović University of Washington zoran@cs.washington.edu |
| Pseudocode | No | The paper includes code snippets to illustrate concepts (e.g., ASP syntax for requirements and logic generation), but it does not provide structured pseudocode blocks or formally labeled algorithms. |
| Open Source Code | No | The paper does not provide an explicit statement or a link indicating that the source code for their methodology is publicly available. |
| Open Datasets | Yes | We selected 25 textbook problems from the Singapore Math curriculum [Publications, 2009] with the equivalent distribution of complexity (solution lengths)... |
| Dataset Splits | No | The paper describes the setup of a user study to evaluate the generated problems, but it does not specify dataset splits (e.g., training, validation, test percentages or counts) for a machine learning model's training or validation process. |
| Hardware Specification | No | The paper mentions that "every problem was generated in less than 60 s", but it does not provide any specific details about the hardware (e.g., CPU, GPU models, memory) used for running the experiments or generating problems. |
| Software Dependencies | No | The paper mentions using "answer-set programming (ASP)" and "state-of-the-art ASP solvers", but it does not provide specific version numbers for these or any other software dependencies needed for reproducibility. |
| Experiment Setup | No | The paper describes the setup of the user study (e.g., number of problems, subjects, Likert scale), but it does not provide specific experimental setup details such as hyperparameters, training configurations, or system-level settings for the generative model itself (e.g., learning rates, batch sizes, optimizer details if applicable). |
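The paper encodes problem generation as answer-set programming (ASP) and delegates the search to an ASP solver, which explains both the exponential worst-case complexity and the observed sub-60-second generation times. As a rough intuition only (not the paper's actual encoding, and with all names below hypothetical), the core "generation as logic search" idea can be sketched as a brute-force enumeration of short arithmetic chains subject to a solution constraint:

```python
import itertools
import operator

# Illustrative sketch only: the paper uses an ASP solver over a rich
# ontology of types, relations, and tropes. This toy search merely
# mimics the idea of enumerating candidate "logic" (here, arithmetic
# chains of 2-4 operations) and keeping those satisfying constraints
# (here, a positive integer answer). All names are hypothetical.

OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}

def candidate_problems(values, n_ops=2):
    """Enumerate left-to-right arithmetic chains with n_ops operations
    over the operand pool, yielding (expression, answer) pairs whose
    answer is a positive integer (a stand-in for solution constraints)."""
    for operands in itertools.permutations(values, n_ops + 1):
        for ops in itertools.product(OPS, repeat=n_ops):
            result = operands[0]
            for op, v in zip(ops, operands[1:]):
                result = OPS[op](result, v)
            if result > 0:
                expr = str(operands[0])
                for op, v in zip(ops, operands[1:]):
                    expr += f" {op} {v}"
                yield expr, result

# Example: all 2-operation chains over a small operand pool.
problems = list(candidate_problems([3, 5, 7, 9], n_ops=2))
print(len(problems), problems[0])
```

A real ASP encoding would instead declare the ontology and requirements as logic rules and let a solver such as clingo find satisfying models; the brute-force loop above is only meant to convey why the search space grows exponentially with solution length.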