Automatic Discovery of Cognitive Skills to Improve the Prediction of Student Learning
Authors: Robert Lindsey, Mohammad Khajah, Michael Mozer
Venue: NeurIPS 2014
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We test our technique on datasets from five different intelligent tutoring systems designed for students ranging in age from middle school through college. We obtain two surprising results. First, in three of the five datasets, the skills inferred by our technique support significantly improved predictions of student performance over the expert-provided skills. |
| Researcher Affiliation | Academia | Robert V. Lindsey, Mohammad Khajah, Michael C. Mozer Department of Computer Science and Institute of Cognitive Science University of Colorado, Boulder |
| Pseudocode | No | We used Algorithm 8 from [16], which effectively produces a Monte Carlo approximation to the intractable marginal data likelihood, integrating out over the BKT parameters that could be drawn for the new table. |
| Open Source Code | No | No explicit statement or link providing open-source code for the methodology described in this paper. |
| Open Datasets | Yes | We ran simulations on five student performance datasets (Table 1). The datasets varied in the number of students, exercises, and expert skill labels; the students in the datasets ranged in age from middle school to college. Each dataset consists of student identifiers, exercise identifiers, trial numbers, and binary indicators of response correctness from students undergoing variable-length sequences of exercises over time. For the Data Shop datasets, exercises were identified by concatenating what they call the problem hierarchy, problem name, and the step name columns. ... PSLC Data Shop [12] ... [15] Spanish vocabulary |
| Dataset Splits | Yes | For each model, we ran ten replications of five-fold cross validation on each dataset. In each replication-fold, we collected posterior samples using our MCMC algorithm given the data recorded for students in four of the five subsets. We then used the samples to predict the response sequences (correct vs. incorrect) of the remaining students. |
| Hardware Specification | No | No specific hardware details (e.g., GPU/CPU models, memory) are provided. |
| Software Dependencies | No | The paper mentions algorithmic components (e.g., its MCMC sampler and slice sampling updates) but does not name specific software packages, libraries, or version numbers. |
| Experiment Setup | Yes | For all simulations, we run the sampler for 200 iterations and discard the first 100 as the burn-in period. The seating arrangement is initialized to the expert-provided skills; all other parameters are initialized by sampling from the generative model. ... we apply five axis-aligned slice sampling updates to each table’s BKT parameters and to the hyperparameters β and α [17]. ... For each skill, we generated sequences of student correct/incorrect responses via BKT, with parameters sampled from plausible distributions: λ_L ∼ Uniform(0, 1), λ_M ∼ Beta(10, 30), λ_G ∼ Beta(1, 9), and λ_S ∼ Beta(1, 9). |
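
The Pseudocode row notes that the paper resamples skill assignments with Algorithm 8 of Neal [16], using auxiliary tables to approximate the marginal likelihood of a new table. As a rough, hedged illustration only, the sketch below implements one reseating sweep of that scheme for a generic Chinese-restaurant-process mixture; a toy Gaussian likelihood and prior stand in for the BKT data likelihood and BKT parameter prior, and all function and variable names are ours, not the authors'.

```python
# Rough sketch of one reseating sweep in the spirit of Neal's (2000) Algorithm 8.
# m_aux auxiliary tables with parameters drawn from the prior give the Monte Carlo
# approximation to the marginal likelihood of seating an item at a brand-new table.
# In the paper's full sampler, each table's (BKT) parameters would also be
# resampled, e.g., by slice sampling; that step is omitted here.
import numpy as np

def algorithm8_sweep(x, z, thetas, alpha, m_aux, rng):
    """x: data values; z: table assignment per item; thetas: {table id: parameter}."""
    loglik = lambda xi, theta: -0.5 * (xi - theta) ** 2   # toy stand-in likelihood
    draw_prior = lambda: rng.normal(0.0, 3.0)             # toy stand-in prior
    for i in range(len(x)):
        old_t = int(z[i])
        z[i] = -1                                          # unseat item i
        singleton = not np.any(z == old_t)
        occupied = sorted(t for t in set(int(v) for v in z) if t >= 0)
        counts = {t: int(np.sum(z == t)) for t in occupied}
        # Auxiliary parameters: reuse a vacated singleton's parameter, fill the
        # rest with fresh draws from the prior (the Algorithm 8 trick).
        aux = [thetas[old_t]] if singleton else []
        aux += [draw_prior() for _ in range(m_aux - len(aux))]
        logp = [np.log(counts[t]) + loglik(x[i], thetas[t]) for t in occupied]
        logp += [np.log(alpha / m_aux) + loglik(x[i], th) for th in aux]
        w = np.exp(np.array(logp) - np.max(logp))
        choice = int(rng.choice(len(w), p=w / w.sum()))
        if choice < len(occupied):
            z[i] = occupied[choice]                        # join an existing table
        else:
            new_t = max(thetas) + 1                        # open a new table
            thetas[new_t] = aux[choice - len(occupied)]
            z[i] = new_t
    for t in list(thetas):                                 # drop now-empty tables
        if not np.any(z == t):
            del thetas[t]
    return z, thetas

# Tiny demo on synthetic two-cluster data.
rng = np.random.default_rng(0)
x = np.concatenate([rng.normal(-2, 0.5, 15), rng.normal(2, 0.5, 15)])
z = np.zeros(len(x), dtype=int)
thetas = {0: 0.0}
for _ in range(25):
    z, thetas = algorithm8_sweep(x, z, thetas, alpha=1.0, m_aux=3, rng=rng)
print("tables:", sorted(thetas))
```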
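
The Dataset Splits row describes ten replications of five-fold cross-validation with folds defined over students, so that held-out students' response sequences are predicted. The snippet below is a minimal sketch of that protocol under our own assumptions: a global-mean predictor stands in for the paper's MCMC-trained model, and the data layout (a dict of per-student binary response arrays) is invented for illustration.

```python
# Minimal sketch of ten replications of five-fold cross-validation over students.
import numpy as np

def cross_validate(responses_by_student, n_replications=10, n_folds=5, seed=0):
    """responses_by_student: {student id: 1-D array of 0/1 responses over trials}."""
    rng = np.random.default_rng(seed)
    students = np.array(sorted(responses_by_student))
    losses = []
    for _ in range(n_replications):
        shuffled = rng.permutation(students)
        for fold in np.array_split(shuffled, n_folds):     # folds partition students
            held_out = set(fold.tolist())
            train = [s for s in shuffled.tolist() if s not in held_out]
            # Stand-in "model": overall P(correct) among training students.
            # The paper instead collects MCMC posterior samples on these four folds.
            p = np.clip(np.mean(np.concatenate([responses_by_student[s] for s in train])),
                        1e-6, 1 - 1e-6)
            y = np.concatenate([responses_by_student[s] for s in held_out])
            losses.append(-np.mean(y * np.log(p) + (1 - y) * np.log(1 - p)))
    return float(np.mean(losses))

# Demo on synthetic data: 20 students with variable-length response sequences.
rng = np.random.default_rng(1)
data = {s: rng.integers(0, 2, size=int(rng.integers(5, 30))) for s in range(20)}
print(cross_validate(data))
```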
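
The Experiment Setup row also reports that synthetic correct/incorrect sequences were generated via Bayesian Knowledge Tracing with parameters drawn from the listed priors. The sketch below shows one way to do this; the mapping of λ_L, λ_M, λ_G, and λ_S to BKT roles (initial knowledge, per-trial learning, guess, and slip) is our assumption and should be checked against the paper's exact parameterization.

```python
# Sketch: sample BKT parameters from the quoted priors and simulate one student's
# binary response sequence. The symbol-to-role mapping below is an assumption.
import numpy as np

def sample_bkt_params(rng):
    return {
        "lambda_L": rng.uniform(0.0, 1.0),   # assumed: P(skill known at trial 1)
        "lambda_M": rng.beta(10, 30),        # assumed: P(learn skill on each trial)
        "lambda_G": rng.beta(1, 9),          # P(correct | skill not known), "guess"
        "lambda_S": rng.beta(1, 9),          # P(incorrect | skill known), "slip"
    }

def simulate_student(params, n_trials, rng):
    """Generate a 0/1 response sequence for one student practicing one skill."""
    known = rng.random() < params["lambda_L"]
    responses = np.empty(n_trials, dtype=int)
    for t in range(n_trials):
        p_correct = 1 - params["lambda_S"] if known else params["lambda_G"]
        responses[t] = rng.random() < p_correct
        if not known:                         # latent state can only switch to "known"
            known = rng.random() < params["lambda_M"]
    return responses

rng = np.random.default_rng(0)
params = sample_bkt_params(rng)
print(params)
print(simulate_student(params, n_trials=15, rng=rng))
```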