Predictive, scalable and interpretable knowledge tracing on structured domains
Authors: Hanqi Zhou, Robert Bamler, Charley M Wu, Álvaro Tejero-Cantero
ICLR 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Evaluated on three datasets from online learning platforms, PSI-KT achieves superior multi-step predictive accuracy and scalable inference in continual-learning settings, all while providing interpretable representations of learner-specific traits and the prerequisite structure of knowledge that causally supports learning. |
| Researcher Affiliation | Academia | Hanqi Zhou (1,2,4), Robert Bamler (1,3), Charley M. Wu (1,2,3), & Álvaro Tejero-Cantero (1,2). 1: University of Tübingen, 2: Cluster of Excellence Machine Learning, 3: Tübingen AI Center, 4: IMPRS-IS. {hanqi.zhou,robert.bamler,charley.wu,alvaro.tejero}@uni-tuebingen.de |
| Pseudocode | No | The paper provides detailed mathematical formulations and equations for its model and inference method (e.g., Eqs. 1-10), along with a graphical overview (Fig. 7 in Appendix A.4). However, it does not include a distinct block or figure explicitly labeled 'Pseudocode' or 'Algorithm'. |
| Open Source Code | Yes | Code at github.com/mlcolab/psi-kt |
| Open Datasets | Yes | We use Assistments 2012 and 2017 datasets (Assist12 and Assist17) and Junyi's 2015 dataset (Junyi15; Chang et al., 2015), which in addition to interaction data, provides human-annotated KC relations (see Table 1 and Appendix A.3.2 for details). Assistments data: https://sites.google.com/site/assistmentsdata ; Junyi15 data: https://pslcdatashop.web.cmu.edu/DatasetInfo?datasetId=1198 |
| Dataset Splits | Yes | In our evaluations, we mainly focus on prediction and generalization when training on 10 interactions from up to 1000 learners. The between-learner generalization accuracy of the models above, when tested on 100 out-of-sample learners, is shown in Table 2, where fine-tuning indicates that parameters were updated using (10-point) learning histories from the unseen learners. (An illustrative split sketch follows this table.) |
| Hardware Specification | No | The paper does not provide any specific details about the hardware used for running experiments, such as GPU models, CPU specifications, or memory configurations. It mentions that the model 'scales well' but provides no technical specifications for the computational environment. |
| Software Dependencies | No | The paper discusses the use of 'deep learning methods', 'recurrent neural networks', 'LSTM networks', and 'graph neural networks'. However, it does not specify any particular software versions for frameworks (e.g., PyTorch, TensorFlow), programming languages (e.g., Python), or libraries used in the implementation of the model or experiments. |
| Experiment Setup | Yes | In our evaluations, we mainly focus on prediction and generalization when training on 10 interactions from up to 1000 learners. Good KT performance with little data is key in practical ITS to minimize the number of learners on an experimental treatment (principle of equipoise, similar to medical research; Burkholder, 2021), to mitigate the cold-start problem, and to extend the usefulness of the model to classroom-size groups. To provide ITS with a basis for adaptive guidance and long-term learner assessment, we always predict the 10 next interactions. ... Each model is initially trained on 10 interactions from 100 learners. We then incrementally provide one data point from each learner, and evaluate the training costs and prediction accuracy. (An illustrative continual-evaluation sketch follows this table.) |
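The Dataset Splits row quotes the protocol of training on 10-interaction histories from up to 1000 learners and testing between-learner generalization on 100 held-out learners, optionally fine-tuning on their 10-point histories. As a rough illustration only, the sketch below shows one way such a split could be constructed; the data layout, the `split_learners` helper, and all parameter names are assumptions, not the released pipeline at github.com/mlcolab/psi-kt.

```python
import random

def split_learners(histories, n_train=1000, n_test=100, n_obs=10, n_pred=10, seed=0):
    """Hypothetical split: `histories` maps learner_id -> time-ordered list of
    (kc_id, correct, timestamp) interactions."""
    rng = random.Random(seed)
    eligible = [l for l, h in histories.items() if len(h) >= n_obs + n_pred]
    rng.shuffle(eligible)
    train_ids = eligible[:n_train]
    test_ids = eligible[n_train:n_train + n_test]

    train = {l: histories[l][:n_obs] for l in train_ids}        # 10 observed interactions per learner
    test_obs = {l: histories[l][:n_obs] for l in test_ids}      # 10-point histories for optional fine-tuning
    test_target = {l: histories[l][n_obs:n_obs + n_pred] for l in test_ids}  # the next 10 interactions to predict
    return train, test_obs, test_target
```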
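The Experiment Setup row quotes the continual-learning protocol: initial training on 10 interactions from 100 learners, then one additional data point per learner at a time, always predicting the next 10 interactions. The sketch below outlines that evaluation loop under assumed interfaces; `model.fit`, `model.predict`, and `multi_step_accuracy` are hypothetical placeholders, not the paper's actual code.

```python
def multi_step_accuracy(preds, targets):
    """Fraction of correct binary predictions over the horizon.
    `preds[l]` are probabilities of a correct response; `targets[l]` are
    (kc_id, correct, timestamp) tuples."""
    pairs = [(p, t[1]) for l in targets for p, t in zip(preds[l], targets[l])]
    return sum(int((p >= 0.5) == bool(c)) for p, c in pairs) / max(len(pairs), 1)

def continual_evaluation(model, histories, n_learners=100, n_init=10, n_pred=10, n_steps=20):
    """Hypothetical continual-learning protocol: start from `n_init` interactions
    per learner, reveal one more interaction per learner at each step, update the
    model, and score prediction of the next `n_pred` interactions."""
    learners = list(histories)[:n_learners]
    results = []
    for step in range(n_steps):
        seen = n_init + step
        observed = {l: histories[l][:seen] for l in learners}
        targets = {l: histories[l][seen:seen + n_pred] for l in learners}
        model.fit(observed)                              # incremental update on newly revealed data
        preds = model.predict(observed, horizon=n_pred)  # probabilities for the next n_pred responses
        results.append((seen, multi_step_accuracy(preds, targets)))
    return results
```

At each step the loop records the number of observed interactions and the resulting multi-step accuracy, mirroring how the paper tracks training cost and prediction accuracy as data arrive incrementally.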