Knowledge Tracing Machines: Factorization Machines for Knowledge Tracing
Authors: Jill-Jênn Vie, Hisashi Kashima
Venue: AAAI 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We show, using several real datasets of tens of thousands of users and items, that FMs can estimate student knowledge accurately and fast even when student data is sparsely observed, and handle side information such as multiple knowledge components and number of attempts at item or skill level. Our experiments show, in particular, that: (1) it is better to estimate a bias for each item (not only skill), which popular educational data mining (EDM) models do not; (2) most existing models in EDM cannot handle side information such as multiple skills for one item, but the proposed approach does; (3) side information improves performance more than increasing the latent dimension. |
| Researcher Affiliation | Academia | Jill-Jênn Vie (1), Hisashi Kashima (1, 2); (1) RIKEN Center for Advanced Intelligence Project, Tokyo; (2) Kyoto University |
| Pseudocode | No | The paper does not contain any clearly labeled pseudocode or algorithm blocks. |
| Open Source Code | Yes | For the sake of reproducibility, our implementation is available on GitHub: https://github.com/jilljenn/ktm. The interested reader can check our code and reuse it in order to try new combinations and devise new models. |
| Open Datasets | Yes | All datasets except Castor can be found in the R package CDM (George et al. 2016). Assistments: the 2009-2010 dataset of Assistments described in (Feng, Heffernan, and Koedinger 2009). |
| Dataset Splits | Yes | For each dataset, we perform 5-fold cross validation. For each fold, entries are separated into a train and test set, then we train different encodings of KTMs using the train set, notably the ones corresponding to existing models, and predict the outcomes in the test set. |
| Hardware Specification | No | The paper reports a training time of '4 min 30 seconds on CPU' but does not provide specific hardware details such as CPU or GPU models, or cloud instance types. |
| Software Dependencies | No | The paper mentions 'libFM' and the 'pywFM Python wrapper' but does not specify their version numbers or the versions of C++ or Python used. |
| Experiment Setup | Yes | KTMs are trained during 1000 epochs for each non-temporal dataset, 500 epochs for the Assistments dataset and 300 epochs for the Berkeley dataset, because it was enough for convergence. At each epoch, we average the results over all 5 folds, in terms of accuracy (ACC), area under the curve (AUC) and negative log-likelihood (NLL). Like Rendle (2012), we assume some priors over the model parameters in order to guide training and avoid overfitting. Each bias follows $w_k \sim \mathcal{N}(\mu, 1/\lambda)$ and each embedding component also follows $v_{kf} \sim \mathcal{N}(\mu, 1/\lambda)$ for $f = 1, \ldots, d$, where $\mu$ and $\lambda$ are regularization parameters that follow hyperpriors $\mu \sim \mathcal{N}(0, 1)$ and $\lambda \sim \Gamma(1, 1)$. |
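
The evaluation protocol quoted in the Dataset Splits row can be illustrated with a short sketch. The snippet below is not the authors' code (that lives in the linked GitHub repository); it is a minimal stand-in on toy data that one-hot encodes (user, item, skill) triples and runs 5-fold cross-validation, reporting ACC, AUC and NLL per fold, with scikit-learn's logistic regression in place of a factorization machine (the paper notes that a KTM with latent dimension d = 0 reduces to logistic regression over such an encoding). All column names and sizes here are illustrative.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, log_loss, roc_auc_score
from sklearn.model_selection import KFold
from sklearn.preprocessing import OneHotEncoder

# Toy stand-in for a real dataset: one row per (user, item, skill) interaction,
# plus a binary outcome (1 = correct answer).
rng = np.random.default_rng(0)
n = 2000
X_raw = np.column_stack([
    rng.integers(0, 100, n),   # user id
    rng.integers(0, 50, n),    # item id
    rng.integers(0, 10, n),    # skill id
])
y = rng.integers(0, 2, n)

# Sparse one-hot encoding of users/items/skills; with latent dimension d = 0,
# a KTM reduces to a logistic regression over this kind of encoding.
encoder = OneHotEncoder(handle_unknown="ignore")
X = encoder.fit_transform(X_raw)

# 5-fold cross-validation, reporting ACC, AUC and NLL per fold as in the paper.
kfold = KFold(n_splits=5, shuffle=True, random_state=0)
for fold, (train_idx, test_idx) in enumerate(kfold.split(X_raw)):
    model = LogisticRegression(max_iter=1000)
    model.fit(X[train_idx], y[train_idx])
    proba = model.predict_proba(X[test_idx])[:, 1]
    print(f"fold {fold}: "
          f"ACC={accuracy_score(y[test_idx], proba > 0.5):.3f} "
          f"AUC={roc_auc_score(y[test_idx], proba):.3f} "
          f"NLL={log_loss(y[test_idx], proba):.3f}")
```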
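The Experiment Setup row describes training with libFM's MCMC (Gibbs sampling) solver, which places the Gaussian and Gamma hyperpriors on the biases and embeddings internally. Below is a rough sketch of how such a run could be launched through the pywFM wrapper mentioned in the Software Dependencies row; the libFM path, the toy data and the chosen latent dimension are placeholders, and the parameter names follow pywFM's documented interface, which may differ across versions.

```python
import os
import numpy as np
import pywFM  # Python wrapper around libFM; usage sketched from its README

# pywFM shells out to the libFM binaries, so LIBFM_PATH must point at their
# bin/ directory (the path below is a placeholder).
os.environ.setdefault("LIBFM_PATH", "/path/to/libfm/bin/")

# Tiny toy one-hot design matrix: 6 observations over 3 users and 2 items.
X = np.array([
    [1, 0, 0, 1, 0],
    [1, 0, 0, 0, 1],
    [0, 1, 0, 1, 0],
    [0, 1, 0, 0, 1],
    [0, 0, 1, 1, 0],
    [0, 0, 1, 0, 1],
])
y = np.array([1, 0, 1, 1, 0, 1])
X_train, y_train, X_test, y_test = X[:4], y[:4], X[4:], y[4:]

fm = pywFM.FM(
    task="classification",
    num_iter=1000,            # 1000 epochs (500 for Assistments, 300 for Berkeley in the paper)
    k2=20,                    # latent dimension d of the pairwise embeddings (arbitrary choice here)
    learning_method="mcmc",   # Gibbs sampling; libFM handles the Gaussian/Gamma hyperpriors itself
)
model = fm.run(X_train, y_train, X_test, y_test)
print(model.predictions)      # predicted probabilities of a correct answer on the test rows
```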