Deep Attentive Model for Knowledge Tracing

Authors: Xinping Wang, Liangyu Chen, Min Zhang

AAAI 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Through experimental results on three real-world datasets, DAKTN significantly outperforms state-of-the-art baseline models. We also present the reasonableness of DAKTN by ablation testing."
Researcher Affiliation | Academia | Xinping Wang, Liangyu Chen*, Min Zhang (East China Normal University), lychen@sei.ecnu.edu.cn
Pseudocode | No | The paper describes the model architecture and its components through mathematical formulations and diagrams, but it includes no explicit pseudocode or algorithm block.
Open Source Code | No | The paper makes no statement about releasing open-source code and provides no link to a code repository.
Open Datasets | Yes | "In our experiments, we use the following three real-world datasets. The basic statistics of the datasets are summarized in Table 1. ASSIST09 and ASSIST12 are both collected by the ASSISTments online tutoring system [Feng, Heffernan, and Koedinger 2009]... EdNet [Choi et al. 2020] is an open dataset collected by a multi-platform AI tutoring service, Santa. We use the EdNet KT1 dataset in our experiments..."
Dataset Splits | Yes | "Then each dataset is divided into three parts, namely, 70%, 10% and 20% for training, validating and testing respectively."
Hardware Specification | Yes | "All models are implemented by TensorFlow 2.3 using Python 3.7, and all experiments are executed on a CentOS Linux server with the main configuration of GPU RTX 2080Ti, CPU@3.30GHz, 8GB RAM, and 1TB SSD Disk."
Software Dependencies | Yes | "All models are implemented by TensorFlow 2.3 using Python 3.7..."
Experiment Setup | Yes | "In the step of data preprocessing, we first filter students who have answered fewer than 5 exercises to ensure each student with enough learning data. Then each dataset is divided into three parts, namely, 70%, 10% and 20% for training, validating and testing respectively. As to the implementation parameters, we initialize the parameters with batch normalization, and use the Adam algorithm to optimize our model. The dimensions of the positive full connection layers in Equation (4) are 256, 256, 1 respectively. We set the learning rate as 0.001, and use dropout with p = 0.4 to alleviate overfitting."
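
The Dataset Splits and Experiment Setup rows above pin down most of the training recipe. Because the authors released no code, the sketch below is only a plausible reconstruction in the stack the paper names (TensorFlow / Python): the function names, the ReLU/sigmoid activations, the dropout placement, and the binary cross-entropy loss are assumptions, while the minimum of 5 answered exercises, the 70/10/20 split, the 256-256-1 layer dimensions, the Adam optimizer, the 0.001 learning rate, and dropout p = 0.4 come from the quoted setup.

```python
# Hedged reconstruction of the reported preprocessing and training setup.
# Names and structural details are illustrative; only the quoted
# hyperparameters are taken from the paper.
import numpy as np
import tensorflow as tf

def filter_and_split(sequences, min_len=5, seed=42):
    """Drop students with fewer than `min_len` answered exercises, then
    split the remaining student sequences 70%/10%/20% into
    train/validation/test sets (seed and shuffling are assumptions)."""
    kept = [s for s in sequences if len(s) >= min_len]
    rng = np.random.default_rng(seed)
    rng.shuffle(kept)
    n_train = int(0.7 * len(kept))
    n_val = int(0.1 * len(kept))
    return (kept[:n_train],
            kept[n_train:n_train + n_val],
            kept[n_train + n_val:])

def prediction_head(hidden_dim=256, dropout_rate=0.4):
    """Fully connected prediction layers with output dimensions 256, 256, 1,
    matching those reported for Equation (4); activations and where the
    dropout is applied are assumptions."""
    return tf.keras.Sequential([
        tf.keras.layers.Dense(hidden_dim, activation="relu"),
        tf.keras.layers.Dropout(dropout_rate),
        tf.keras.layers.Dense(hidden_dim, activation="relu"),
        tf.keras.layers.Dropout(dropout_rate),
        tf.keras.layers.Dense(1, activation="sigmoid"),
    ])

# Adam optimizer with the reported learning rate; binary cross-entropy is the
# usual knowledge-tracing objective and is assumed here, not quoted.
optimizer = tf.keras.optimizers.Adam(learning_rate=0.001)
loss_fn = tf.keras.losses.BinaryCrossentropy()
```

The sketch deliberately omits the attentive encoder itself, which the paper specifies only through its equations and diagrams, so it should be read as scaffolding around the reported hyperparameters rather than a reimplementation of DAKTN.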