An Autoencoder-Like Nonnegative Matrix Co-Factorization for Improved Student Cognitive Modeling

Authors: Shenbao Yu, Yinghui Pan, Yifeng Zeng, Prashant Doshi, Guoquan Liu, Kim-Leng Poh, Mingwei Lin

NeurIPS 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experiments on several real-world data sets demonstrate the efficacy of our approach in terms of both performance prediction accuracy and knowledge estimation ability, when compared with existing student cognitive models.
Researcher Affiliation | Academia | 1 College of Computer and Cyber Security, Fujian Normal University, China; 2 National Engineering Laboratory for Big Data System Computing Technology, Shenzhen University, China; 3 Department of Computer and Information Sciences, Northumbria University, UK; 4 Intelligent Thought and Action Lab, School of Computing, University of Georgia, USA; 5 Financial Technology Research Institute, Fudan University, China; 6 College of Design and Engineering, National University of Singapore, Singapore
Pseudocode | Yes | Algorithm 1: PG-BCD+Lipschitz (an illustrative sketch of the generic projected-gradient BCD step appears after the table)
Open Source Code | Yes | Our code is available at https://github.com/ShenbaoYu/AE-NMCF.
Open Datasets | Yes | We use real-world student response data with different sparsities and knowledge-exercise relations, drawn from diverse academic subjects: (a) Math (FrcSub, Junyi-s, and Quanlang-s), (b) Biology (SLP-Bio-s), (c) History (SLP-His-s), and (d) English (SLP-Eng). FrcSub comprises the fraction-subtraction problem scores of 536 middle school students [10]. Junyi-s includes problem logs from an e-learning website based on the open-source code released by Khan Academy [34]. The private Quanlang-s data set is collected from mathematical exams given to junior schools, supplied by the QUANLANG education company. The others, SLP-Bio-s, SLP-His-s, and SLP-Eng, provide unit test results of K-12 learners compiled by an online learning platform (Smart Learning Partner, SLP) [35].
Dataset Splits | Yes | For each dataset, we reshape the response logs into the scoring matrix and use an 80%/20% train/test split (a sketch of one such split appears after the table).
Hardware Specification | Yes | We deploy the competing models using the best publicly available implementation with Python 3.8 on an Ubuntu server with a Core i9-10900K (3.7 GHz) and 128 GB of memory.
Software Dependencies | Yes | We deploy the competing models using the best publicly available implementation with Python 3.8 on an Ubuntu server with a Core i9-10900K (3.7 GHz) and 128 GB of memory.
Experiment Setup | Yes | For AE-NMCF, we set the number of iterations to 500 and the stopping threshold ϵ to 5 to guarantee convergence. The hyperparameters T and γ are set as described in Section F.6.
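
The paper's Algorithm 1 (PG-BCD+Lipschitz) is not reproduced in this summary. As a hedged illustration only, the sketch below shows the generic projected-gradient block-coordinate-descent step with a Lipschitz step size on a plain nonnegative matrix factorization subproblem; the function name pg_update, the toy matrices, and all shapes are assumptions for illustration, not the authors' code.

```python
import numpy as np

def pg_update(X, W, H, n_inner=10):
    """Hypothetical building block: update H in min ||X - W H||_F^2 s.t. H >= 0
    by projected gradient descent with step size 1/L, where
    L = ||W^T W||_2 is the Lipschitz constant of the gradient."""
    WtW, WtX = W.T @ W, W.T @ X
    L = np.linalg.norm(WtW, 2)           # spectral norm = Lipschitz constant
    for _ in range(n_inner):
        grad = WtW @ H - WtX             # gradient of 0.5 * ||X - W H||_F^2
        H = np.maximum(H - grad / L, 0)  # gradient step, then project onto H >= 0
    return H

# Block coordinate descent: alternately update H (W fixed) and W (H fixed).
rng = np.random.default_rng(0)
X = np.abs(rng.standard_normal((50, 30)))
W, H = np.abs(rng.standard_normal((50, 5))), np.abs(rng.standard_normal((5, 30)))
for _ in range(100):
    H = pg_update(X, W, H)
    W = pg_update(X.T, H.T, W.T).T       # same update applied to the W-block
```

The W-block is handled by transposing the objective, so one projected-gradient routine serves both blocks; the paper's actual algorithm operates on its co-factorization objective rather than this plain NMF loss.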
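The 80%/20% split of the scoring matrix is likewise not given in code here. The minimal sketch below shows one plausible way to hold out 20% of logged responses at random; the toy matrix, seed, and variable names are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy student-by-exercise scoring matrix (1 = correct, 0 = incorrect);
# real response logs would be reshaped into this form first.
scores = rng.integers(0, 2, size=(100, 20)).astype(float)

idx = np.argwhere(~np.isnan(scores))     # indices of all observed responses
rng.shuffle(idx)
n_test = int(0.2 * len(idx))             # hold out 20% of entries for testing
test_idx, train_idx = idx[:n_test], idx[n_test:]

train = np.full_like(scores, np.nan)     # NaN marks unobserved entries
train[tuple(train_idx.T)] = scores[tuple(train_idx.T)]
test_truth = scores[tuple(test_idx.T)]   # ground truth for the held-out 20%
```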