Get a Head Start: On-Demand Pedagogical Policy Selection in Intelligent Tutoring

Authors: Ge Gao, Xi Yang, Min Chi

AAAI 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We evaluate our approach on the Probability ITS, which has been used in real classrooms for over eight years. Our study shows significant improvement in the learning outcomes of students using EDUPLANNER, especially for those in low-performing subgroups.
Researcher Affiliation | Collaboration | Ge Gao (1), Xi Yang (2), Min Chi (1); (1) Department of Computer Science, North Carolina State University; (2) IBM Research. Emails: ggao5@ncsu.edu, xi.yang@ibm.com, mchi@ncsu.edu
Pseudocode | Yes | Algorithm 1: EDUPLANNER.
Open Source Code | No | The paper states 'For Dual DICE, we use the open-sourced code in its original paper.' This refers to a third-party tool's code, not the code for the methodology described in this paper (EDUPLANNER). There is no explicit statement or link indicating that the authors' own code is open-source.
Open Datasets | No | The paper describes using '459k historical logs from 1,148 students across five years' and mentions 'the Probability ITS... has been extensively used by over 2,000 students with 800k recorded interaction logs through eight academic years.' However, it does not provide a specific link, DOI, repository name, or formal citation (with authors and year) for public access to this dataset.
Dataset Splits | Yes | In this study, we used 459k historical logs from 1,148 students across five years for offline policy training and evaluation, and targeted the following semester using EDUPLANNER with the major goal of improving learning outcomes. ... In total, 140 students accomplished all procedures of the study. ... EDUPLANNER identified four subgroups (i.e., K1, K2, K3, K4) using historical data from the Probability ITS. Specifically, K1 (Nhis = 345, Ntest = 30) and K2 (Nhis = 678, Ntest = 92) are majority groups... K3 (Nhis = 101, Ntest = 12) and K4 (Nhis = 24, Ntest = 6) contained fewer samples...
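The subgroup counts quoted above are internally consistent: the historical sizes sum to the 1,148 students used for offline training, and the test sizes sum to the 140 students who completed all study procedures. A minimal sanity-check sketch (subgroup labels and counts taken directly from the excerpt above):

```python
# Subgroup sizes reported for EDUPLANNER (K1-K4), as quoted in the excerpt.
# n_his  = students in the historical (offline training) data per subgroup;
# n_test = students in the target semester per subgroup.
subgroups = {
    "K1": {"n_his": 345, "n_test": 30},  # majority group
    "K2": {"n_his": 678, "n_test": 92},  # majority group
    "K3": {"n_his": 101, "n_test": 12},  # smaller subgroup
    "K4": {"n_his": 24,  "n_test": 6},   # smaller subgroup
}

total_his = sum(g["n_his"] for g in subgroups.values())
total_test = sum(g["n_test"] for g in subgroups.values())

print(total_his)   # 1148 -- historical students across five years
print(total_test)  # 140  -- students who completed the study
```

This is only a bookkeeping check on the reported numbers, not part of the paper's method.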
Hardware Specification | No | The paper does not provide specific details about the hardware used for running its experiments, such as GPU models, CPU types, or specific cloud computing instances.
Software Dependencies | No | The paper mentions using specific tools like Dual DICE and MAGIC, but does not provide version numbers for any software dependencies, such as programming languages or libraries used in its implementation (e.g., 'For Dual DICE, we use the open-sourced code in its original paper.' and 'For MAGIC, we use the implementation of (Voloshin et al. 2021).').
Experiment Setup | No | The paper describes the action space size (3) and the number of repetitions for RRS (20 times), but it does not provide specific hyperparameters for model training, such as learning rates, batch sizes, number of epochs, or optimizer settings for the neural networks mentioned (e.g., in FQE).