Dealbreaker: A Nonlinear Latent Variable Model for Educational Data
Authors: Andrew Lan, Tom Goldstein, Richard Baraniuk, Christoph Studer
ICML 2016
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We show that the dealbreaker model achieves comparable or better prediction performance than affine models on real-world educational datasets. We demonstrate the prediction performance of the dealbreaker model on unobserved student responses using four real-world educational datasets, and further showcase the interpretability of the dealbreaker model by visualizing the dealbreaker concept for each question. |
| Researcher Affiliation | Academia | Andrew Lan (SL29@RICE.EDU), Rice University; Tom Goldstein (TOMG@CS.UMD.EDU), University of Maryland; Richard Baraniuk (RICHB@RICE.EDU), Rice University; Christoph Studer (STUDER@CORNELL.EDU), Cornell University |
| Pseudocode | No | No clearly labeled pseudocode or algorithm blocks were found. The paper describes algorithmic steps and mathematical formulations for inference, but not in a pseudocode format. |
| Open Source Code | No | The paper does not provide an explicit statement or link confirming the release of its own source code for the described methodology. |
| Open Datasets | Yes | MT: N = 99 students answering Q = 34 questions in a high-school algebra test administered on Amazon's Mechanical Turk (Amazon, 2016)... UG: N = 92 students answering Q = 203 questions... CE: N = 1567 students answering Q = 60 questions... edX: N = 6403 students answering Q = 197 questions... MovieLens 100k dataset (Herlocker et al., 1999) |
| Dataset Splits | No | To reduce the identifiability issue of the dealbreaker model, we add the regularization term $\frac{\lambda}{2}\big(\sum_{k,j} C_{k,j}^2 + \sum_{i,k} \mu_{i,k}^2\big)$ to the cost functions of both the hard and soft dealbreaker optimization problems and select the parameter λ using cross-validation. In each cross-validation run, we randomly leave out 20% of the student responses in the dataset (the unobserved data) and train the algorithms on the rest of the responses before testing their prediction performance on the unobserved data. (A minimal sketch of this holdout split appears after the table.) |
| Hardware Specification | Yes | For example, a single run of our Python code for the soft dealbreaker model with the UG dataset with 92 students and 203 questions takes only 10 s compared to 30 s for the hard dealbreaker model on an Intel i7 laptop with a 2.8 GHz CPU and 8 GB memory. |
| Software Dependencies | No | The paper mentions 'our Python code' and that 'For the Rasch model and the MIRT model, we perform inference using the R MIRT package (Chalmers, 2012).' However, specific version numbers for Python or the R MIRT package itself are not provided. |
| Experiment Setup | Yes | To reduce the identifiability issue of the dealbreaker model, we add the regularization term $\frac{\lambda}{2}\big(\sum_{k,j} C_{k,j}^2 + \sum_{i,k} \mu_{i,k}^2\big)$ to the cost functions of both the hard and soft dealbreaker optimization problems and select the parameter λ using cross-validation. For the MIRT model, the DINA model, and both dealbreaker models, we use $K \in \{3, 6\}$ concepts. We randomly initialize the variables $Z^k_{i,j}$, $C_{k,j}$, and $\mu_{i,k}$, $\forall i, j, k$, from the standard normal distribution, and initialize the Lagrange multipliers as $\Lambda^k_{i,j} = 0$, $\forall i, j, k$. (An initialization sketch appears after the table.) |
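
A minimal Python sketch of the 20% random-holdout procedure quoted in the Dataset Splits row. The paper releases no code, so the function name `holdout_split` and the index-array representation of observed (student, question) responses are assumptions; only the 80/20 random split over observed entries comes from the text.

```python
import numpy as np

def holdout_split(obs_idx, holdout_frac=0.2, seed=None):
    """Randomly hide a fraction of the observed student responses.

    obs_idx: (n_obs, 2) array of observed (student i, question j) pairs.
    Returns (train_idx, test_idx): the entries to train on and the
    held-out "unobserved" entries used to score prediction performance.
    """
    rng = np.random.default_rng(seed)
    perm = rng.permutation(obs_idx.shape[0])
    n_test = int(round(holdout_frac * obs_idx.shape[0]))
    return obs_idx[perm[n_test:]], obs_idx[perm[:n_test]]
```

Each cross-validation run would redraw this split, fit the models on the training entries, and score prediction accuracy on the held-out 20%.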
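
Likewise, a sketch of the setup quoted in the Experiment Setup row: the ridge penalty $\frac{\lambda}{2}\big(\sum_{k,j} C_{k,j}^2 + \sum_{i,k} \mu_{i,k}^2\big)$ and the standard-normal initialization with zero Lagrange multipliers. The array shapes (K concepts, N students, Q questions) and the helper names are assumptions, not the authors' implementation.

```python
import numpy as np

def l2_penalty(C, mu, lam):
    # lambda/2 * (sum_{k,j} C_{k,j}^2 + sum_{i,k} mu_{i,k}^2)
    return 0.5 * lam * (np.sum(C ** 2) + np.sum(mu ** 2))

def init_variables(N, Q, K, seed=None):
    """Standard-normal draws for Z, C, mu; zeros for the multipliers."""
    rng = np.random.default_rng(seed)
    Z = rng.standard_normal((K, N, Q))   # Z^k_{i,j}, for all i, j, k
    C = rng.standard_normal((K, Q))      # C_{k,j}: concept k, question j
    mu = rng.standard_normal((N, K))     # mu_{i,k}: student i, concept k
    Lam = np.zeros((K, N, Q))            # Lambda^k_{i,j} = 0, for all i, j, k
    return Z, C, mu, Lam
```

With $K \in \{3, 6\}$ as in the paper, λ would then be selected by cross-validating prediction accuracy on the 20% holdout above.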