Challenges in the Automatic Analysis of Students’ Diagnostic Reasoning

Authors: Claudia Schulz, Christian M. Meyer, Iryna Gurevych (pp. 6974–6981)

Venue: AAAI 2019

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We create the first corpus for this task, comprising diagnostic reasoning self-explanations of students from two domains annotated with epistemic activities. Based on insights from the corpus creation and the task's characteristics, we discuss three challenges for the automatic identification of epistemic activities using AI methods: the correct identification of epistemic activity spans, the reliable distinction of similar epistemic activities, and the detection of overlapping epistemic activities. We propose a separate performance metric for each challenge and thus provide an evaluation framework for future research. Indeed, our evaluation of various state-of-the-art recurrent neural network architectures reveals that current techniques fail to address some of these challenges.
Researcher Affiliation | Collaboration | Claudia Schulz,* Christian M. Meyer, Iryna Gurevych, Ubiquitous Knowledge Processing (UKP) Lab, Technische Universität Darmstadt, Germany ... *New affiliation: Babylon Health, London, UK
Pseudocode | No | The paper describes the methods textually but does not include any figures, blocks, or sections explicitly labeled 'Pseudocode' or 'Algorithm'.
Open Source Code | Yes | Code and data available at https://github.com/UKPLab/aaai19-diagnostic-reasoning
Open Datasets | Yes | We create the first corpus for this task, comprising diagnostic reasoning self-explanations of students from two domains annotated with epistemic activities. ... Code and data available at https://github.com/UKPLab/aaai19-diagnostic-reasoning ... We thus make our corpus and the corresponding analysis software publicly available.
Dataset Splits | Yes | We split our data into 60% train, 20% dev, and 20% test sets, using the same proportion of case scenarios in all splits.
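The quoted passage gives the split ratios but not the split procedure itself; a minimal sketch of a 60/20/20 split that keeps case-scenario proportions comparable across splits, using scikit-learn's stratified splitting (the variable names and dummy data below are illustrative, not from the paper), could look like this:

```python
from sklearn.model_selection import train_test_split

# Hypothetical inputs: one entry per student self-explanation and the
# case scenario it belongs to (dummy data for illustration only).
texts = [f"self-explanation {i}" for i in range(10)]
scenarios = ["A"] * 5 + ["B"] * 5

# Carve off 60% for training, stratified by scenario so each split
# contains the same proportion of case scenarios.
train_x, rest_x, train_s, rest_s = train_test_split(
    texts, scenarios, train_size=0.6, stratify=scenarios, random_state=42)

# Split the remaining 40% in half to obtain 20% dev and 20% test.
dev_x, test_x, dev_s, test_s = train_test_split(
    rest_x, rest_s, train_size=0.5, stratify=rest_s, random_state=42)

print(len(train_x), len(dev_x), len(test_x))  # -> 6 2 2
```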
Hardware Specification | No | The paper describes the experimental setup but does not specify any particular GPU models, CPU models, or other hardware components used for running the experiments.
Software Dependencies | No | The paper mentions using 'Reimers and Gurevych's (2017) implementation of a bidirectional long short-term memory (BiLSTM) network with a conditional random field (CRF) output layer' and 'German fastText word embeddings (Grave et al. 2018)' but does not provide specific version numbers for these software components or the programming language used.
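Since no versions are pinned, a reimplementation has to choose its own tooling; one way to load the pre-trained German fastText vectors (cc.de.300.vec, Grave et al. 2018) is via gensim, assuming the vector file has been downloaded locally (path and probe word below are assumptions, not from the paper):

```python
from gensim.models import KeyedVectors

# Assumed local path to the pre-trained German fastText vectors;
# the file must be downloaded separately.
EMBEDDING_PATH = "cc.de.300.vec"

# The .vec file is in word2vec text format, so gensim can read it directly.
embeddings = KeyedVectors.load_word2vec_format(EMBEDDING_PATH, binary=False)

print(embeddings["Diagnose"].shape)  # -> (300,)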
Experiment Setup | Yes | We perform ten runs for each architecture, applying the following parameters for all of them: one hidden layer of 100 units, variational dropout rates for input and hidden layer of 0.25, and the nadam optimizer.
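The quoted hyperparameters map fairly directly onto a sequence tagger; below is a minimal Keras sketch with one bidirectional LSTM layer of 100 units, dropout of 0.25 on input and recurrent connections (an approximation of variational dropout), and the Nadam optimizer. It replaces the CRF output layer of Reimers and Gurevych's (2017) implementation with a plain token-level softmax, so it is an illustrative approximation of the setup, not the authors' code; the label count and embedding dimension are assumptions.

```python
import tensorflow as tf
from tensorflow.keras import layers, models

NUM_LABELS = 9    # hypothetical number of BIO tags for epistemic activities
EMBED_DIM = 300   # dimensionality of the fastText embeddings

# Variable-length sequences of pre-computed word embeddings.
inputs = layers.Input(shape=(None, EMBED_DIM))

# One hidden Bi-LSTM layer of 100 units; dropout/recurrent_dropout of 0.25
# stands in for the variational dropout on input and hidden connections.
x = layers.Bidirectional(
    layers.LSTM(100, return_sequences=True,
                dropout=0.25, recurrent_dropout=0.25))(inputs)

# The paper uses a CRF output layer; a token-level softmax is used here instead.
outputs = layers.Dense(NUM_LABELS, activation="softmax")(x)

model = models.Model(inputs, outputs)
model.compile(optimizer=tf.keras.optimizers.Nadam(),
              loss="sparse_categorical_crossentropy")
model.summary()
```

As in the paper, such a model would be trained ten times per architecture and the runs aggregated, since results of recurrent networks vary across random initializations.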