Challenges in the Automatic Analysis of Students’ Diagnostic Reasoning
Authors: Claudia Schulz, Christian M. Meyer, Iryna Gurevych (pp. 6974-6981)
AAAI 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We create the first corpus for this task, comprising diagnostic reasoning self-explanations of students from two domains annotated with epistemic activities. Based on insights from the corpus creation and the task's characteristics, we discuss three challenges for the automatic identification of epistemic activities using AI methods: the correct identification of epistemic activity spans, the reliable distinction of similar epistemic activities, and the detection of overlapping epistemic activities. We propose a separate performance metric for each challenge and thus provide an evaluation framework for future research. Indeed, our evaluation of various state-of-the-art recurrent neural network architectures reveals that current techniques fail to address some of these challenges. |
| Researcher Affiliation | Collaboration | Claudia Schulz,* Christian M. Meyer, Iryna Gurevych, Ubiquitous Knowledge Processing (UKP) Lab, Technische Universität Darmstadt, Germany ... *New affiliation: Babylon Health, London, UK |
| Pseudocode | No | The paper describes the methods textually but does not include any figures, blocks, or sections explicitly labeled 'Pseudocode' or 'Algorithm'. |
| Open Source Code | Yes | Code and data available at https://github.com/UKPLab/aaai19-diagnostic-reasoning |
| Open Datasets | Yes | We create the first corpus for this task, comprising diagnostic reasoning self-explanations of students from two domains annotated with epistemic activities. ... Code and data available at https://github.com/UKPLab/aaai19-diagnostic-reasoning ... We thus make our corpus and the corresponding analysis software publicly available. |
| Dataset Splits | Yes | We split our data into 60% train, 20% dev, and 20% test sets, using the same proportion of case scenarios in all splits. |
| Hardware Specification | No | The paper describes the experimental setup but does not specify any particular GPU models, CPU models, or other hardware components used for running the experiments. |
| Software Dependencies | No | The paper mentions using 'Reimers and Gurevych's (2017) implementation of a bidirectional long short-term memory (BiLSTM) network with a conditional random field (CRF) output layer' and 'German fastText word embeddings (Grave et al. 2018)' but does not provide specific version numbers for these software components or the programming language used. |
| Experiment Setup | Yes | We perform ten runs for each architecture, applying the following parameters for all of them: one hidden layer of 100 units, variational dropout rates for input and hidden layer of 0.25, and the nadam optimizer. |
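The split reported above (60% train, 20% dev, 20% test, with the same proportion of case scenarios in every split) can be sketched as a per-scenario shuffle-and-slice. This is a minimal illustration, not the paper's actual code; the function and variable names are hypothetical, and the released repository should be consulted for the real implementation.

```python
import random
from collections import defaultdict

def split_by_scenario(docs, seed=42, ratios=(0.6, 0.2, 0.2)):
    """Split documents into train/dev/test sets while keeping the
    proportion of each case scenario identical across the splits.

    `docs` is a list of (doc_id, scenario) pairs (hypothetical format).
    """
    # Group document ids by their case scenario.
    by_scenario = defaultdict(list)
    for doc_id, scenario in docs:
        by_scenario[scenario].append(doc_id)

    rng = random.Random(seed)
    train, dev, test = [], [], []
    # Slice each scenario's documents 60/20/20, so every split
    # contains the same share of each scenario.
    for scenario in sorted(by_scenario):
        ids = by_scenario[scenario]
        rng.shuffle(ids)
        n_train = round(len(ids) * ratios[0])
        n_dev = round(len(ids) * ratios[1])
        train += ids[:n_train]
        dev += ids[n_train:n_train + n_dev]
        test += ids[n_train + n_dev:]
    return train, dev, test
```

Slicing within each scenario (rather than shuffling the whole corpus at once) is what guarantees the per-scenario proportions the paper describes.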